Tools for Investigating Cellular Signaling Networks by Mass Spectrometry
By Timothy Gordon Curran
B.S. Computer Science, California Institute of Technology (2008)
B.S. Mechanical Engineering, California Institute of Technology (2008)
Submitted to the Department of Biological Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2014
© Massachusetts Institute of Technology 2014. All Rights Reserved.
Author: Department of Biological Engineering, May 1, 2014
Certified by: Forest M. White, Department of Biological Engineering, Thesis Supervisor
Accepted by: Karl Dane Wittrup, Co-Chair, Graduate Program Committee
This Doctoral Thesis has been examined by the following Thesis Committee:
Forest M. White, Ph.D., Committee Member and Thesis Supervisor, Professor of Biological Engineering, Massachusetts Institute of Technology
Bruce Tidor, Ph.D., Committee Chair, Professor of Biological Engineering and Computer Science, Massachusetts Institute of Technology
Douglas Lauffenburger, Ph.D., Committee Member, Professor of Biological Engineering, Massachusetts Institute of Technology
Tools for Investigating Cellular Signaling Networks by Mass Spectrometry
By Timothy Gordon Curran
Submitted to the Department of Biological Engineering May 2014 in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Abstract
Mass spectrometry has become the tool of choice for proteomics. Its unrivaled coverage
and reproducibility have positioned it head and shoulders above competing techniques for
analyzing protein expression and post-translational modification. With the increased
popularity comes a flood of new research applications, each with its own biological
motivations and goals. For mass spectrometry-based proteomics to be useful
to as many biological questions as possible, high data quality is of utmost importance.
This research focuses on two general stages of the typical proteomics workflow
and introduces tools to facilitate effective target screening, follow-up analysis, and
more precise measurements. This new pipeline is then demonstrated in a case study of
Epidermal Growth Factor Receptor (EGFR) signaling and phenotype prediction.
The quantity of proteomic mass spectrometry data available from a single analysis has
increased exponentially as new generations of instruments become quicker and more
sensitive. This deluge of data leaves many tempted to forgo time-intensive manual
validation of database-identified targets in favor of global data set quality statistics.
Particularly in the realm of post-translational modifications, long lists of putative matches
are often reported with little or no scan-specific validation. Such practices no longer
provide assurance that any single identified target is indeed correct, leaving researchers
vulnerable to expending vast resources chasing false positives. The argument is that
manual validation is too time-intensive to be carried out for each and every identification.
To remedy this problem we have introduced the Computer Assisted Manual Validation
(CAMV) software package to expedite the procedure by preprocessing the database
results so as to remove the tedious steps associated with the validation task and only
recruit human judgment for the final quality decision. This approach has drastically
decreased the time required for manual validation: a task that used to take weeks is now
completed in hours.
Another focus of this research is the development of a multiplex, multisite absolute
quantification method, which has improved the quality of quantitative proteomic mass
spectrometry data. Absolute site-specific data allows many more biological hypotheses to
be directly tested with a single mass spectrometry experiment, including phosphorylation
stoichiometry. This technique has been applied to the EGFR system to better understand
signaling downstream of three distinct ligands. These ligands all bind the same receptor
yet elicit different phenotypes, suggesting differential information processing. The
analysis showed unique patterns of receptor phosphorylation present following subsaturating ligand treatment. However, at saturating doses the same pattern of
phosphorylation is produced regardless of ligand, but the magnitude of that pattern is still
ligand-dependent. In this regime, the adaptor proteins were still able to retain ligand-specific phosphorylation patterns presumably responsible for differential phenotypes. The
data set also permitted the identification of signals important for the regulation of only
one of the two phenotypes examined.
Thesis Supervisor: Forest M. White
Title: Professor of Biological Engineering
Acknowledgments
I am very fortunate to have found such a great fit in the Department of Biological Engineering. This is in no small part due to the mentorship of my advisor Forest White. In his lab, I have progressed from a pure engineer with a computational focus into a more competent scientist capable of designing and executing my own experiments. His willingness to let me develop methods and explore new areas has made my time at MIT a fantastic learning experience.
My thesis committee members, Douglas Lauffenburger and Bruce Tidor, have provided invaluable advice throughout my graduate career. I consider myself very fortunate to have such accomplished mentors guiding my scientific growth.
Members of the White Lab, past and present, have been instrumental at many stages of this work. Their insight, experience, and advice along the way have positively influenced every aspect of my projects. When I first joined the lab, senior graduate students and post-docs took the time to train me in laboratory techniques ranging from basic to sophisticated. During the formulation stages of my project, their experience and knowledge helped me avoid pitfalls and develop more informative experiments. It is also the members of the White Lab who were the test subjects for early versions of the spectral validation software. Their constructive feedback has helped to improve CAMV to the level at which it exists today.
The MIT community has provided a fantastic environment in which to learn and live. The BE class of 2008 were some of the first people that I met at MIT and they continue to be among my closest friends. The Sidney-Pacific Graduate Community has been another rich source of inspiration throughout my graduate career. The strong sense of community fostered by the strong student government provided me the opportunity to grow as a leader and make lifelong friends along the way. The success of this community is in no small part due to the wonderful housemasters (Roger and Dottie Mark) and associate housemasters (Roland Tang and Annette Kim), whom I got to know very well through my involvement in SPEC.
Finally, my friends and family have provided strong support and inspiration. My family ignited my interest in science and engineering from a very young age and gave me the opportunity to pursue it at the highest levels. My friends, from MIT, Caltech, and beyond, have always been there to supply encouragement and much-needed perspective; for that I am supremely grateful. Last, but not least, my fiancée Chelsea has been an invaluable source of love and support throughout our time together.
Table of Contents
Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
1.1 General Cancer Background
1.2 Breast Cancer Specific Background
1.2.1 TKI Inhibitors as Breast Cancer Treatments
1.3 EGFR/HER2 as Target Signaling Pathway
1.3.1 Phenotypes
1.3.2 Ligands
1.3.3 Key Downstream Signaling Nodes
1.4 Mass Spectrometry
1.4.1 MSMS Peptide Analysis
1.4.2 Sample Purification (IP/IMAC)
1.4.3 Instrument Types
1.4.4 Analysis Types
1.4.5 Labeling Techniques
1.5 Absolute Site-Specific Quantification
1.5.1 Motivation for Absolute Quantification
1.5.2 Previous Methods
1.6 Bibliography
2 Computer Assisted Manual Validation (CAMV)
2.1 Introduction
2.1.1 Statistical Approaches to Assessing Quality
2.2 Manual Validation of MS/MS Data
2.3 Computer Aided Manual Validation Package
2.3.2 Exemplary Application to Tyrosine Phosphorylation Analysis
2.4 Conclusion
2.5 Appendices
2.5.1 Package Availability
2.5.2 Version Information
2.5.3 Mascot Search Parameters
2.6 Bibliography
3 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
3.1 Introduction
3.2 Methods
3.2.1 Multiple Reaction Monitoring (MRM):
3.2.2 Parallel Reaction Monitoring (PRM):
3.3 Results
3.3.1 Assessing the requirement for standard peptides for absolute quantification of tyrosine phosphorylation
3.3.2 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
3.3.3 Performance of the method
3.3.4 Isolation Width Effects
3.3.5 Absolute Quantification of Tyrosine Phosphorylation in the EGFR signaling network
3.4 Conclusion
3.5 Acknowledgments
3.6 Appendix Tables
3.7 Bibliography
4 EGFR Signaling Pathway Downstream of Ligands
4.1 Introduction
4.2 Methods
4.2.1 Migration Assay:
4.2.2 Proliferation Assay:
4.2.3 Mass Spectrometry
4.3 Results
4.3.1 MCF10A 24-hour Phenotypes
4.3.2 LC-MS Absolute Quantification
4.3.3 EGFR Pattern of Phosphorylation
4.3.4 Adaptor Protein Patterns of Phosphorylation
4.3.5 PLSR Predicts TGFα Phenotypic Responses from AREG and EGF Data
4.3.6 PLSR to Model Phenotypes Predicts Stronger ERK2 pY187 Contribution to Proliferation
4.4 Conclusion
4.5 Appendix Figures
4.6 Appendix Tables
4.7 Bibliography
5 Conclusions and Future Directions
List of Figures
Figure 1-1: Quadrupole Potential and Ion Dynamics.
Figure 1-2: a) Stable regions of U-V space for a particular m/z.
Figure 2-1: Sample complexity prevents identification of peptides from complex samples based solely on accurate precursor mass measurements.
Figure 2-2: MASCOT score only provides a rough estimate of the quality of the spectral assignment.
Figure 2-3: Variation in percentage of unassigned peaks for a given MASCOT score can also be visualized at the dataset level.
Figure 2-4: Examples of spectra that were ultimately accepted into the data set.
Figure 2-5: Exemplar yellow identification where the user changed the identification made by the computer.
Figure 2-6: Exemplar red identification where the user rejected an identification initially included by the computer.
Figure 2-7: The Graphical User Interface (GUI) for the MATLAB implementation of CAMV.
Figure 2-8: Peptide Sequence Identification and Validation Workflow.
Figure 2-9: All peptide matches following user validation of the dataset.
Figure 2-10: Distribution of user and CAMV decisions versus MASCOT score.
Figure 3-1: Variability of Signal Intensity from technical replicates.
Figure 3-2: Peptide Specific Signal Intensity.
Figure 3-3: MARQUIS Method for Absolute Quantification.
Figure 3-4: MRM Quantification of endogenous tyrosine phosphorylated peptides.
Figure 3-5: Details of PRM Data Collection Method.
Figure 3-6: Quality Comparisons.
Figure 3-7: Q1 isolation Width Comparison.
Figure 3-8: Stoichiometric comparison of site-specific phosphorylation.
Figure 3-9: Ligand Comparison.
Figure 4-1: Ligand-Specific Phenotypic Response.
Figure 4-2: Full-Scan MARQUIS method for absolute quantification.
Figure 4-3: Ligand-Specific EGFR Phosphorylation.
Figure 4-4: Ligand-Specific Adaptor Phosphorylation.
Figure 4-5: Partial Least Squares Regression Predicts TGFα-induced Migration and Proliferation.
Figure 4-6: Proliferation is more sensitive to UO126 treatment than migration.
Figure 4-7: Similar EGFR Phosphorylation Pattern at High Ligand Dose.
Figure 4-8: Phenotypic prediction PLSR models.
Figure 4-9: ERK2 pY187 Contributes More Strongly to Predicting Proliferation than Migration.
List of Tables
Table 3-1: MRM Quantification of endogenous tyrosine phosphorylated peptides.
Table 3-2: PRM Quantification of endogenous tyrosine phosphorylated peptides.
Table 3-3: Comparison of MRM and PRM quantification for 13 tyrosine phosphorylated peptides.
Table 3-4: Ligand-Specific EGFR Absolute Quantification.
Table 4-1: iTRAQ labeling scheme.
Table 4-2: Site Specific Absolute Quantification Data.
1 Introduction
Engineering design principles teach that “form follows function.” However, evolution
does not directly have that luxury. Instead, particular design iterations are arrived at by
chance and the success of that design in self-replication determines its longevity. The
results of this method of design are seen today in every aspect of biology. An example
relevant to this thesis is that of environmental sensing and decision-making based on the
observation. Inherently, this requires the concept of outside and inside – which already
skips over many iterations of blind-search optimization required to effectively separate
space for an organism’s own purposes. Once this partitioning has been accomplished, the
organism’s ability to taste the waters of its environment requires a molecule capable of
transmitting this information across the barrier. This is where cellular receptors come into
play. There are many varieties that sense anything from chemicals to neighboring cells.
Many times, the information about the extracellular environment is registered through a
binding event to the extracellular portion of the receptor molecule, which drives a change
to the intracellular portion. For the purpose of this thesis, the intracellular modification
that occurs will be the addition of a group to the side chain of the protein receptor. In
particular, the addition of a charged phosphate group to the previously uncharged side
chain of a tyrosine residue1.
The trick that biology must figure out is how to use this capability to create an advantage.
In this case, the advantage comes in the form of information transfer that, when properly
read, can clue the cell in to the best course of action. Such an adduct can serve a number
of functions including activation/deactivation of the protein’s function, change in the
protein’s localization2, or completion of a binding site for a protein complex3. Protein
phosphorylation is the most common type of post-translational modification (PTM) used for
signal transduction4.
The problem with any engineering advance is that no good deed goes unpunished. The
clever receptor system that the cell has developed to sense the extracellular environment
and inform its decisions is susceptible to the same random alterations from which its
functions were first created. While the cell has compensatory mechanisms with which to
combat some changes, others are just too much for it to contain. The carefully tuned
levels of different proteins can be thrown askew by a virus that overexpresses a
component in the signal transduction pathway (v-src)5. The optimized receptor, which is
only fully functional when ligand is present, can be truncated to a constitutively active
form (EGFRvIII)6. It is these sorts of examples that make cancer a disease of entropy. A
beautifully orchestrated system can be corrupted from the inside by random mutations to
the very same constituents that have served the cell so well. In some sense, it is merely a
continuation of the evolutionary process. The one cell that gains the ability to divide
without bound will reproduce more than its neighbor. The clonal population can then
overrun normal cells to the detriment of the tissue and organism to which it belonged.
It is precisely this scenario that makes cancer so hard to treat. The offending growth is the
result of mutations to otherwise normal cells from the very same tissue. The growth
initially maintains many characteristics identical to the original tissue making it difficult
to selectively remove. As more and more controls are lifted, the growth can spread and
even send out landing parties to other tissues in the body and colonize distant sites. Many
researchers seek to understand the steps to this process so as to develop pharmaceutical
inhibitors. The complication is that each cancer is different. Different subclasses of
tumors in the same tissue may behave very differently depending on the cell type from
which they originated, which mutation they exploit, and a host of environmental factors
not completely understood.
The work in this thesis falls at the very early stage of this whole process. It works from
the idea that in order to effectively treat such a complex disease, one must first
thoroughly understand the system. Only then can an effective therapy be envisioned to
treat the corrupted system. The system of choice is the Epidermal Growth Factor
Receptor (EGFR) signaling pathway, which senses extracellular presence of growth
factor ligands, translates those signals into phosphorylation events, and recruits/activates
signaling effectors that ultimately alter transcription. These newly activated genes are
responsible for cancer relevant phenotypes including growth, differentiation, and
migration.
Studying the proteins in this network at such a detailed level requires tools capable of
specific and reliable measurement. Developments in mass spectrometry have provided
just such a tool, which is capable of analyzing protein expression levels and, more
recently, post-translational modifications simultaneously for many proteins of interest.
These snapshots of cellular proteomic state provide a broad view giving investigators
insight into many coincident processes while avoiding the myopic focus on a single
aspect of cellular function. Development of these tools has deluged researchers with new
data for which verification and analysis are time-consuming and often skipped.
To take full advantage of the capabilities of mass spectrometry-based proteomics, data
quality must be a top priority. This work approaches data quality on two fronts: 1)
ensuring the quality of proteomic identifications made from exploratory mass spectrometry experiments, and 2) designing more informative targeted experiments for
answering biological questions and building generalizable models. The first is motivated
by the fact that to be ultimately useful, large data sets are often shared between
researchers through massive online repositories7. Widespread adoption of such a resource
will only occur if the data is reliable and useful. Towards this end, database search
engines provide heuristic quality measures along with their identifications8,9, but
subsequent verification of those identifications can reveal underpredicted error rates
especially involving post-translational modifications10. The second is motivated by utility
maximization. Many mass spectrometry-based studies set out to catalogue rather than to
explore biological implications. The utility of such a catalogue is limited, because little
information about biological involvement can be directly derived. Multiplex labeling
techniques have spurred comparison studies between biological conditions giving
researchers more flexibility in their experimental design. Data from such experiments
have informed computational models effectively predicting cellular behavior11. Adding
absolute measurement to mass spectrometry-based proteomics has been attempted to
varying degrees12,13 with the overall goal of better understanding the stoichiometry
present between proteins and PTMs. High throughput abundance measurements open the
door to a whole new class of pertinent comparisons, further extending the biological
applications for mass spectrometry-based proteomics.
1.1 General Cancer Background
In many ways, cancer is a disease of entropy. Slowly but surely throughout a lifetime
many small random genetic and epigenetic alterations occur in cells throughout the body
– a slow descent into disorder. Biology has many mechanisms of correcting such
mistakes, so most of these changes are caught early and rectified. Many of the others are
fatal, effectively selecting against the cells that harbor them. However, every once in a
while, one of these changes is not fatal and persists through to future generations of cells.
When one of these changes confers a selective advantage, the harboring cell may grow
faster or proliferate more readily. These cells take resources of neighboring normal tissue,
ultimately leading to hyperplasia and, eventually, the formation of a cancerous lesion and
the potential for metastasis to other sites in the body.
Populations in the developed world live longer by eradicating other preventable causes of
death, but have little or no way to prevent the inevitable accumulation of changes that
ultimately lead to cancer. According to 2008 GLOBOCAN data, there were 12.7 million
cancer cases and 7.6 million cancer deaths during that year14. Of these, 56% of cases and
64% of deaths occurred in the developing world. This leaves 44% of cases to occur in
populations of the developed world, making their cancer incidence nearly twice as high15.
Breaking the diagnosis data down, lung cancer is the most commonly diagnosed cancer
worldwide with 1.61 million new cases (12.7% of total), perhaps due to behavioral
factors such as tobacco use. Second most commonly diagnosed is breast cancer with 1.38
million (10.9% of total), making it far and away the most commonly diagnosed cancer in women
(23% of all diagnoses), more than colorectal and cervical cancer combined. It is not
surprising that breast cancer is also the leading cause of cancer death in women (14% of
deaths). Third most common is colorectal cancer with 1.23 million cases (9.7% of total).
Combining both sexes, the top three most common causes of cancer death are lung cancer
(1.38 million, 18.2%), stomach cancer (738,000, 9.7%), and liver cancer (696,000,
9.2%)14.
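As a quick consistency check on the figures quoted above, the percentage shares can be recomputed from the reported counts. The following is a minimal sketch in Python; the variable and function names are illustrative assumptions and are not part of GLOBOCAN or of this work, and the inputs are only the rounded values quoted in the text.

```python
# Totals and per-site counts as quoted in the text (GLOBOCAN 2008), in millions.
# Names below are illustrative only; inputs are the rounded values from the text.
TOTAL_CASES = 12.7
TOTAL_DEATHS = 7.6

cases = {"lung": 1.61, "breast": 1.38, "colorectal": 1.23}
deaths = {"lung": 1.38, "stomach": 0.738}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


case_shares = {site: share(n, TOTAL_CASES) for site, n in cases.items()}
death_shares = {site: share(n, TOTAL_DEATHS) for site, n in deaths.items()}

for site, pct in case_shares.items():
    print(f"{site}: {pct}% of new cases")
for site, pct in death_shares.items():
    print(f"{site}: {pct}% of cancer deaths")
```

With these inputs the recomputed shares agree with the percentages quoted for lung, breast, and colorectal cancer incidence and for lung and stomach cancer mortality.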
These sorts of population-level statistics convey the pervasive nature of cancer as a
global disease, but do little to explain the underlying mechanisms at work. A very
comprehensive review of our understanding of cancer development was compiled in
2000 by Hanahan and Weinberg in their landmark paper detailing the “hallmarks of
cancer”16 and they have recently released an updated version that expands upon their
original work17. They revisit the six original hallmarks (1. Sustaining proliferative
signaling, 2. Evading growth suppressors, 3. Activating invasion and metastasis, 4.
Enabling replicative immortality, 5. Inducing angiogenesis, and 6. Resisting cell death)
and expand the list to include two new emerging hallmarks (1. Avoiding immune
destruction, and 2. Deregulating cellular energetics) and two enabling characteristics (1.
Tumor-promoting inflammation, and 2. Genome instability and mutation). As the title
suggests, the authors do a very thorough job of outlining how each of these individual
characteristics plays a crucial role in permitting the unchecked proliferation and survival
of tumors. Their work provides a fantastic cellular and molecular biological view of
cancer that is so often lost in population-level statistics.
1.2 Breast Cancer Specific Background
Breast cancer is the most common cancer in women, both in the developed and
developing world, accounting for 10% of all newly diagnosed cancers: 1.1 million cases
are diagnosed and 410,000 patients die each year18. These staggering numbers hide the
fact that breast cancer is an intensely heterogeneous disease. In a 2011 review, five
classifications were presented: Luminal A (ER+, PR+/-, HER2-), Luminal B (ER+, PR+/-,
HER2+), Basal (a.k.a. “Triple Negative” ER-, PR-, HER2-), Claudin Low (ER-, PR-,
HER2-), and HER2 (ER-, PR-, HER2+); along with traditional models that fall into these
classes19. Each class comes complete with its own unique treatment susceptibility and
patient prognosis. A 2006 survey of 51 breast cancer cell lines sought to catalogue
genomic and molecular characteristics of breast cancer cell lines20. The work showed
that, although breast cancer can be a remarkably heterogeneous disease, common themes
in genomic and proteomic abnormalities do arise. They concluded that genomic
heterogeneity and copy number abnormalities were similar between established cell lines
and primary tumors, but cell lines tend to carry more aberrations and higher levels of
overexpression than their primary tumor counterparts20, perhaps due to bias in cell
line establishment towards late stage disease20,21. Basal cell lines also show segregation
not seen in primary tumors22, which may be due to contamination in primary tumor
samples or selection bias during culture20. To further complicate the picture, some cell
lines previously characterized as breast cancer are now of questionable lineage. For
example, MDA-MB-435, previously labeled a spontaneously metastatic breast cancer cell
line, may actually be derived from a melanoma metastasis19.
This work considers the disease at its earliest stages, when molecular signaling events
reflect cells that are almost normal, having undergone only the initial steps toward cancer.
After multiple genetic changes have led cells too far astray, successfully treating their
signaling abnormalities becomes unlikely. For this reason, the MCF10A cell line has
been chosen. MCF10A cells are epithelial in nature, diploid, and derived from
spontaneously immortalized cells in long term culture of fibrocystic cells removed during
a mastectomy23. They have high EGFR surface expression (~100,000 copies/cell) making
them receptive to a family of EGFR-specific ligands known to drive phenotypically
cancerous behavior. MCF10A cells are known to migrate differentially in response to
these ligands24, which provides a convenient downstream readout on cellular behavior.
Numerous studies have argued for25 and against26,27 the prognostic value of EGFR status
in breast cancers. EGFR gene amplification has been reported to occur in 6-18% of breast
tumors28,29 and to co-occur with EGFR protein overexpression28. EGFR overexpression also
alters the chances that a breast tumor will overexpress HER2, raising the likelihood from
16% to 26%29. The MCF10A cell line also does not overexpress HER2, which provides
an attractive background to investigate the effects of this particular genetic manipulation
common to many breast cancers.
1.2.1 TKI Inhibitors as Breast Cancer Treatments
Small molecule tyrosine kinase inhibitors are the cornerstone of intracellular
pharmaceuticals targeted at decreasing EGFR/HER2 driven signaling30,31. Molecules
such as Gefitinib are ATP analogs that seek to inhibit EGFR activity by blocking the kinase’s
ATP-binding pocket, thus preventing phosphate transfer to the substrate32. Gefitinib
(a.k.a. Iressa, ZD1839) was designed to bind specifically to EGFR with high affinity
(IC50 ~30nM), but not to HER2 (IC50 ~10µM) or other kinases in the EGFR signaling
pathway (Raf, MEK-1, or MAPK where IC50 >10µM)33. Preclinical studies with
tamoxifen-resistant cells show that 1µM Gefitinib treatment corresponded to loss of
EGFR phosphorylation as well as decreased HER2 phosphorylation32. The same cells
also show drastic decreases in MAPK phosphorylation and cell growth when treated with
Gefitinib34. Lapatinib (a.k.a. Tykerb, GW 2016) is a dual specificity tyrosine kinase
inhibitor designed to bind both EGFR (IC50 10.2nM) and HER2 (IC50 9.8nM) with
growth inhibition IC50 values <160nM for EGFR- and HER2-overexpressing cell lines35.
1.3 EGFR/HER2 as Target Signaling Pathway
In 1962, Stanley Cohen reported that purified extracts from mouse salivary glands could
induce gross anatomical changes far beyond those previously observed for nerve cells.
Treatment with the extract promoted precocious eye opening and tooth eruption in
newborn mice36. He was able to isolate a factor with demonstrable biological effects that
he termed “tooth-lid factor”. In the decades that followed, tooth-lid factor was renamed
Epidermal Growth Factor (EGF) and its cellular membrane receptor was identified37,38.
Upon treatment of A-431 cells with EGF the EGF Receptor (EGFR) was observed to
become tyrosine phosphorylated and the kinase competent EGFR complex was shown
capable of phosphorylating the purified receptor39. Putting these facts together, Cohen
and King proposed that EGFR was capable of both extracellular ligand binding and
intracellular kinase activity37 giving EGFR the distinction of being the first Receptor
Tyrosine Kinase (RTK). Not long after, Ullrich et al. confirmed close sequence
homology between EGFR cDNA and v-erb-B by cloning the complete EGFR cDNA40,
building upon prior partial sequencing work41. In the same year, the C-terminal tyrosines
were confirmed as autophosphorylation sites and some were shown to be lacking in the
kinase-dead v-erb-B due to truncation42. In 1989, Honegger et al. produced evidence that
receptors phosphorylated each other following ligand stimulation. Using a series of
kinase-inactive and ligand-binding-incompetent EGFR mutants, the authors showed that
such mutants still became phosphorylated – implicating them as substrates of competent
endogenous EGFR molecules in the same cells43. The line of work leading to these
discoveries provides several key insights for what is now known about EGFR activation.
A model emerged of a surface receptor capable of transmitting information about the
cell’s environment through the membrane by phosphorylating key residues on its own
intracellular domain.
By 1990, the family of growth factors that bound to EGFR had grown to include, among
others, Transforming Growth Factor α (TGFα) and Amphiregulin (AREG)44. EGF45 and
TGFα46 had both been synthesized completely, facilitating large scale production and
thereby accelerating further study. NMR studies of these ligands confirmed very similar
structures, including placement of six key cysteine residues forming three disulfide
bonds, between the two molecules EGF47,48 and TGFα49,50.
The exact order of ligand binding and dimerization events is still in debate. It is easiest to
picture a situation where a ligand binds a receptor, taking it to an active conformation
capable of dimerization with another ligand-bound receptor. This simplistic image is
challenged by concave-up Scatchard plots suggesting low- and high-affinity binding sites
on EGFR and the computational models that attempt to explain them51. Recent work
provides a mechanistic explanation: negative cooperativity in EGF binding52, which
requires that trimers (one ligand, two receptors) exist. These trimers do not bind an
additional EGF ligand with the same affinity as the first. Structural evidence of trimers as
well as asymmetric tetramers (two ligands, two receptors) exists for Drosophila EGFR
(dEGFR) binding its ligand (Spitz)53. The authors posit that differential ligand affinities
may stabilize different receptor-ligand configurations, which ultimately explain the
phenotypic differences observed following ligand treatment53.
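The concave-up Scatchard plots mentioned above can be illustrated numerically. The following is a minimal sketch, assuming the simplest textbook description of two independent site classes rather than the negative-cooperativity model of the cited work; the Bmax and Kd values are hypothetical and chosen only to make the curvature visible. A single site class gives a straight line with slope -1/Kd, while a mixture of high- and low-affinity sites gives chord slopes that become steadily less negative as bound ligand increases.

```python
def bound_ligand(free, sites):
    """Total bound ligand for independent site classes given as (Bmax, Kd) pairs."""
    return sum(bmax * free / (kd + free) for bmax, kd in sites)


def scatchard_points(free_concs, sites):
    """Return (bound, bound/free) pairs, the two axes of a Scatchard plot."""
    points = []
    for free in free_concs:
        b = bound_ligand(free, sites)
        points.append((b, b / free))
    return points


def chord_slopes(points):
    """Slopes of the straight segments joining consecutive Scatchard points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


# Hypothetical parameters (sites per cell, Kd in nM); not taken from the cited studies.
FREE_NM = [0.01 * 10 ** (i / 5) for i in range(26)]  # 0.01 nM to 1000 nM, log-spaced
ONE_SITE = [(100_000, 1.0)]
TWO_SITE = [(10_000, 0.1), (90_000, 5.0)]

one_site_slopes = chord_slopes(scatchard_points(FREE_NM, ONE_SITE))
two_site_slopes = chord_slopes(scatchard_points(FREE_NM, TWO_SITE))
```

Curvature of this kind shows only that a single-affinity model is insufficient; it does not by itself distinguish two pre-existing site classes from negative cooperativity.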
The juxtamembrane helices of EGFR also play a role in the dimerization process. Initial
motivation for this connection was driven by the observation that a single HER2 point
mutation was associated with cancer54, but mutating analogous residues in EGFR
has not been shown to affect EGFR function55. Recent work with bipartite tetracysteine
reagents has detected ligand-specific structural differences in these juxtamembrane
helices56, suggesting a mechanism for differential network activation and phenotype
manifestation.
The complexity of EGFR signaling is not constrained to the ErbB family. As early as
1986, studies were under way to test the cross-reactivity of receptor tyrosine kinases,
including EGFR. The studies showed that EGF-treated EGFR was capable of strong and
specific tyrosine phosphorylation of progesterone receptor (PR), while platelet-derived
growth factor receptor (PDGFR) was not57. Further results of this same study showed that
the insulin receptor (IR) was also capable of phosphorylating only one of the two PR
subunits, thus lending additional weight to the hypothesis that cross-talk between
receptors was indeed a controlled process that carried the potential for additional
information encoding.
In the years since EGFR’s initial characterization, a more comprehensive view of signal
transduction initiation has been constructed. Ligand binding to the receptor stabilizes the
‘open’ conformation of the extracellular domain of the receptor and facilitates receptor-mediated
dimerization58-60. In the absence of ligand, the receptor extracellular domain is
in a ‘tethered’ conformation prohibiting dimerization58,61,62. Once dimerization has
occurred, the intracellular kinase domains of the two receptors are in close enough
proximity to reliably transphosphorylate the C-terminal tails58. The completed pTyr
motifs then provide docking sites for proteins containing SH2-domains. Jones et al.
extensively tested known SH2-domains from putative interacting adaptor proteins to
better understand the complexity of the second-layer signaling events63. These recruited
adaptor proteins provide their own set of motifs that can further recruit additional
proteins such as kinases, phosphatases, and other critical signaling proteins indirectly
to the phosphorylated receptor.
1.3.1 Phenotypes
EGFR sits atop a signaling network that ultimately controls several cancer-relevant
phenotypes such as proliferation64,65, migration66,67, and differentiation68. These key
phenotypes are all the result of ligand binding events at the plasma membrane, yet the
intermediate signal processing leads to ligand-specific phenotypic responses24,66. A very
nice study elucidating similar ligand-dependent behavior in the context of receptor
trafficking was carried out by Roepstorff et al. and showed that different EGFR ligands
were capable of driving differential fates for EGFR69. The connections they make
between ligand properties and upstream signaling events provide the tip of the iceberg for
understanding the ultimate behavior of the overall cellular phenotypic response.
1.3.2 Ligands
Of the eight ligands known to bind EGFR, this work focuses on three ligands that have
been shown to bind exclusively to EGFR and no other HER-family member: EGF, TGFα,
and AREG70,71. These ligands share a common cysteine pattern in their primary structure
(CX7 CX4–5 CX10–13 CXCX8 C) and form identical cysteine bridges (C1-C3,
C2-C4, C5-C6)71. Although some of the ErbB family members have been shown capable
of juxtacrine signaling72, most ligand signaling is, presumably, the result of autocrine and
paracrine signals after shedding from cells by cleavage of transmembrane protein
precursors71. The shedding process is mediated by members of the ADAM family of
matrix metalloproteases, with ADAM10 responsible for the release of EGF and
ADAM17 for AREG and TGFα73. More recently, other soluble metalloproteinases have
been shown to cleave EGFR ligands in organ specific contexts74,75.
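The shared cysteine spacing quoted above (CX7 CX4–5 CX10–13 CXCX8 C) translates naturally into a regular expression. The sketch below is illustrative only: it assumes that X stands for any non-cysteine residue, the function name is hypothetical, and the test sequence is a synthetic string built to satisfy the spacing rather than a real ligand. It reports the six cysteine positions and pairs them according to the C1-C3, C2-C4, C5-C6 bridge pattern.

```python
import re

# CX7 CX4-5 CX10-13 CXCX8 C, with X assumed to be any residue other than cysteine.
EGF_LIKE_PATTERN = re.compile(
    r"C[^C]{7}C[^C]{4,5}C[^C]{10,13}C[^C]C[^C]{8}C"
)


def find_egf_like_domain(sequence):
    """Return (cysteine indices, disulfide pairs), both 0-based, or None if no match."""
    match = EGF_LIKE_PATTERN.search(sequence)
    if match is None:
        return None
    # The pattern admits exactly six cysteines, so indexing cys[0..5] is safe.
    cys = [match.start() + i for i, aa in enumerate(match.group()) if aa == "C"]
    bridges = [(cys[0], cys[2]), (cys[1], cys[3]), (cys[4], cys[5])]
    return cys, bridges


# Synthetic toy sequence constructed to fit the spacing; it is not a real protein.
TOY_SEQUENCE = (
    "MK" + "C" + "A" * 7 + "C" + "G" * 4 + "C" + "S" * 12
    + "C" + "T" + "C" + "V" * 8 + "C" + "LR"
)
result = find_egf_like_domain(TOY_SEQUENCE)
```

A match indicates only that the spacing is compatible with an EGF-like fold; it says nothing about receptor binding.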
Despite using the same cellular receptor, these ligands have varied effects on cellular
phenotypes. This is true even when doses are saturating, suggesting intrinsic differences
in ligands beyond affinity70. At the protein level, higher levels of EGFR Y1045
phosphorylation have been observed with EGF treatment as compared to AREG, and the
Y1045F mutation potentiates EGF’s effect more than AREG’s76. EGFR pY1045 is a
putative binding site for Cbl (a ubiquitin ligase)77, so it is not surprising that the rate of
receptor internalization and the recycling decision depend strongly on the identity of the
ligand. It has been shown that EGF leads to greater degradation following internalization
whereas TGFα favors recycling and AREG never gets internalized in the first place69. In
MDCK cells, AREG is associated with a redistribution of E-cadherin and induction of a
spindle-like morphology whereas TGFα treatment causes no such changes78.
1.3.2.1 EGF
When EGF binds EGFR it stabilizes an ‘open’ conformation of the extracellular domains,
permitting dimerization with another ErbB family member in the open conformation. The binding event
also induces changes in the intracellular portion of the receptor molecule activating the
kinase domain, which phosphorylates the partner molecule of the dimer79. These
phosphorylation sites complete binding motifs for a wide range of SH2- and PTB-domain
containing proteins, which are subsequently recruited to the receptor3,80. For instance, Cbl
(an E3 ubiquitin ligase) is most efficiently recruited to a motif containing EGFR
pY104577. These adaptor proteins serve as the first step in a number of signal
transduction cascades downstream of EGFR, including Ras, MAPK, Src, STAT 3/5,
PLCγ/PKC, and PI3K/AKT81. Each of these pathways has a role to play in cellular
phenotypes including proliferation, apoptosis, differentiation, and migration82. Following
stimulation with EGF, EGFR is endocytosed and a fraction is degraded. After 10 minutes
of EGF stimulation, almost 40% of EGFR colocalizes with EEA1 (an endosomal marker)
and after 30 minutes surface levels have decreased by 50%69. It is thought that the rapid
downregulation through trafficking limits EGF’s ability to potentiate phenotypic
response83,84.
1.3.2.2 TGFα
TGFα was first reported originating from three different cultured human tumor lines and
identified by its ability to confer a transformed phenotype to untransformed fibroblasts85.
TGFα expression has since been mapped to the epithelial cells of many tissues, including:
digestive tract, liver, pancreas, kidney, thyroid, adrenal, skin, mammary gland, and
genital organs86. TGFα has many similarities with EGF; it is very close in size and
structure49,59 and binds EGFR with similar affinity87. Early studies suggested a different
binding conformation for TGFα, because TGFα-binding could be inhibited by antibodies
raised against EGFR, but EGF-binding could not88. Twenty years later, it was shown that
there exist subtle structural differences in the extracellular domain II of TGFα-bound
EGFR when compared with EGF-bound EGFR70, inviting the assumption that there are
also structural differences present in the intracellular domain of the molecule. Such
differences were confirmed in the juxtamembrane helical domains of EGFR using
bipartite tetracysteine display56 and lend credence to the hypothesis. TGFα’s difference is
most notable in the trafficking patterns it induces in EGFR. TGFα-bound EGFR is
endocytosed similarly to EGF-bound EGFR69. Unlike EGF, TGFα’s pH-dependent
affinity for EGFR89 permits the ligand-receptor complex to dissociate in the acidic
environment of the endosome, ultimately recycling EGFR to the surface instead of
destining it for degradation69,89,90. Three additional histidine residues present in the receptor-binding
portion are suspected to be responsible for this additional pH-dependence of
TGFα91. Mutant EGF containing these histidines shows similar recycling and mitogenic
potential to TGFα90,92.
1.3.2.3 AREG
AREG is a much larger molecule than either TGFα or EGF with 78 amino acids in its
truncated form (84 in a non-truncated form). The C-terminal portion of the molecule
bears close resemblance to EGF and shares its pH-insensitive affinity for EGFR. The
additional N-terminal portion is very hydrophilic and contains a disproportionate number
of arginine, lysine, and asparagine residues93. Initial studies placed the EGFR affinity of
AREG lower than that of EGF94; subsequent studies showed comparable affinity for
recombinant ligands95,96, and further work showed that AREG does not induce
dimerization as efficiently97. The heparin-binding domain
(residues 26-44) is homologous to that of HB-EGF and has been shown to be crucial to
its mitogenic capacity98,99. These potential differences translate into numerous reports
that AREG is not as potent at inducing EGFR phosphorylation in cell lines69,100,101.
1.3.3 Key Downstream Signaling Nodes
The EGFR signaling network consists of many proteins that serve to interpret the
extracellular ligand identity and concentration. Some of these proteins are additional
kinases responsible for phosphorylating a unique set of substrates to orchestrate an arm of
the network and some are phosphatases responsible for removing those modifications.
Others are adaptor proteins that contain phosphotyrosine-specific binding sites that serve
to bring multiple signaling components into close proximity.
1.3.3.1 ERK1 and 2
Extracellular Regulated Kinase (ERK) 1 and 2 are members of the Mitogen Activated
Protein Kinase (MAPK) family of cytosolic kinases that are activated through a signaling
cascade (Ras-Raf-MEK-ERK) downstream of RTKs102, for example following EGF-family ligand
activation of EGFR. Upon stimulation, the receptor recruits Shc, Grb2,
and finally Sos1/2, which serves as the Ras guanine-nucleotide exchange factor (GEF)4.
Ras activates Raf (A-, B-, C-Raf) through a dimerization process103, which then activates
MEK1/2: the dual-specificity kinases whose only known physiological targets are
ERK1/2104. The presumed purpose of this three-tiered architecture is rapid signal
amplification4. MEK concentration has been recorded to be 100-fold higher than that of
Raf, while ERK is 2/3 that of MEK105. The amplification procedure is able to rapidly
induce phosphorylation of 60% of endogenous ERK105.
ERKs’ broad substrate specificity gives them the ability to phosphorylate over 175 targets106
in many protein classes, including: transcription factors, kinases, and other
enzymes107,108. This wide selection of cellular targets links ERKs’ activities to many
phenotypes, such as proliferation109, migration67, differentiation, metabolism and cell
survival4.
These phenotypes are critical to cancer progression and ERKs’ importance is apparent
since many members of the signaling cascade are subject to mutation. Looking upstream,
overall 30% of cancers carry a Ras mutation110 and 7% a Raf mutation111. The statistics
are even more jarring when subtypes of cancers are considered. KRas is mutated in an
overwhelming 58% of pancreatic cancer, 33% of colorectal cancer, and 31% of biliary
cancer, while NRas is mutated in 18% of melanomas112. Raf mutations are also concentrated in
particular cancers, specifically 40-60% of melanomas, 40% of thyroid cancers,
30% of ovarian cancers, and 20% of colorectal cancers111,113. Even when no mutation is
present in these core proteins, overactivation of the cascade is common in many cancers4.
ERK1 (379 aa) and ERK2 (360 aa) are very similar in that they share 84% of their
sequence114, are ubiquitously expressed102, and are activated in parallel under all known
conditions109. Differential viability following individual knockdown suggests separate
functions of each ERK, but may be due to higher expression of ERK2 in many cells,
permitting it to compensate for a loss of ERK1114. Relative activation of the two isoforms
is proportional to their expression level115 and whether isoform-specific functions truly
exist is still an open question4.
Activation loop phosphorylation has proven to be a common regulatory method for
protein kinases116. Not surprisingly, phosphorylation of the activation loop of ERK by
MEK is required for activity. All MAPKs contain the TxY sequence in their activation
loop and for both ERK1 and ERK2 the sequence is TGY. The tyrosine is phosphorylated
first and the production of the dual phosphorylated form of ERK is non-processive117,118.
Tyrosine phosphorylated ERK dissociates from MEK and then reassociates so that MEK
can phosphorylate the neighboring threonine119. Studies with site-specific
phosphorylation inhibitors claim that MAPK is only active when both the threonine and
the tyrosine are phosphorylated. The kcat of ERK2 has been reported to increase 50,000-fold
upon activation, and specificity increases 600,000- to 700,000-fold120.
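The consequence of non-processive dual phosphorylation described above, namely that tyrosine-phosphorylated ERK is released and can accumulate before the threonine is modified, can be sketched with a textbook model of two consecutive first-order steps. This is an illustration under strong assumptions and not the analysis used in the cited studies: MEK is treated as constant so that each step is pseudo-first-order and irreversible, phosphatases are ignored, and the rate constants below are hypothetical.

```python
import math


def distributive_fractions(t, k_tyr, k_thr):
    """Fractions of (unphosphorylated, pTyr-only, dual-phosphorylated) ERK at time t.

    Models ERK -> pY-ERK -> pTpY-ERK as consecutive irreversible first-order
    steps starting from fully unphosphorylated ERK. Requires k_tyr != k_thr.
    """
    unphos = math.exp(-k_tyr * t)
    mono = k_tyr / (k_thr - k_tyr) * (math.exp(-k_tyr * t) - math.exp(-k_thr * t))
    dual = 1.0 - unphos - mono
    return unphos, mono, dual


def mono_peak_time(k_tyr, k_thr):
    """Time at which the pTyr-only intermediate is maximal in this model."""
    return math.log(k_thr / k_tyr) / (k_thr - k_tyr)


# Hypothetical rate constants (per minute), chosen only for illustration.
K_TYR, K_THR = 1.0, 0.5
t_peak = mono_peak_time(K_TYR, K_THR)
peak_fractions = distributive_fractions(t_peak, K_TYR, K_THR)
```

In a fully processive mechanism the intermediate would never be released, so a transient pool of singly phosphorylated ERK of this kind is the signature that distinguishes the two schemes.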
1.3.3.2 Akt
Akt, also known as Protein Kinase B (PKB), was identified in a cloning screen intended
to find proteins with kinase domains similar to PKA and PKC121,122. The Akt protein
family consists of three known members; the latter members AKT-2123 and AKT-3124
were identified in similar subsequent studies. PKB isoforms β (AKT-2) and γ (AKT-3)
share 82% sequence identity with the α (AKT-1) isoform124 and are
differentially expressed at both the mRNA level and the protein level125-127. PKB α and β
isoforms are widely expressed in brain, thymus, heart, and lung121,122,125 whereas the γ
isoform is more restricted with high expression limited to brain, testes, lower heart,
spleen, lung, and skeletal muscle126. Orthologs have been found in flies128 and worms129,
which share similar structure with the exception of the C-terminal tail that is only found
in some species and isoforms130.
Akt’s key role is as a serine/threonine kinase121,131 responsible for phosphorylating the
consensus motif RXRXXS/T132. The phosphorylation motif identified for Akt is present
in many apoptotic machinery proteins, as well as transcription factors, allowing it to
suppress apoptosis133 and promote survival in concert with PI3K130. Through its
N-terminal PH domain130, c-Akt is recruited to 3’-phosphorylated phospholipids134, the kind
produced by activated PI3K135. This binding event has been shown to be critical for activation
of Akt136,137, and it has been suggested that binding induces an intramolecular
conformational change that permits activating phosphorylation138. Further support for
this hypothesis comes from the fact that Akt truncations lacking the PH domain are still
capable of becoming phosphorylated130.
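The Akt consensus motif RXRXXS/T mentioned above lends itself to a simple sequence scan. The following is a minimal sketch with a hypothetical function name and synthetic test sequences; it uses a lookahead so that overlapping candidates are all reported and returns the 1-based position of the serine or threonine that would be the phospho-acceptor.

```python
import re

# RXRXXS/T, where X is any residue; the lookahead allows overlapping matches.
AKT_MOTIF = re.compile(r"(?=(R.R..[ST]))")


def find_akt_motif_sites(sequence):
    """Return (1-based position, residue) for each S/T that completes an RXRXXS/T match."""
    sites = []
    for match in AKT_MOTIF.finditer(sequence):
        index = match.start() + 5  # the S/T is the sixth residue of the motif
        sites.append((index + 1, sequence[index]))
    return sites


# Synthetic example, not a real protein: the motif RARAAS ends at position 8.
example_sites = find_akt_motif_sites("AARARAASGG")
```

A motif match flags a candidate only; whether a site is actually phosphorylated by Akt depends on accessibility, localization, and cellular context, and must be established experimentally.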
The PH domain also serves to recruit Akt to the cellular membrane139 and mediate
intermolecular interaction in multimers140, both of which are important for regulation.
These interactions bring Akt into close proximity with kinases responsible for
phosphorylating Thr-308 and Ser-473 (whereas other phosphorylation sites including
Ser-124 and Thr-405 are basally phosphorylated)132. Thr-308 and Ser-473
phosphorylation events are critical to Akt activation131,139,141 and it has been shown that
phosphomimetic mutagenesis at these sites partially activates Akt132.
It was soon realized that Akt was the cellular homolog to the transforming oncogene
produced by the AKT8 retrovirus – a fusion protein between Akt and a viral gag
sequence125,131,142,143. Spatial control is lost with v-Akt, because it is constitutively
targeted to the membrane through the viral gag sequence, which indirectly bestows
constitutive kinase activity137. The importance of this targeting was confirmed when c-Akt
was also found to be constitutively active when targeted to the membrane by
myristoylation144.
It is not surprising that induction of Akt’s kinase activity was first shown in response to
growth factors including EGF137,144. EGFR activation leads to activation of PI3K,
which phosphorylates lipids that recruit Akt130. Hence it is not surprising that v-Akt
shows significant decreases in ligand-dependence of activation144 as it is able to
circumvent the upstream control in the pathway.
Akt’s phenotypic implications have been extensively explored and heavily involve
survival signaling130. Akt has been shown to be necessary for survival, because a
dominant-negative form is able to abrogate IGF-1’s ability to promote survival. In the
same cells, wild-type or constitutively active Akt permitted survival even without IGF-1145.
As Datta et al. point out in their review, constitutively active Akt does more than
just promote survival; it has been shown to counteract many apoptotic signals130. The
high level goals of survival and evasion of apoptosis are likely composed of many lower
level responses, such as metabolic processes where Akt is often implicated146.
1.3.3.3 PI3K
The first indirect evidence for a connection between EGFR and phosphoinositide 3-kinase
(PI3K) was observed when EGF was shown to increase radioactive phosphate
labeling of an unusual phosphatidylinositol147. The phosphatidylinositol product turned
out to be phosphatidylinositol-3,4,5-trisphosphate (PIP3) and has been shown to be a
critical second messenger that recruits AKT and governs growth, proliferation, and
survival148. The enzyme responsible for catalyzing this reaction is one of a large family
collectively known as phosphoinositide 3-kinases (PI3Ks). They are heterodimers
consisting of a regulatory subunit responsible for mediating interactions with upstream
RTKs paired with a catalytic subunit containing the kinase domain148. The PI3Ks are
broken down into classes depending upon which constituent subunits they contain. Class
1A PI3Ks are most strongly implicated in cancer149. Their regulatory subunit (p85α,
p55α, p50α, p85γ, p55γ) contains the obligatory p110-binding domain as well as two SH2
domains and an optional SH3 domain, which permit interaction with phosphotyrosines
contained in the pYxxM motif as well as proline-rich domains. The catalytic subunit
contains a p85-binding domain as well as a Ras-binding domain, a PIK-homology
domain, a C2 domain, and a catalytic domain150.
Class 1A PI3Ks are subject to frequent genetic alterations, so much so that they have
been identified as one of the most dysregulated pathways in cancer149. Amplification of
PI3K is not the only way that the PI3K/Akt pathway is dysregulated. Other modalities
include: PTEN (one of the phosphatases responsible for negative regulation of PIP3) loss,
Akt mutation, and RTK amplification149. These pathway alterations may occur alone or
in concert to promote tumorigenesis. In particular, PI3K amplification has been shown to
coexist with PTEN loss in 8.7% of breast cancers, as well as with HER2
overexpression149. These data have made PI3K a common target for pharmaceutical
intervention with ATP analogues151 including pan-specific Wortmannin, which forms a
covalent bond with a lysine residue in p110α152, as well as LY294002, which utilizes a key
hydrogen bond to mimic the ATP interaction153.
1.3.3.4 Shc
The first member of the Shc (Src homology and collagen) protein family was discovered
while screening cDNA libraries with sequences complementary to known SH2-domains154.
More detailed analysis revealed three isoforms (p46, p52, and p66) of what is
now known as ShcA155. Isoforms p46 and p52 are widely expressed, while p66 is more
controlled and conspicuously absent from certain lineages, such as haematopoietic
cells154,156. Simultaneous disruption of p66, p52, and p46 is embryonically lethal155,157.
A conserved core structure containing PTB-CH1-SH2 domains is common to all
isoforms with differences occurring in the N-terminal section. Isoforms p46 and p52 are
produced from alternate start codons within the same transcript, whereas p66 is the result
of an alternative splice adding an additional 110 amino acids to the N-terminus. These
additional amino acids are rich in glycine and proline providing potential SH3-binding
motifs158.
Shc was one of the first cellular adaptors to be characterized and serves as a prototypical
example of this class of proteins159. Consistent with its role as an adaptor protein, each of
the identified domains of Shc is responsible for mediating interactions with other cellular
proteins and structures160. SH2- and PTB-domains are responsible for binding
phosphorylated tyrosine motifs providing Shc the ability to interface with a number of
key nodes involved in signal transduction pathways including: growth factor signaling,
especially Ras/MAPK and PI3K/AKT cascades downstream of RTKs, integrins and
cytosolic kinases, as well as cytokine, antigen, and G-protein-coupled receptors159. In
addition, the collagen homology (CH) domain is rich in glycine and proline158,161
providing SH3-binding motifs for proteins such as the Src-family kinases159. This domain
also contains important tyrosine sites (Tyr-317 as well as the tandem Tyr-239/240) that
become phosphorylated upon RTK activation and provide binding sites for other key
signaling proteins such as the adaptor Grb2162. Shc is not limited to recruiting kinases and
their substrates. There are plenty of phosphatases among its known interacting partners,
including: PTP-PEST (which binds through the PTB domain)163,164; PTEN (implicated
strongly in migration)165,166; as well as PP2A and PTPε which are known to act on Shc by
impeding Grb2 interaction and ultimately MAPK activation167,168. In addition, the PTB-domain
of Shc has been reported to perform a similar function to a PH-domain: binding
phospholipid169,170. Interestingly, once phosphotyrosine binding is complete, the capacity
to bind phospholipid is diminished169. These data paint a picture where ShcA is
preemptively bound to the membrane making it readily accessible to activated
receptor155,170. Such a working hypothesis is strengthened by studies showing very rapid
induction of Shc phosphorylation following ligand treatment. A stopped-flow apparatus
was able to capture very short time points and show that EGFR pTyr-1173
phosphorylation occurs within 5s with near immediate recruitment and phosphorylation
of Shc171.
Shc also binds to other ErbB-family members including HER2. Studies by Dankort et al.
investigated the importance of specific sites on HER2 for Shc and Grb2 recruitment and
their tumorigenic capacity172,173. Not surprisingly, when the analogous terminal
phosphorylation sites of HER2 are removed, HER2 loses the ability to transform cells.
Against this background, reconstitution of only one site (Tyr 1227, known to bind Shc) is
sufficient to restore transformative capacity, increase MAPK and PI3K signal, and
stimulate the development of tumors172,173. In support of ShcA’s important role in the
transformation process, ablation of ShcA prevented tumor development155,174.
The complex interplay does not end there. To speak to its downregulatory activities, Shc
has been linked to EGFR internalization and degradation following ligand stimulation175-177.
More specifically, ShcA has been linked to clathrin-coated pit formation through its
ability to recruit α-adaptin and ultimately AP-2176. Consistent with this hypothesis, data
show that ShcA and EGFR co-localize to EEA1-positive vesicles, likely because Shc-binding sites (pTyr-992 and pTyr-1173) remain phosphorylated during the internalization
process175. By the time that early endosomes progress to LAMP-1-positive late
endosomes and lysosomes, only a subset of endosomes retain EGFR pTyr-1173 and none
co-localize with ShcA175.
Less dramatically, phosphorylation of Tyr-317 has been shown to be required for PP2A
dissociation167. A similar hypothesis has been posed by a series of molecular dynamics
simulations carried out by Suenaga et al. They implicate pTyr-317 in decoupling
concerted movements of distal PTB and SH2 domains by increasing the rigidity of the
CH1 domain178. This decoupling can lead to the disruption of interactions between those
domains and cognate motifs on EGFR, removing protection of EGFR phosphorylation
sites, and facilitating the eventual termination of the transduction process179. This model
does not account for concomitant pTyr 239/240155.
These types of computational simulations build upon much experimental work that has
been done to link Shc to EGFR signal transduction. As early as 1994, binding studies
were performed on the SH2-domain of Shc to show that pTyr-1148 and pTyr-1173 were
binding sites on the activated receptor180. The work also indicates pTyr-992, which may
be necessary to explain why phosphorylation of Shc is induced upon EGF stimulation
even when Tyr-1148 and Tyr-1173 have been removed181,182. Further studies suggested
that it was indeed pTyr-1148 responsible for dominant engagement of the Shc PTB-domain, while pTyr-1173 provided supplementary binding capability to the SH2- as well
as the PTB-domain of ShcA183-185. Co-immunoprecipitation experiments between Shc
and EGFR supported the initial hypothesis that these binding events were redundant186,
but subsequent work indicated that more robust signaling occurred when the PTB-domain
was involved184 supporting the hypothesis that cooperation leads to optimal MAPK
activation155,187.
The Shc-family contains four members: ShcA, B, C, and most recently D155 and likely
arose from a common ancestor161. These homologs all share the general domain structure
described above. Although the presence of multiple homologs may complicate the picture
of Shc signaling, the ShcA cascade is thought to be the ‘redundant but dominant’ path
employed by cells to activate MAPK from EGFR155,188. The model used to generate this
hypothesis also suggests that indirect Grb2 recruitment through ShcA is preferred to
direct recruitment to EGFR188. Experimental verification of the dominance of the ShcA
cascade comes from MEF studies where activation of MAPK is possible in the absence of
ShcA, but ShcA is crucial for sensitization to low growth factor concentrations157.
1.3.3.5 Gab1
Gab1 (Grb-2 Associated Binder 1) is a scaffolding adaptor protein that
was originally discovered in a glial tumor expression library for its ability to
bind Grb2189. As the name suggests, most of Gab1’s interactions with receptor tyrosine
kinases occur indirectly through intermediates such as Grb2 whose SH2-domain targets
Grb2-Gab complexes to RTKs containing the proper binding motif [YXNX]190-192. Even
in an instance where Gab1 has been shown to be capable of direct binding to the c-Met
receptor, indirect interactions have been shown to be physiologically relevant193,194.
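Candidate [YXNX] motifs of this kind can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, assuming a one-letter amino acid string; the function name and the example sequence are hypothetical, and the search says nothing about whether a given tyrosine is actually phosphorylated.

```python
import re

# Tyrosine, any residue, asparagine, any residue. The lookahead lets
# overlapping candidate sites be reported as separate hits.
YXNX_MOTIF = re.compile(r"(?=(Y.N.))")


def find_yxnx(sequence):
    """Return (1-based tyrosine position, matched 4-mer) for each YXNX site."""
    return [(m.start() + 1, m.group(1)) for m in YXNX_MOTIF.finditer(sequence)]


# Hypothetical sequence used only to exercise the pattern.
print(find_yxnx("AAYVNVQQYANY"))  # -> [(3, 'YVNV'), (9, 'YANY')]
```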
Once Gab1 has associated with a receptor it is equipped to facilitate a host of other protein-protein interactions. Like other Gab-family proteins, Gab1 contains an amino-terminal
PH-domain which recruits it to the cell membrane, tyrosine-containing motifs for the
recruitment of SH2- and PTB-domain-containing proteins, and proline-rich domains for
recruiting SH3-domain-containing proteins190,195-199 including PLCγ189, Crk200,
CrkL201,202, Shc, SHP2, Ship, and PI3K203. The many phosphorylation sites of Gab-family
proteins are often phosphorylated by the RTKs to which they are recruited, but can also
be the target of cellular kinases190,204-206 further increasing the potential complexity of
signaling dynamics.
Gab1 is part of a family of binding proteins. Gab1, Gab2, and Gab3 have a global
homology of 40-50% with higher conservation in the PH-domain, SH2-, and SH3-binding motifs199. In simpler organisms the family consists of one member (DOS
(Drosophila) and Soc1 (C. elegans))207, which presumably diversified to provide
additional function as organismal complexity increased, thus speaking to its importance
in signal transduction. Gab1, in particular, is expressed in many tissues throughout the
body with especially high levels observed in the brain, kidney, lung, heart, testis, and
ovary189,199.
1.3.3.6 Grb2
Grb2 (Growth Factor Receptor-bound Protein 2) is an adaptor protein initially discovered
to link EGFR signaling to the Ras-MAPK pathway and has since been shown to be
involved downstream of many distinct RTKs208-210. Grb2 contains one SH2 domain and
two SH3 domains208 allowing it to act as a puzzle piece capable of linking many other
proteins.
Grb2’s best-studied interactions are those with Sos, with which Grb2 is
constitutively associated by way of its SH3 domain. Sos is a guanine nucleotide exchange
factor responsible for swapping GDP for GTP on Ras. This task is more efficiently
accomplished when the Grb2 SH2-domain recruits the Grb2-Sos complex to RTKs, thus
bringing it into close proximity with membrane associated Ras211,212. Grb2 has also been
reported to bind Shp2, an important interaction in the context of FGF signaling213; as well
as Shc and IRS1, important interactions for insulin signaling through the insulin
receptor214.
More recently, the multiple binding domains of Grb2 have been invoked to explain the
rapid activation properties of FGF receptor 2 (FGFR2). Lin et al. provided evidence to
support the hypothesis that Grb2 association with FGFR2 prior to stimulation serves to
preform dimers and allow phosphorylation of the receptor activation loop necessary for
rapid response to ligand stimulation. Once FGF is presented, FGFR2 phosphorylates
Grb2 inducing dissociation and removing the inhibition215. The hypothesis presents an
interesting scenario lending weight to the “preformed dimers” school over the “random
association” school of receptor dimerization and activation216.
1.3.3.7 IRS2
IRS (Insulin Receptor Substrate)-proteins were discovered in the context of insulin
signaling and were found to be multifunction interfaces217,218. They have subsequently
been linked to the signaling of many cytokine receptors80 (including prolactin, growth
hormone, leptin, vascular endothelial growth factor (VEGF), TrkB, ALK, IL-2, IL-4, IL-7, IL-9, IL-13, IL-15, interferon and integrins219) and binding to various cellular proteins
(including PI3K, SHP2, Fyn, Grb2, Nck, and Crk220). In the case of receptor cross-talk
with EGFR, this is not surprising given the similarities between the intracellular portions
of these molecules, specifically the many C-terminal phosphorylation sites completing
motifs for SH2-domain containing proteins such as the IRS-proteins221 and Shc, which is
also known to interact with both EGFR and insulin receptor80.
IRS1 and IRS2 are 43% identical221. They contain a PH domain for membrane
association and a PTB domain similar to that of Shc222. The similarity between IRS1 and IRS2 is particularly
important in the C-terminal regions of the proteins (where they share approximately 35%
identity), to which downstream effectors are recruited through phosphotyrosine-binding motifs.
Many of these motifs are conserved between the two members and both have been
reported to activate PI3K223,224 and ERK1/2 in a variety of model systems including
breast cancer225,226.
Despite their ubiquitous expression and structural similarities, IRS1 and IRS2 have non-redundant functions as evidenced by knockout experiments in mice. IRS1 knockout mice
exhibit stunted growth and insulin resistance that does not proceed to diabetes227,228. In
contrast, IRS2 knockout mice are normal size, yet show brain defects due to reduced
neural proliferation227,228, develop early-onset diabetes227,229, and produce infertile
females230. These differences extend into cancer models where mammary metastasis is
promoted by IRS2 or the suppression of IRS1231,232.
1.3.3.8 Shp
Shp proteins are a subclass of ‘classic’ protein tyrosine phosphatases (PTPs) whose
binding localization is governed by SH2 domains233. Such ‘classic’ PTPs have a 240-250
amino acid PTP domain centered among regulatory sequences directing binding and only
catalyze the dephosphorylation of phosphotyrosine. In vertebrates, there are two known
Shps (Shp1 and Shp2), which have single orthologs in Drosophila (Csw) and C. elegans
(Ptp-2)234. Shp1 expression is limited and highest levels are observed in hematopoietic
cells234 and recent studies have shown nuclear localization due to a basic nuclear
localization sequence in place of an SH3-domain235,236. In contrast, Shp2 and orthologs
are widely expressed234 and found uniformly distributed through the cytosol235.
Shp proteins have been shown to bind a wide range of signaling proteins including RTKs
(such as EGFR, SCFR, PDGFR, and EPOR) as well as adaptor proteins (such as Grb2,
FRS2, Jak2, p85 subunit of PI3K, IRS2, and Gab1-2237), which are central to transducing
extracellular information across the membrane. Although Shp1 fits the stereotypical role
of reducing phosphorylation and quenching signaling, Shp2 has been implicated in
diverse roles. For instance, Shp2 has been shown to help activate downstream signal
transduction by serving as an adaptor protein for Grb2 binding to RTKs238,239. Shp2 has
also been implicated in Erk signaling downstream of many RTKs. Without proper Shp2
involvement Erk activation is not sustained or even initiated240,241, which may link Shp2 to
cell cycle progression238,239.
Multiple methods of regulation of Shp activity have been hypothesized. Shp proteins
contain two known tyrosine phosphorylation sites in the C-terminal tails, which are
phosphorylated differentially by receptor and intracellular kinases234. The SH2-domains
have also been shown to regulate phosphatase activity242. Experimentally,
phosphotyrosine containing peptides that bind the N-SH2-domain increase phosphatase
activity. A structural explanation posits that in the absence of a binding site, the N-SH2-domain instead binds the phosphatase domain blocking the active site243. It is less clear
whether the same holds true for the C-SH2-domain and it may simply serve to increase
target-binding specificity234,241.
1.3.3.9 Cbl
Cbl (short for Casitas B-lineage lymphoma) was first discovered as part of a gag-Cbl
fusion protein driving B-lineage lymphomas244. c-Cbl is a cytosolic protein ubiquitously
expressed throughout the body with known homologs in mammals, worms, fish, birds
and frogs. It serves as an E3 ubiquitin ligase245,246 responsible for recognizing
phosphorylated proteins destined for ubiquitin-mediated degradation247. Cbl interactions
have been reported with a wide range of proteins including receptors, intracellular
kinases and phosphatases, adaptors, cytoskeletal proteins, as well as ubiquitin247-249.
Cbl’s N-terminal portion contains a calcium-binding EF hand as well as an SH2-domain
with consensus sequence (N/D)XpY(S/T)XXP247,250,251, but c-Cbl has also been shown to
interact with Met, where it binds DpYR252. The RING finger domain mediates c-Cbl’s
ubiquitin ligase activity by connecting Cbl (the E3 ubiquitin ligase) to the E2
ubiquitin-conjugating enzyme245,253. The short linker connecting the SH2-domain and the RING finger domain
has been shown to be important for ubiquitination, and mutations can be transforming254.
Taken together, these domains provide the baseline function of Cbl: the SH2-domain
recruits Cbl to a phosphorylated protein and RING facilitates ubiquitination247.
Additional proline-rich domains in the C-terminus provide putative SH3-binding
motifs255 and vary between isoforms and species247.
The C-terminal portion of Cbl also contains sites for tyrosine phosphorylation, a leucine
zipper homology domain, and a ubiquitin-associated domain247. Initial studies showed that
Bcr-Abl, v-Abl, Abl, and v-Src were capable of phosphorylating Tyr-700 and Tyr-774,
providing putative binding sites for the CrkL SH2-domain256. This list expanded when Fyn was
shown to phosphorylate Tyr-731 and provide a binding motif for PI3K257. These results
have since been recapitulated in vitro to show that the kinase domains of Syk, Fyn, and
Abl phosphorylate c-Cbl differentially258. Tyrosine phosphorylation in the linker region
was hypothesized to modulate E3-activity253 and was later confirmed in the EGFR system
with phosphomimetic mutagenesis259.
Cbl participation in EGFR signaling has been extensively studied and serves as a model
for understanding the roles that ubiquitin ligases can play in signal transduction. Cbl has
been shown to ubiquitinate EGFR leading to its downregulation by endocytosis253,260.
Swaminathan et al. comprehensively outline this process in their review247. Briefly,
activated EGFR autophosphorylates leading to pTyr-1045, which recruits Cbl to the
receptor253. Cbl mono-ubiquitinates EGFR261 while it is at the plasma membrane262.
Eps15 is thought to play a role in recruiting EGFR to clathrin-coated pits and adaptor
proteins, including AP2263,264, which develop into clathrin-coated vesicles, and merge
with internal vesicles to become multivesicular-bodies that are ultimately sorted for
recycling or lysosomal degradation265. Cbl remains bound to EGFR through much of this
process261,266. Mounting evidence suggests that Cbl-mediated ubiquitination is important
for sorting rather than initial internalization77,266. Cbl has also been shown to be recruited
to EGFR indirectly through Grb2, as seen in EGFR Y1045F mutants that are still
endocytosed and degraded77,267. Mutations to EGFR, Grb2, or Cbl that interfere with their
interactions reduce EGFR endocytosis268,269. Evidence suggests that Grb2-mediated
and pTyr-1045-mediated Cbl recruitment may result in ubiquitination of unique lysine
residues of EGFR77; a potential explanation for differential fate247. Although EGFR is the
best-studied ubiquitin-mediated degradation target of Cbl, many other receptors and
intracellular proteins have also been shown to be targets (for extensive list see
Swaminathan et al. 2006). Furthermore, Cbl has been implicated as an adaptor protein in
Erk activation in response to Met activation200 and G-CSF stimulation270.
1.4 Mass Spectrometry
Mass spectrometry is an attractive technology for proteomic analyses, because it allows
data to be collected simultaneously for hundreds or thousands of proteins from a single
biological sample. Most applications digest proteins into peptides with sequence specific
endopeptidases, such as trypsin, to produce predictable subsequences that can be matched
back to the contributing protein by way of database comparison to a species’ proteome.
With proper purification techniques, peptides containing particular post-translational
modifications (PTMs) such as phosphorylation or acetylation can be isolated and
analyzed to determine signature profiles following differential treatment271. The most
successful sample introduction method for peptide MS experiments has been liquid
chromatography (LC), which provides real-time separation of peptides to simplify
mixtures.
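The predictability of tryptic peptides can be illustrated with an in silico digest. The sketch below is a minimal example, assuming the commonly used simplified rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) unless the next residue is proline (P). The function name, the example sequence, and the choice to ignore missed cleavages are illustrative assumptions and do not describe any specific software used in this work.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence into fully tryptic peptides.

    Simplified rule (an assumption of this sketch): cleave after K or R
    unless the following residue is P; missed cleavages are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever remains after the final cleavage site is the last peptide.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence chosen only to exercise the rule.
print(tryptic_digest("MKTAYRPLLKGEEGR"))  # -> ['MK', 'TAYRPLLK', 'GEEGR']
```

Each resulting peptide can then be looked up in a digest of the species' proteome generated with the same rule.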
1.4.1 MSMS Peptide Analysis
During a liquid chromatography mass spectrometry (LCMS) experiment, peptides are
ordered based on their elution from a column. At any one time, only a small fraction of
all peptides will be introduced to the instrument. The complete peptide is charged by a
voltage source connected through the liquid supply (electrospray ionization) and
accelerated through a potential gradient before arriving at a detector. In the full scan mass
spectrum (MS1), the mass-to-charge ratio (m/z) and the abundance of each peptide are
recorded. If a proper ion series is observed and the intensity of the signal surpasses a
threshold, the ion may be selected for fragmentation by collision with an inert gas. The
resulting fragments are collected and again analyzed to generate an MS/MS spectrum
(MS2)272. Abundant fragment ions from the isolated precursor will produce peaks that
can then be compared against lists of predicted fragments for peptides of known
sequence.
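The comparison against predicted fragments can be sketched in a few lines. The example below assumes that collisional fragmentation produces mainly singly charged b- and y-type ions and uses standard monoisotopic residue masses. The function names, the fixed 0.02 m/z tolerance, and the simple match count are illustrative assumptions, not the scoring scheme of any particular search engine.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # Da
PROTON = 1.00728   # Da


def fragment_ions(peptide):
    """Return singly charged b- and y-ion m/z lists for a peptide.

    b_ions runs from b1 upward; y_ions runs from the longest y-ion
    down to y1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions


def count_matches(observed, predicted, tol=0.02):
    """Count predicted m/z values that have an observed peak within tol."""
    return sum(any(abs(o - p) <= tol for o in observed) for p in predicted)


b, y = fragment_ions("GAK")
print(count_matches([147.113, 58.03], b + y))  # -> 2
```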
1.4.2 Sample Purification (IP/IMAC)
Mass spectrometry-based peptide identification and quantification is greatly aided by
sample purity. During a chromatographic elution, peptide identification requires that a
single sequence be isolated to ensure contributed fragments can be properly attributed. Of
even greater concern is peptide quantification, especially by isotopic labeling techniques
such as iTRAQ. Since the fragment ions used for quantification are identical for all
peptides, it is critical that the donor sequence be the only one producing them. Liquid
chromatography is very helpful in achieving this goal by reducing local sample
complexity. Any additional pre-analysis purification to simplify the mixture of peptides
will provide additional certainty to quantification. Important to the work in this thesis are
immunoprecipitation (IP) and immobilized metal affinity chromatography (IMAC)
utilized to isolate tyrosine phosphorylated peptides.
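The effect of a co-isolated peptide on label-based quantification can be shown with simple arithmetic. The sketch below assumes that reporter-ion intensities from the target and from a contaminant simply add in each channel; the function name and the intensity values are hypothetical and chosen only for illustration.

```python
def observed_ratio(target, contaminant):
    """Ratio of two reporter channels when a co-isolated peptide adds signal.

    target and contaminant are (channel_a, channel_b) reporter intensities.
    """
    a = target[0] + contaminant[0]
    b = target[1] + contaminant[1]
    return a / b


# A true 4:1 change in the target peptide...
print(observed_ratio((400.0, 100.0), (0.0, 0.0)))      # -> 4.0
# ...appears compressed when an unchanging contaminant is co-isolated.
print(observed_ratio((400.0, 100.0), (100.0, 100.0)))  # -> 2.5
```

Under this additive assumption, any contaminant whose abundance does not change between conditions pulls the measured ratio toward 1, which is why additional purification improves quantitative certainty.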
1.4.2.1 Immunoprecipitation
Immunoprecipitation (IP) is a very useful technique to reduce the complexity of samples
derived from whole cell lysates. At the protein level, an antibody can be used to purify a
sample to investigate changes that occur to the post-translational modification at all sites
of a particular protein. At the peptide level, specific post-translational modifications can
be targeted to investigate changes in a particular signaling moiety across many proteins.
For the work in this thesis, phosphotyrosine-specific IPs were used to query signaling
downstream of EGFR, because EGFR contains a tyrosine kinase domain, as do several of
the proteins which it putatively activates. Phosphotyrosine antibodies are generally more
specific than phosphoserine or phosphothreonine antibodies, because the aryl side-chain of tyrosine
lends itself well as a comparatively unique antigen. Employing a cocktail of multiple
phosphotyrosine-specific antibodies ensures increased coverage of surrounding
sequences.
1.4.2.2 Immobilized Metal Affinity Chromatography
The phosphotyrosine IP does not clean up the sample completely, because
phosphotyrosine-containing peptides are often orders of magnitude less abundant than
peptides derived from highly expressed cytoskeletal and metabolic proteins. An iron-based immobilized metal affinity chromatography (IMAC) step further purifies the
peptides collected by the IP prior to LC-MS analysis.
1.4.3 Instrument Types
Several types of mass analyzers are commonly used for proteomic mass spectrometry.
Analyzer designs must balance sensitivity, analysis time, reliability, scan information,
and dynamic range. Each has its strengths and weaknesses making them best suited to
specific proteomic applications. There is much to be said about instrument differences,
and there are great differences in results from various implementations of similar
systems. In 2009, Bell et al. distributed identical protein digest samples to 27 labs and
compared their reports 273. Most fragments were found in the raw data when analyzed
centrally, but variations in reports were due to different databases and contamination.
1.4.3.1 TOF
A time of flight (TOF) mass analyzer relies on the principle that more massive particles
will move more slowly when given the same energy. A batch of ions in a TOF’s trap is
suddenly subjected to a voltage difference accelerating all of the ions with an energy
based on their charge. For two identically charged ions with masses m1 and m2, the mass
ratio m1/m2 equals the inverse ratio of their squared velocities, v2²/v1², which makes the arrival time at the
detector a direct readout for the mass-to-charge ratio of the ion.
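The relationship between flight time and m/z follows from equating the energy gained in the accelerating field with the kinetic energy of the ion. The sketch below assumes an idealized linear TOF with a single acceleration voltage and a field-free drift region; the function names, the 20 kV voltage, and the 1 m path length are illustrative assumptions rather than parameters of any instrument discussed here.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def flight_time(mz, voltage, length):
    """Drift time in seconds for an ion of the given m/z (Da per charge)."""
    # z*e*V = 0.5*m*v^2  ->  v = sqrt(2*e*V / (m/z))
    velocity = math.sqrt(2.0 * E_CHARGE * voltage / (mz * DALTON))
    return length / velocity


def mz_from_time(t, voltage, length):
    """Invert flight_time: m/z = 2*e*V*t^2 / L^2, converted to Da per charge."""
    return 2.0 * E_CHARGE * voltage * t * t / (length * length) / DALTON


t_light = flight_time(500.0, 20000.0, 1.0)
t_heavy = flight_time(2000.0, 20000.0, 1.0)
# A fourfold increase in m/z doubles the flight time.
print(t_heavy / t_light)
```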
The TOF analyzer has several benefits. The first is its high accuracy, as the time
resolution of the detector can be tuned to great precision. Second, the system is very fast
at measuring all ions from a single batch. The time of flight of the highest mass to charge
ratio ion dictates the analysis time. Third, TOF analyzers are very sensitive274. Finally,
the nature of the data collection method means that full scan information is available for
both MS1 and MS2.
The TOF analyzer’s accuracy depends on ions’ traversal of a known path length.
Environmental conditions that change the tube length can rapidly degrade the
measurement accuracy. Over the course of 24 hours of inactivity, the accuracy may drift
by tens of ppm. The newest TOFs on the market require daily calibration prior to use to
account for small changes in room temperature. In addition to calibration, TOFs are not
immune to the dynamic range concerns that plague all trap-based instruments274. To
prevent accuracy drift during continuous use, the instruments may continuously tune
themselves based on peaks in the sample.
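For reference, mass accuracy in parts per million is the relative deviation of an observed m/z from its theoretical value. The helper names and example values below are illustrative only.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def mz_tolerance(theoretical_mz, ppm):
    """Absolute m/z window that corresponds to a ppm tolerance."""
    return theoretical_mz * ppm / 1e6


# A drift of 0.02 m/z at m/z 1000 corresponds to about 20 ppm.
print(ppm_error(1000.02, 1000.0))
```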
1.4.3.2 Orbitrap
The orbitrap is the newest of the mass analyzer types and is the brainchild of Alexander
Makarov 275. The oscillation time along the length of the electrode is detected and
matched to a specific m/z in an analogous way to how a Fourier transform ion cyclotron
resonance (FT-ICR) instrument would measure orbit times. The result is a very high
resolution measurement over a large mass range276. These qualities have made it the tool
of choice for many proteomics endeavors.
As with TOF analyzers, the orbitrap provides full scan information for peptide fragments
permitting high confidence identifications and de novo sequencing of novel peptides. The
high resolution of these instruments increases certainty in identifications, as well as
improves label-based quantification strategies such as iTRAQ. iTRAQ fragment peaks
were chosen to fall in a desolate region of the m/z range, but contaminant ions still hinder
quantification in some cases. The 116.11 peak of both commercially available iTRAQ
kits can be infringed upon by a sequence-specific peak at 116.07; a difference that is
easily seen by the orbitrap analyzer permitting clean quantification of this label277.
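The resolving power needed to separate these two peaks can be estimated with the common rule of thumb R = m/Δm. This is a rough sketch; the function name is hypothetical and the calculation ignores peak shape and relative intensity.

```python
def required_resolving_power(mz_a, mz_b):
    """Approximate resolving power m/delta-m needed to separate two peaks."""
    mean_mz = (mz_a + mz_b) / 2.0
    return mean_mz / abs(mz_a - mz_b)


# Roughly 2,900 for the 116.11 reporter and the 116.07 interfering peak.
print(round(required_resolving_power(116.11, 116.07)))
```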
The first drawback to FT-based instruments such as the orbitrap is that they are slower
than their TOF-based counterparts. The same cycling time that increases the resolution of
any one scan also means that fewer scans can be performed. The newest generation
orbitrap instruments have decreased the size of the analyzer permitting faster cycling and
analysis. The second drawback is due to the trap-based nature of the ion collection. Since
the batch-size of ions is fixed, lower abundance ions are less likely to contribute
sufficient signal hurting the overall in-scan dynamic range.
1.4.3.2.1 Theoretical Background
The orbitrap mass analyzer uses a complex potential consisting of a quadrupole field
superimposed with a logarithmic field.
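$$U(r, z) = \frac{k}{2}\left(z^2 - \frac{r^2}{2}\right) + \frac{k}{2} R_m^2 \ln\left(\frac{r}{R_m}\right) + C$$
where $R_m$ is a characteristic radius below which ions can be trapped. This potential is
achieved by a very careful choice of electrode geometries specified by:
$$z_{1,2}(r) = \sqrt{\frac{r^2}{2} - \frac{R_{1,2}^2}{2} + R_m^2 \ln\left(\frac{R_{1,2}}{r}\right)}$$
where the subscripts 1 and 2 denote the inner and outer electrodes and $R_{1,2}$ are their maximum radii.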
The equations of motion resulting from this geometry are quite complex and the radial
and angular components cannot be solved analytically. However, the motion along the z-axis
results in a simple harmonic oscillator equation for which an exact solution is
known:
$$z(t) = z_0 \cos(\omega t) + \sqrt{\frac{2E_z}{k}} \sin(\omega t)$$
where $E_z$ is the energy along the z-axis, and the fundamental frequency,
$$\omega = \sqrt{\frac{kq}{m}}$$
The remarkable part of the trap design is that, for a fixed field constant $k$, the axial oscillation
frequency depends only on the mass-to-charge ratio and therefore provides a direct measure of it275.
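Because the axial frequency scales as the inverse square root of m/q, an unknown m/z can be obtained from a measured frequency by reference to a single calibrant, without knowing the field constant k. The sketch below illustrates this; the function name and the frequency values are hypothetical.

```python
def mz_from_frequency(freq, cal_freq, cal_mz):
    """m/z of an ion from its axial frequency, given one calibrant ion.

    From omega = sqrt(k*q/m), m/q is proportional to 1/omega^2, so the
    field constant k cancels in the ratio to the calibrant.
    """
    return cal_mz * (cal_freq / freq) ** 2


# An ion oscillating at half the calibrant frequency has four times its m/z.
print(mz_from_frequency(250e3, 500e3, 400.0))  # -> 1600.0
```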
1.4.3.3 Triple Quadrupole
A triple quadrupole (TQ) instrument is quite different from both TOFs and orbitraps.
During an MS1 analysis, the first quadrupole (Q1) is used to isolate one precursor
window at a time so that its intensity can be measured. During MS2 analysis, Q1 is tuned
to pass a single precursor window into Q2 for fragmentation. A single m/z window for
the fragments is then isolated in Q3 and the intensity is measured. This pair of Q1
isolation window and Q3 isolation window is called a “transition”.
The TQ instruments are tailored to applications where dynamic range is of the utmost
importance. Since batches of ions are never held statically in a trap, stochastic variation
between batches is never produced. Instead a continuous stream is passed through the
quadrupoles for a fixed amount of time and the intensity recorded. In this way, the
abundance of other ions outside the isolated m/z windows does not affect the intensity of
the recorded signal except in competition for charge. This feature is ideal for confirming
the presence and signal intensity of a handful of known peptides in a complex sample278.
The drawback of the TQ system arises from the use of the quadrupole itself. In practice,
the quadrupole is tuned to pass a very narrow range of ions. To produce a full scan mass
spectrum the quadrupole must sequentially select narrow mass windows across the entire
mass range. The result is a very slow process with a poor duty cycle, which wastes the
majority of ions of any particular m/z. This is in stark contrast to both TOF instruments
and orbitraps, which simultaneously analyze all m/z values. For these reasons, scanning
quadrupoles are rarely used to collect full scan mass spectra. Instead, they can be used in
series to very accurately isolate particular windows in the precursor and fragment scans.
Operating in this way is known as selected reaction monitoring (SRM), or multiple reaction
monitoring (MRM) when many such pairs are sequentially collected.
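A transition can be represented as a pair of isolation windows, and an MRM method as a list of such pairs. The sketch below is a minimal illustration; the class and function names, the 0.7 m/z window width, and the example ions are assumptions made for this example and not settings taken from any instrument method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One Q1/Q3 isolation-window pair (all names here are illustrative)."""
    q1_mz: float        # precursor window center
    q3_mz: float        # fragment window center
    width: float = 0.7  # full window width in m/z; assumed value

    def passes(self, precursor_mz: float, fragment_mz: float) -> bool:
        half = self.width / 2.0
        return (abs(precursor_mz - self.q1_mz) <= half
                and abs(fragment_mz - self.q3_mz) <= half)


def mrm_intensity(transitions, ions):
    """Sum intensity per transition; ions are (precursor, fragment, intensity)."""
    totals = {t: 0.0 for t in transitions}
    for precursor_mz, fragment_mz, intensity in ions:
        for t in transitions:
            if t.passes(precursor_mz, fragment_mz):
                totals[t] += intensity
    return totals


t = Transition(500.3, 147.1)
ions = [(500.25, 147.11, 10.0), (612.4, 147.11, 99.0), (500.3, 300.0, 5.0)]
# Only the first ion falls inside both windows.
print(mrm_intensity([t], ions)[t])  # -> 10.0
```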
1.4.3.3.1 Theoretical Background
The equations of motion for a charged particle in a quadrupole field are the result of a
force balance and can be derived as follows:
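$$qE = F = ma$$
Because the electric field is the negative gradient of the potential,
$$a = \frac{qE}{m} = -\frac{q}{m}\frac{\partial \Phi}{\partial x_i}$$
The electric potential inside the hyperbolic quadrupole field (Figure 1-1a) can be
expressed as
$$\Phi = \Phi_0 \frac{x^2 - y^2}{r_0^2}$$
where $2r_0$ is the separation of the rods of the quadrupole. Plugging this in to the force
balance expression yields,
$$\frac{\partial^2 x}{\partial t^2} = -\frac{2q}{m r_0^2}\Phi_0 x$$
$$\frac{\partial^2 y}{\partial t^2} = \frac{2q}{m r_0^2}\Phi_0 y$$
There are two components to the potential $\Phi_0$: an unchanging portion (U) and a variable
portion (V),
$$\Phi_0 = U + V \cos(\omega t)$$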
If the constant portion of the potential is zero (𝑈 = 0), then all ions above a critical mass
are stable in the field. The critical mass is a function of the frequency of the field’s
oscillation and represents those ions that are so small that they reach the rods before the
field has switched polarity. When operated in this manner, the quadrupole becomes an
ion trap.
The constant portion of the potential drives positive ions away from the positive rods and
towards the negative rods (Figure 1-1b). In the absence of a variable component to the
field, no positive ions resist this force and eventually all will crash into the negative rods.
However, the superposition of a variable voltage can rescue some ions from collisions
with the rods. If the magnitude of the variable voltage is larger than the unchanging
voltage (𝑉 > 𝑈), then some ions will be stabilized. To picture the field under this
condition, imagine the equipotential lines (Figure 1-1a) rotating. In the x-direction the
baseline potential of the rods is positive, driving all positive ions away for most of the
cycle. During the window of time when the net potential in the x-direction is negative,
smaller positive ions will have enough time to accelerate and crash into the rods before
the net potential switches back to positive. In this way, the x-direction only permits
masses above a critical value to remain stable, making it a high-pass mass filter. In the y-direction, the baseline potential of the rods is negative, attracting all positive ions for
most of the cycle. During the window of time when the net potential in the y-direction is
positive, smaller positive ions with less inertia will be more effectively repelled, leaving
the larger positive ions to crash into the rods. Hence, the y-direction only permits ions
below a critical mass to remain stable, making it a low-pass filter. It is the combination of
the high-pass x-direction and low-pass y-direction that provides the filtering capability of
the quadrupole.
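The high-pass and low-pass behavior described above can be checked numerically. The sketch below assumes the standard reduction of the quadrupole equations of motion to the dimensionless Mathieu form u'' = -(a + 2 q_M cos 2ξ) u, with ξ = ωt/2, a = 8qU/(m r0² ω²) and q_M = 4qV/(m r0² ω²); the equation for the other transverse direction is obtained by flipping the sign of both parameters. The parameter names `a_dc` and `q_rf`, the integration length, and the amplitude bound used as a proxy for stability are illustrative choices and are not part of the derivation above.

```python
import math


def is_stable(a_dc, q_rf, xi_max=200.0, step=0.01, bound=1e3):
    """Integrate u'' = -(a_dc + 2*q_rf*cos(2*xi)) * u with classical RK4.

    Returns True if |u| stays below `bound` up to xi_max, which this sketch
    treats as a numerical proxy for a stable trajectory.
    """
    def accel(xi, u):
        return -(a_dc + 2.0 * q_rf * math.cos(2.0 * xi)) * u

    u, v, xi = 1.0, 0.0, 0.0
    h = step
    for _ in range(int(xi_max / step)):
        k1u, k1v = v, accel(xi, u)
        k2u, k2v = v + 0.5 * h * k1v, accel(xi + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(xi + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(xi + h, u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        xi += h
        if abs(u) > bound:
            return False
    return True


def transmitted(a_dc, q_rf):
    """An ion passes the filter only if both transverse motions are stable."""
    return is_stable(a_dc, q_rf) and is_stable(-a_dc, -q_rf)


print(transmitted(0.0, 0.5))  # RF only, moderate q_rf
print(transmitted(0.0, 1.2))  # RF only, q_rf beyond the stability limit
print(transmitted(0.5, 0.3))  # large DC component
```

With no DC component (a = 0) the motion is stable only for q_M below roughly 0.908. Because q_M is inversely proportional to mass, this corresponds to the statement that all ions above a critical mass are stable when U = 0.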
The final piece of the quadrupole used in practice is the field applied in the z-direction.
The length of the quadrupole and the strength of the field applied determine the transit
time for ions to experience the x-y fields already described. To trap positive ions in the
quadrupole, both ends are supplied a positive potential. During use as a mass filter in a
triple quadrupole instrument, there is a decrease in potential applied in the z-direction to
accelerate positive ions279.
Figure 1-1: Quadrupole Potential and Ion Dynamics. a) Equipotential lines for a quadrupole electric field. Hyperbolic rods produce hyperbolic equipotential lines. b) The path of a positively charged ion in a quadrupole field.
In practice the quadrupole is operated at a range of U and V values. For any particular
m/z value the stable region looks like the one shown in Figure 1-2a, with the parameter
values of interest being those which produce both x-stability and y-stability. This overlap
region is a function of the m/z and can be visualized across a range (Figure 1-2b). To
produce a full scan mass spectrum, a range of U-V values is queried in
series. The slope of the U-V scan path determines the resolution: as the slope increases, the path
intersects a smaller section of the stable region of each mass, effectively narrowing the
window during which any one mass is stable and thereby improving resolution.
Unfortunately, this also decreases the signal intensity. Tracing along a linear U-V scan
path produces a mass spectrum at a specific resolution (Figure 1-2c).
Figure 1-2: a) Stable regions of U-V space for a particular m/z. The overlap of x-direction stable and y-direction stable gives the U-V pairs that can adequately isolate the m/z of interest. b) The overlap stable region for three unique masses for U > 0. Lines represent scan path in U-V settings to produce full spectra of varied resolution (R). c) Mass spectra produced at each resolution setting (R).
1.4.4 Analysis Types
When designing a mass spectrometry-based workflow, the type of data to be collected
will be determined by the ultimate goal of the analysis. Many studies seek to identify as
many proteins 280 or sites of post-translational modification as possible281, for which full
scan information is most useful. More recently, in-depth follow-up studies to tease out
mechanism and information flow through already identified PTMs have become more
popular282. It is these targeted quantitative studies that will benefit the most from multiple
reaction monitoring (MRM)283.
1.4.4.1 Data Dependent Tandem Mass Spectrometry
In a typical full scan MS2 experiment, target precursors are selected by their relative
abundance in the MS1 scan. Upon fragmentation, all ions produced are collected and
analyzed so that the entire MS2 spectrum for each selected precursor is recorded. Since
target selection is based on abundance, user bias towards particular proteins and
pathways is removed, and the resulting datasets have inspired databases and statistical methods that speed up data
processing and identification.
With the full scan information in hand, the user has more options on how to proceed. The
MS2 peaks can be sequenced de novo or matched to those predicted for peptides in a
database, drastically reducing the time required to process data. Database search methods
have also been used to calculate false discovery rates (FDRs) by comparing results for
true databases and random ones8, along with other more inventive statistical and machine
learning-based approaches to dealing with the wealth of information provided284,285.
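As a rough illustration of the database comparison described above, the following sketch estimates an FDR from one search against a true (target) database and one against a random (decoy) database. The score lists, the threshold, and the function name `estimate_fdr` are hypothetical, and the decoy-to-target ratio is only one common convention; it is not necessarily the calculation used in the cited work.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR at a score threshold as (# decoy hits) / (# target hits)."""
    target_hits = sum(1 for s in target_scores if s >= threshold)
    decoy_hits = sum(1 for s in decoy_scores if s >= threshold)
    if target_hits == 0:
        return 0.0  # nothing accepted, so no false discoveries to report
    return min(1.0, decoy_hits / target_hits)


# Hypothetical match scores from the two searches (higher is better).
target = [12.0, 25.5, 31.2, 44.0, 52.3, 18.7, 39.9, 61.0]
decoy = [8.1, 11.4, 14.9, 22.0, 9.3, 30.5, 13.2, 17.8]

fdr_loose = estimate_fdr(target, decoy, 20.0)   # 2 decoy hits / 6 target hits
fdr_strict = estimate_fdr(target, decoy, 40.0)  # 0 decoy hits / 3 target hits
```

Raising the threshold lowers the estimated FDR at the cost of accepting fewer identifications.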
The sub-discipline of proteomics focused on identifying sites of post-translational
modification would not be possible without full scan information and the aforementioned
database processing. Side-chains of key residues may be modified, thus changing the
mass of the peptide and its fragments in a predictable manner. Many sites of modification
can then be identified from the same sample and their biological significance
investigated.
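The predictable mass change can be made concrete with a small calculation. The sketch below assumes a hypothetical peptide (AGYSK) carrying a single phosphorylation on tyrosine and uses standard monoisotopic masses; the helper names are illustrative only. It shows that only the y-ions containing the modified residue shift by the mass of the modification.

```python
# Monoisotopic residue masses (Da) for the residues used in this example.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "Y": 163.06333, "K": 128.09496}
WATER = 18.01056
PROTON = 1.00728
PHOSPHO = 79.96633  # HPO3 added to a Ser, Thr, or Tyr side chain


def residue_masses(sequence, mods=None):
    """Residue masses with optional mass deltas keyed by 0-based position."""
    mods = mods or {}
    return [RESIDUE_MASS[aa] + mods.get(i, 0.0) for i, aa in enumerate(sequence)]


def peptide_mass(sequence, mods=None):
    """Neutral monoisotopic mass of the intact peptide."""
    return sum(residue_masses(sequence, mods)) + WATER


def y_ions(sequence, mods=None):
    """Singly charged y-ion m/z values, from y1 up to y(n-1)."""
    masses = residue_masses(sequence, mods)
    ions = []
    running = WATER + PROTON
    for mass in reversed(masses[1:]):  # build from the C-terminus
        running += mass
        ions.append(running)
    return ions


peptide = "AGYSK"                     # hypothetical sequence
phospho_site = {2: PHOSPHO}           # phosphotyrosine at position 2 (0-based)
unmodified = y_ions(peptide)
phosphorylated = y_ions(peptide, phospho_site)
shifts = [p - u for u, p in zip(unmodified, phosphorylated)]
```

Here y1 (K) and y2 (SK) are unchanged, while y3 (YSK) and y4 (GYSK) are heavier by the phosphate mass, which is what localizes the modification to the tyrosine.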
The tradeoffs that must be made to attain this full scan information are not trivial. First
and foremost, the instruments used for these analyses are prohibitively expensive. Few
academic labs have the resources to lease or buy the high-end trap-based instruments
used in cutting edge proteomics.
On a computational note, the file size associated with full-scan data collection methods is
large. It is not uncommon for 2-3 hour experiments to generate files in the hundreds of
MB or GB range. Parsing and processing these large files requires more computing
resources and time. In addition, the number of potential identifications produced in
discovery runs leads many groups to avoid the time-consuming step of thorough
validation of each individual identification, instead opting for dataset-scale statistics like
false discovery rate (FDR)286. While this approach may prove sufficient for large-scale
statistical modeling, it raises concern for single-hit follow-up biological studies based
on the findings.
The final drawback of full scan analyses run on trap-based instruments is the issue of
dynamic range, especially in regard to quantitative data collection methods aimed at
comparing signals of drastically different scale. Consider an isotopically-labeled sample
where the abundance of a particular species is orders of magnitude larger in one condition
as compared to the rest. Peptides derived from the abundant condition will make up an
overwhelming majority of the peptide pool passed to the trap. The remaining conditions
may be of such low abundance that they contribute only a handful of ions, which is
problematic when the count is small enough to be subject to significant stochastic variation.
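The scale of this effect can be estimated with a simple counting model. The sketch below assumes, purely for illustration, that the trap holds a fixed number of ions for a given precursor and that ion counts follow Poisson statistics, so the relative standard deviation of a count N is 1/sqrt(N). The capacity and abundance values are hypothetical and do not describe any particular instrument.

```python
import math


def ions_per_condition(trap_capacity, relative_abundances):
    """Split a fixed ion population among conditions by relative abundance."""
    total = sum(relative_abundances)
    return [trap_capacity * a / total for a in relative_abundances]


def counting_cv(expected_ions):
    """Relative standard deviation of a Poisson count: 1 / sqrt(N)."""
    return 1.0 / math.sqrt(expected_ions)


# Hypothetical four-condition experiment: one condition is 1000-fold more abundant.
capacity = 10_000
abundances = [1000, 1, 1, 1]
counts = ions_per_condition(capacity, abundances)
cvs = [counting_cv(n) for n in counts]
```

Under these assumptions the abundant condition contributes nearly all of the ions and is measured with about 1% counting error, while each remaining condition contributes roughly ten ions and carries a counting error above 30%.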
1.4.4.2 MRM
Multiple reaction monitoring (MRM) differs from full scan data collection in that only
very specific data are collected. Usually, the goal of an MRM experiment is to confirm the
presence of a sufficient quantity of a small set of targets, which is why MRM finds a
perfect niche examining athlete blood samples for illegal compounds287. In an MRM
analysis, instead of collecting full MS2 sequence data for any abundant signal, a
diagnostic set of precursor/fragment mass pairs, termed “transitions”, is selected and
tracked throughout the course of data collection278,288. Sufficient signal on the transitions
derived from a particular target is taken as evidence of the target’s presence in the
sample.
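A transition list and the presence test described above can be sketched in a few lines. The target names, m/z values, intensities, and cutoffs below are hypothetical, and requiring at least two transitions above the cutoff is an assumed rule chosen for illustration rather than a standard taken from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One precursor/fragment mass pair monitored for a given target."""
    target: str
    precursor_mz: float
    fragment_mz: float


def targets_detected(transitions, intensities, min_intensity, min_transitions=2):
    """Return targets with enough transitions at or above the intensity cutoff."""
    counts = {}
    for t in transitions:
        counts.setdefault(t.target, 0)
        if intensities.get(t, 0.0) >= min_intensity:
            counts[t.target] += 1
    return {name for name, n in counts.items() if n >= min_transitions}


# Hypothetical transition list and summed signal per transition.
transitions = [
    Transition("peptide_A", 523.3, 704.4),
    Transition("peptide_A", 523.3, 817.5),
    Transition("peptide_B", 611.8, 498.2),
    Transition("peptide_B", 611.8, 932.6),
]
observed = {
    transitions[0]: 5200.0,
    transitions[1]: 4100.0,
    transitions[2]: 3900.0,
    transitions[3]: 40.0,
}
detected = targets_detected(transitions, observed, min_intensity=1000.0)
```

In this example only peptide_A is reported, because peptide_B has a single transition above the cutoff.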
One of the largest benefits of MRM experimental pipelines is the reproducibility of target
detection, with recovery rates as high as 99% when combined across multiple analyses283, which
facilitates straightforward data collection and analysis. Once a vetted set of transitions
has been developed, the analysis of additional samples is quite simple. Data processing
consists of scanning the resulting data for sufficient signal on each of the desired targets.
The file size is generally small, as only the intensities of key transitions are stored.
The dynamic range permitted by an MRM analysis on a triple quadrupole (TQ) mass
spectrometer is considerably larger than that of trap-based instruments, reported to be as
high as 10^5 (ref. 278). At no time during the ion’s lifetime in the instrument is it stored in a pool of other
ions, meaning that the reported intensity of any ion is dependent on its abundance in the
sample (as well as its ionization potential), which is critically important for low
abundance species. A TQ accomplishes this by measuring ions in a constant stream for
fixed periods of time. The first quadrupole (Q1) is tuned to isolate the target precursor.
The second quadrupole (Q2) is used as a collision chamber to fragment the single
precursor stream produced by Q1. The third quadrupole (Q3) isolates a target fragment
mass from the output of Q2 and directs it to the detector288. Once data has been
collected, the system shifts to the next transition.
The ability of the MRM system to focus on specific transitions also has its drawbacks, the
first of which is the speed of acquisition. Because the quadrupoles must be
specifically tuned to measure each transition, the duty cycle increases very quickly as transitions are added278.
The duty cycle is the combined time spent scanning all transitions in the target list before
returning to scan again. When coupled to liquid chromatography, peaks corresponding to
targets may only elute for short periods. To ensure multiple collections during the elution
peak, the duty cycle must be minimized while still permitting adequate collection for
each transition. This constraint limits the number of transitions that may realistically be
targeted in a single analysis. For complex samples, the list of desired targets may quickly
exceed the capabilities of the instrument.
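The constraint can be estimated with simple arithmetic. The sketch below treats the duty cycle, as defined above, as the number of transitions multiplied by the time spent on each one. The dwell time, per-transition switching overhead, and elution peak width are hypothetical values chosen only to show the calculation.

```python
def cycle_time_ms(n_transitions, dwell_ms, overhead_ms):
    """Time to scan every transition once before returning to the first."""
    return n_transitions * (dwell_ms + overhead_ms)


def points_across_peak(peak_width_ms, n_transitions, dwell_ms, overhead_ms):
    """Number of measurements of each transition during one elution peak."""
    return peak_width_ms / cycle_time_ms(n_transitions, dwell_ms, overhead_ms)


def max_transitions(peak_width_ms, min_points, dwell_ms, overhead_ms):
    """Largest transition list that still gives min_points per elution peak."""
    return peak_width_ms // (min_points * (dwell_ms + overhead_ms))


# Hypothetical settings: 20 ms dwell, 5 ms switching overhead, 30 s wide peak.
cycle = cycle_time_ms(100, 20, 5)               # 2500 ms for 100 transitions
points = points_across_peak(30_000, 100, 20, 5)  # 12 points across the peak
limit = max_transitions(30_000, 10, 20, 5)       # 120 transitions at 10 points
```

Shortening the dwell time raises the limit, but it also reduces the signal collected for each transition, which is the tradeoff noted above.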
In addition, full MS2 sequence information is not collected. This limitation poses a
problem for confirming the identity of peaks in a transition’s elution profile. Multiple
species in the sample may produce a similar fragment mass from a similar precursor
mass. Such co-occurrences make it difficult to decide which peak corresponds to the
species of interest. This problem is mitigated by tracking coelution of multiple fragments
from the same precursor288 at the expense of increasing the duty cycle. To further
increase certainty of chromatographic peak identification, heavy-labeled standards are
often included to prove coelution278.
1.4.5 Labeling Techniques
The standard LC-MS protocol does not allow for concurrent comparison between
samples. Peak intensities are often recorded and compared between analyses, but are
subject to complicating factors (including, but not limited to, altered chromatography,
sample-specific processing losses, and instrument cleanliness). Instead, several methods have been
developed to allow combination of multiple samples into a single analysis for better
direct comparison.
1.4.5.1 Covalent Isotopic Labeling (iTRAQ, TMT)
Covalent isotopic labeling techniques (such as iTRAQ289 and TMT290) provide one
method to directly analyze multiple samples simultaneously291. Both strategies rely upon
attaching a unique tag to peptides derived from each sample. Each sample-specific tag
belongs to a family of reagents that add identical mass to all peptides from all samples.
However, upon fragmentation, each tag of the family produces a distinct fragment mass
depending on the balance of heavy/light isotopes contained in the jettisoned fragment. The
abundance of each member of the tag family reflects the relative abundance of the
peptide in the corresponding sample.
One of the major drawbacks to using fragment-based quantification techniques is the
possibility for contamination. If multiple peptides
are included in the MS1 isolation window, then the corresponding quantification will be
their combination. In many cases, the contamination is a constant amount adding to all
signals equally, which reduces the relative ratio observed292.
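The ratio reduction caused by a constant contaminant follows directly from the arithmetic. In the sketch below, the reporter intensities and the contaminating signal are hypothetical numbers, and the contaminant is assumed to add equally to both channels as described above.

```python
def observed_ratio(signal_a, signal_b, contamination):
    """Reporter ratio when a constant contaminant adds to both channels."""
    return (signal_a + contamination) / (signal_b + contamination)


# Hypothetical reporter intensities with a true 10-fold difference.
true_ratio = observed_ratio(1000.0, 100.0, 0.0)          # no contamination
compressed_ratio = observed_ratio(1000.0, 100.0, 100.0)  # equal contaminant added
```

With these values the true 10-fold difference is reported as a 5.5-fold difference, and the reported ratio moves closer to 1 as the contaminating signal grows.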
1.4.5.2 SILAC
Stable isotope labeling by amino acids in cell culture (SILAC) is an alternative labeling technique which
installs isotopic signatures while cells are being grown293. Heavy-labeled versions of
essential amino acids are substituted into the media of one treatment condition. The
sample ultimately derived from these cells will have incorporated heavy-labeled amino
acids into all of its proteins. Combining lysate from a heavy-labeled culture and a
light-labeled culture allows the contribution of each to be determined at the MS1 level based
on the differential precursor mass. The benefit to SILAC-based quantification is the early
stage in the processing pipeline at which samples are combined, thus mitigating
sample-specific processing losses. The drawback is the upfront cost to properly label cells by
growing them in expensive isotope-doped media.
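The differential precursor mass can be computed from the label and the charge state. The sketch below assumes, for illustration only, a label in which all six carbons of one residue are replaced with C13 (the text above does not specify a label), and the precursor m/z and intensities are hypothetical.

```python
C13_MINUS_C12 = 1.0033548  # mass difference between C13 and C12 in Da


def heavy_precursor_mz(light_mz, charge, labeled_residues, carbons_per_residue=6):
    """m/z of the heavy partner of a light precursor at the same charge state."""
    mass_shift = labeled_residues * carbons_per_residue * C13_MINUS_C12
    return light_mz + mass_shift / charge


def heavy_to_light_ratio(heavy_intensity, light_intensity):
    """Relative contribution of the heavy-labeled culture at the MS1 level."""
    return heavy_intensity / light_intensity


# Hypothetical doubly charged peptide containing one labeled residue.
light_mz = 500.0
heavy_mz = heavy_precursor_mz(light_mz, charge=2, labeled_residues=1)
ratio = heavy_to_light_ratio(heavy_intensity=3.0e5, light_intensity=1.5e5)
```

Because the shift is divided by the charge, the heavy and light peaks of a doubly charged precursor sit about 3 m/z units apart for this assumed label.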
1.5 Absolute Site-Specific Quantification
1.5.1 Motivation for Absolute Quantification
When trying to understand a signal-processing network, the most informative quantity is
the absolute amount of peptide present under different sample conditions. This quantity is
well suited for a wide range of mechanistic and statistical modeling, because of the
wealth of comparisons it permits between nodes in the network. Not only can the relative
amount of a signal be compared to other treatment conditions within the same analysis,
but also between multiple analyses. Inter-experiment comparison of a single signal alone
is not sufficient motivation for absolute quantification, as a properly utilized reference
sample can recapitulate the same information. Absolute quantification is indispensable
when comparing between multiple signals across many biological conditions.
The technical issue is that measured signal intensity does not provide the absolute amount
of the peptide. Many sources of peptide-specific losses exist throughout the sample
processing procedure and ultimately contribute to sequence-specific signal
intensity/amount conversions. Some of these complications include: peptide ionizability,
losses during sample processing, and differential affinities for immunoprecipitation
antibodies and metal ions. For these and other reasons, sequence-specific calibration must
be employed to provide absolute quantification data to enhance modeling efforts.
1.5.2 Previous Methods
All absolute quantification techniques are based on comparing the integrated area under
the signal intensity trace corresponding to specific ions. Some work at the MS1 level by
collecting ion intensity information for the intact endogenous peptide for comparison to
that of the standard peptide. Others that involve isotopic labeling rely upon comparison
between fragment ions correlated to peptide abundance. Each has its associated
strengths and weaknesses.
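The integrated area under a signal intensity trace can be approximated numerically. The sketch below uses the trapezoidal rule on a hypothetical elution profile; the time points, intensities, and function name are illustrative, and real data processing software may use other integration or peak-fitting schemes.

```python
def peak_area(times, intensities):
    """Trapezoidal-rule area under an intensity trace sampled at given times."""
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


# Hypothetical elution profile: retention time in seconds, intensity in counts.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
endogenous = [0.0, 10.0, 20.0, 10.0, 0.0]
standard = [0.0, 20.0, 40.0, 20.0, 0.0]

endogenous_area = peak_area(times, endogenous)
standard_area = peak_area(times, standard)
area_ratio = endogenous_area / standard_area
```

The ratio of the two areas, rather than either raw area, is what is carried forward when an endogenous peptide is compared to its standard.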
1.5.2.1 Single Point Internal Methods
Single point internal methods add a known amount of a standard to the sample.
Differential labeling is required to avoid obscuring the endogenous signal and can be
accomplished by a number of methods. The added internal standard can be heavy-labeled
so as to shift the mass and avoid conflict with the endogenous version. Alternatively, the
standard peptides can be distinctly labeled with an isotopic tag.
1.5.2.1.1 iTRAQ Internal Standard
In the seminal paper for the iTRAQ labeling reagent289, Ross and Pappin proposed that
their new technique could be used not only for relative but also absolute quantification.
They envisioned reserving a single tag of the family to label a known amount of a peptide
assumed to be present in the samples labeled with the remaining tags. This way, the
amount of peptide present in each sample could be calculated by way of relative peak
intensity resulting from the tags.
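The single point calculation amounts to scaling the known standard amount by a ratio of reporter intensities. The sketch below uses hypothetical channel names, intensities, and a hypothetical 50 fmol standard, and it assumes the reporter intensity is proportional to the peptide amount across the range of interest.

```python
def amounts_from_standard(sample_intensities, standard_intensity, standard_amount):
    """Scale a known standard amount by each sample/standard intensity ratio."""
    return {
        channel: standard_amount * intensity / standard_intensity
        for channel, intensity in sample_intensities.items()
    }


# Hypothetical reporter intensities; one channel holds 50 fmol of standard.
samples = {"sample_1": 1000.0, "sample_2": 4000.0, "sample_3": 2000.0}
amounts_fmol = amounts_from_standard(samples, standard_intensity=2000.0,
                                     standard_amount=50.0)
```

Under the proportionality assumption, a sample with twice the reporter intensity of the standard channel is assigned twice the standard amount.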
The strength of this method over traditional area under the curve (AUC) calibration is
that the standard peptide is partially in the sample context. It is analyzed simultaneously
with the samples allowing compensation for run-specific variation in chromatography or
charge availability. Any peptide specific losses that occur post-labeling will be accounted
for by corresponding losses of the standard. Additionally, since the iTRAQ labeling
technique is being used, the standard peptide can be synthesized using light amino acids
instead of the heavy-labeled ones necessary for other related calibration techniques. By
including the isotopically-labeled peptide in the sample, the identity can further be
confirmed by coelution of the standard and endogenous peptides.
The major weakness of this method is that it is a single point calibration. A single amount
of the standard peptide must be chosen a priori to be as close as possible to the expected
endogenous amounts. If the expected amount is drastically different or varies widely
between samples, then the quality of calibration will suffer at the extremes. A partial
remedy to this problem is to reserve more channels of the iTRAQ experiment for
standard peptides at different concentrations, a losing proposition as fewer samples may
be analyzed simultaneously.
1.5.2.1.2 Heavy-Labeled
Heavy-labeled peptide standards match the sequence and post-translational modification
state of the peptide of interest. The only change is that a residue of the sequence has had
its C12 replaced with C13, N14 with N15, O16 with O18, and/or H1 with H2. The
heavy-labeled standard peptide now has a very slight mass increase that shifts it beyond the
isotope series of its endogenous counterpart. A known amount of the standard can then be
included in the sample prior to any processing without fear of contaminating the ultimate
endogenous signal intensity.
There are a number of benefits to using a heavy-labeled standard approach for absolute
quantification. First, the standard may be included prior to any sample processing
and will experience the same sequence-specific losses as the endogenous peptide, other
than those from proteolytic enzyme cleavage
(which has been shown to be insignificant under the proper digestion conditions12). Second, since
the peptide being included is heavy-labeled, it is already distinguishable from the
endogenous version and no further labeling technique is required. Third, the endogenous
and standard peptides share the same amino acid sequence, so they interact with the
chromatographic packing material identically. Hence, coelution of the two is expected
and can be used as further verification of endogenous peptide identity.
The heavy-labeled peptide standard technique suffers from the same single-point
calibration concerns as the covalent modification techniques. As before, there is a way to
circumvent them: include more than one type of heavy-labeled standard at measured
amounts. The multiple-point calibration can then be carried out in the MS1 scan within
the dynamic range limits of trap-based instruments. The issue is the production cost of
heavy-labeled standards, which are orders of magnitude more expensive to synthesize
than regular peptides due to the purification necessary to ensure fully C13-incorporated
amino acids.
1.5.2.2 External Standard
Using an external standard is the most straightforward way to do absolute quantification.
Known amounts of synthetic peptide are loaded and analyzed. This technique is widely
used for Amino Acid Analysis (AAA), where peptides are hydrolyzed so that their
constituent amino acids can be measured and calibrated against signal intensity from
known amounts. Success of the methodology in measuring intact peptides has been
established294.
The major strength of this method is that it is a multiple-point calibration. Amounts
spanning many orders of magnitude can be measured to better understand the relationship
between signal intensity and amount. The noise floor and the saturation can be
individually determined for each peptide. Moreover, multiple-point calibration is
achieved without the need for expensive isotopic labeling reagents or isotopically pure
amino acid preparations.
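A multiple-point standard curve can be fit and then inverted to convert a measured intensity into an amount. The sketch below assumes a power-law relationship that is linear on log-log axes and fits it by ordinary least squares; the amounts and intensities are hypothetical, and in practice points below the noise floor or above saturation would be excluded before fitting.

```python
import math


def fit_log_log(amounts, intensities):
    """Least-squares fit of log10(intensity) = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in amounts]
    ys = [math.log10(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_intensity(intensity, slope, intercept):
    """Invert the fitted curve to estimate the amount behind an intensity."""
    return 10 ** ((math.log10(intensity) - intercept) / slope)


# Hypothetical standard curve spanning three orders of magnitude (fmol, counts).
standard_amounts = [1.0, 10.0, 100.0, 1000.0]
standard_intensities = [2.0e3, 2.0e4, 2.0e5, 2.0e6]

slope, intercept = fit_log_log(standard_amounts, standard_intensities)
estimated_amount = amount_from_intensity(5.0e4, slope, intercept)
```

With this idealized data the fitted slope is 1 and a measured intensity of 5.0e4 corresponds to about 25 fmol.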
The technical drawbacks to using the external standard intensities for absolute
quantification arise from the lack of sample context. When analyzing complex biological
samples many factors may influence a peptide’s signal intensity. Competition for charge
with coeluting peptides, cleanliness of the instrument, and chromatographic variations
may all influence the observed intensity and introduce error in calibration.
The other consideration with external calibration techniques is the total experimental time
required. Before absolute data can be collected, a standard curve for each peptide of
interest must be produced. Some parallelization is possible, since multiple peptide
sequences can be measured in the same analysis. However, each amount must be
run in a separate analysis to properly determine its individual contribution. In its
typical manifestation, the external standard calibration method is used to measure a single
sample, which means that examining a wide range of experimental conditions in series
will take considerably more time than multiplex labeling techniques.
1.6 Bibliography
1.
Taylor, S. S. & Kornev, A. P. Protein kinases: evolution of dynamic regulatory
proteins. Trends Biochem. Sci. 36, 65–77 (2011).
2.
Ozlu, N. et al. Phosphoproteomics. Wiley Interdiscip Rev Syst Biol Med 2, 255–
276 (2010).
3.
Pawson, T. & Nash, P. Protein-protein interactions define specificity in signal
transduction. Genes Dev. 14, 1027–1047 (2000).
4.
Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation.
Pharmacol. Res. 66, 105–143 (2012).
5.
Wheeler, D. L., Iida, M. & Dunn, E. F. The role of Src in solid tumors.
Oncologist 14, 667–678 (2009).
6.
Gan, H. K., Cvrljevic, A. N. & Johns, T. G. The epidermal growth factor receptor
variant III (EGFRvIII): where wild things are altered. FEBS J. 280, 5350–5370
(2013).
7.
Naegle, K. M. et al. PTMScout, a Web resource for analysis of high throughput
post-translational proteomics studies. Molecular & Cellular Proteomics 9, 2558–
2570 (2010).
8.
Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based
protein identification by searching sequence databases using mass spectrometry
data. Electrophoresis 20, 3551–3567 (1999).
9.
Yates, J. R., Eng, J. K., McCormack, A. L. & Schieltz, D. Method to correlate
tandem mass spectra of modified peptides to amino acid sequences in the protein
database. Anal. Chem. 67, 1426–1436 (1995).
10.
Curran, T. G., Bryson, B. D., Reigelhaupt, M., Johnson, H. & White, F. M.
Computer aided manual validation of mass spectrometry-based proteomic data.
Methods 61, 219–226 (2013).
11.
Chen, W. W. et al. Input-output behavior of ErbB signaling pathways as revealed
by a mass action model trained against dynamic data. Mol Syst Biol 5, 239
(2009).
12.
Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute
quantification of proteins and phosphoproteins from cell lysates by tandem MS.
Proc. Natl. Acad. Sci. U.S.A. 100, 6940–6945 (2003).
13.
Schmidt, A. et al. Absolute quantification of microbial proteomes at different
states by directed mass spectrometry. Mol Syst Biol 7, 510 (2011).
14.
Ferlay, J. et al. Estimates of worldwide burden of cancer in 2008: GLOBOCAN
2008. Int. J. Cancer 127, 2893–2917 (2010).
15.
Jemal, A. et al. Global cancer statistics. CA: A Cancer Journal for Clinicians 61,
69–90 (2011).
16.
Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57–70
(2000).
17.
Hanahan, D. & Weinberg, R. A. Hallmarks of Cancer: The Next Generation. Cell
144, 646–674 (2011).
18.
Ferlay, J., Héry, C., Autier, P. & Sankaranarayanan, R. in 1–19 (Springer New
York, 2009). doi:10.1007/978-1-4419-0685-4_1
19.
Holliday, D. L. & Speirs, V. Choosing the right cell line for breast cancer
research. Breast Cancer Res. 13, 215 (2011).
20.
Neve, R. M. et al. A collection of breast cancer cell lines for the study of
functionally distinct cancer subtypes. Cancer Cell 10, 515–527 (2006).
21.
Fridlyand, J. et al. Breast tumor copy number aberration phenotypes and
genomic instability. BMC Cancer 6, 96 (2006).
22.
Perou, C. M. et al. Distinctive gene expression patterns in human mammary
epithelial cells and breast cancers. Proc. Natl. Acad. Sci. U.S.A. 96, 9212–9217
(1999).
23.
Soule, H. D. et al. Isolation and characterization of a spontaneously immortalized
human breast epithelial cell line, MCF-10. Cancer Research 50, 6075–6086
(1990).
24.
Beerli, R. R. & Hynes, N. E. Epidermal growth factor-related peptides activate
distinct subsets of ErbB receptors and differ in their biological activities. The
Journal of Biological Chemistry 271, 6071–6076 (1996).
25.
Tsutsui, S., Ohno, S., Murakami, S., Hachitanda, Y. & Oda, S. Prognostic value
of epidermal growth factor receptor (EGFR) and its relationship to the estrogen
receptor status in 1029 patients with breast cancer. Breast Cancer Res. Treat. 71,
67–75 (2002).
26.
Ferrero, J. M. et al. Epidermal growth factor receptor expression in 780 breast
cancer patients: a reappraisal of the prognostic value based on an eight-year
median follow-up. Ann. Oncol. 12, 841–846 (2001).
27.
Rampaul, R. S. et al. Epidermal growth factor receptor status in operable
invasive breast cancer: is it of any prognostic value? Clin. Cancer Res. 10, 2578
(2004).
28.
Bhargava, R. et al. EGFR gene amplification in breast cancer: correlation with
epidermal growth factor receptor mRNA and protein expression and HER-2
status and absence of EGFR-activating mutations. Mod Pathol 18, 1027–1033
(2005).
29.
Rimawi, M. F. et al. Epidermal growth factor receptor expression in breast
cancer association with biologic phenotype and clinical outcomes. Cancer 116,
1234–1242 (2010).
30.
Ciardiello, F. Epidermal growth factor receptor tyrosine kinase inhibitors as
anticancer agents. Drugs 60 Suppl 1, 25–32; discussion 41–42 (2000).
31.
Mendelsohn, J. & Baselga, J. The EGF receptor family as targets for cancer
therapy. Oncogene 19, 6550–6565 (2000).
32.
Agrawal, A. Overview of tyrosine kinase inhibitors in clinical breast cancer.
Endocr. Relat. Cancer 12, S135–S144 (2005).
33.
Wakeling, A. E. et al. ZD1839 (Iressa): An Orally Active Inhibitor of Epidermal
Growth Factor Signaling with Potential for Cancer Therapy. Cancer Research
62, 5749–5754 (2002).
34.
Knowlden, J. M. et al. Elevated levels of epidermal growth factor receptor/c-erbB2 heterodimers mediate an autocrine growth regulatory pathway in
tamoxifen-resistant MCF-7 cells. Endocrinology 144, 1032–1044 (2003).
35.
Rusnak, D. W. et al. The Effects of the Novel, Reversible Epidermal Growth
Factor Receptor/ErbB-2 Tyrosine Kinase Inhibitor, GW2016, on the Growth of
Human Normal and Tumor-derived Cell Lines. Molecular Cancer Therapeutics
85–94 (2001).
36.
Cohen, S. Isolation of a Mouse Submaxillary Gland Protein Accelerating Incisor
Eruption and Eyelid Opening in the New-born Animal. The Journal of
Biological Chemistry 237, 1555–1562 (1962).
37.
King, L. E., Carpenter, G. & Cohen, S. Characterization by electrophoresis of
epidermal growth factor stimulated phosphorylation using A-431 membranes.
Biochemistry 19, 1524–1528 (1980).
38.
Carpenter, G., King, L. & Cohen, S. Epidermal growth factor stimulates
phosphorylation in membrane preparations in vitro. Nature 276, 409–410 (1978).
39.
Ushiro, H. & Cohen, S. Identification of phosphotyrosine as a product of
epidermal growth factor-activated protein kinase in A-431 cell membranes. The
Journal of Biological Chemistry 255, 8363–8365 (1980).
40.
Ullrich, A. et al. Human epidermal growth factor receptor cDNA sequence and
aberrant expression of the amplified gene in A431 epidermoid carcinoma cells.
Nature 309, 418–425 (1984).
41.
Downward, J. et al. Close similarity of epidermal growth factor receptor and v-erb-B oncogene protein sequences. Nature 307, 521–527 (1984).
42.
Downward, J., Parker, P. & Waterfield, M. D. Autophosphorylation sites on the
epidermal growth factor receptor. Nature 311, 483–485 (1984).
43.
Honnegger, A. M., Kris, M. G., Ullrich, A. & Schlessinger, J. Evidence that
autophosphorylation of solubilized receptors for epidermal growth factor is
mediated by intermolecular cross-phosphorylation. 1–5 (1989).
44.
Carpenter, G. & Cohen, S. Epidermal growth factor. The Journal of Biological
Chemistry 265, 7709–7712 (1990).
45.
Heath, W. F. & Merrifield, R. B. A synthetic approach to structure-function
relationships in the murine epidermal growth factor molecule. Proc. Natl. Acad.
Sci. U.S.A. 83, 6367–6371 (1986).
46.
Tam, J. P., Sheikh, M. A., Solomon, D. S. & Ossowski, L. Efficient synthesis of
human type alpha transforming growth factor: its physical and biological
characterization. Proc. Natl. Acad. Sci. U.S.A. 83, 8082–8086 (1986).
47.
Cooke, R. M. et al. The solution structure of human epidermal growth factor.
Nature 327, 339–341 (1987).
48.
Montelione, G. T., Wüthrich, K., Nice, E. C., Burgess, A. W. & Scheraga, H. A.
Solution structure of murine epidermal growth factor: determination of the
polypeptide backbone chain-fold by nuclear magnetic resonance and distance
geometry. Proc. Natl. Acad. Sci. U.S.A. 84, 5226–5230 (1987).
49.
Montelione, G. T. et al. Sequence-specific 1H-NMR assignments and
identification of two small antiparallel beta-sheets in the solution structure of
recombinant human transforming growth factor alpha. Proc. Natl. Acad. Sci.
U.S.A. 86, 1519–1523 (1989).
50.
Tappin, M. J., Cooke, R. M., Fitton, J. E. & Campbell, I. D. A high-resolution
1H-NMR study of human transforming growth factor alpha. Structure and pH-dependent conformational interconversion. Eur. J. Biochem. 179, 629–637
(1989).
51.
Klein, P., Mattoon, D., Lemmon, M. A. & Schlessinger, J. A structure-based
model for ligand binding and dimerization of EGF receptors. Proc. Natl. Acad.
Sci. U.S.A. 101, 929–934 (2004).
52.
Macdonald, J. L. & Pike, L. J. Heterogeneity in EGF-binding affinities arises
from negative cooperativity in an aggregating system. Proc. Natl. Acad. Sci.
U.S.A. 105, 112–117 (2008).
53.
Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative
Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579
(2010).
54.
Bargmann, C. I., Hung, M. C. & Weinberg, R. A. Multiple independent
activations of the neu oncogene by a point mutation altering the transmembrane
domain of p185. Cell 45, 649–657 (1986).
55.
Lu, C. et al. Structural evidence for loose linkage between ligand binding and
kinase activation in the epidermal growth factor receptor. Mol. Cell. Biol. 30,
5432–5443 (2010).
56.
Scheck, R. A., Lowder, M. A., Appelbaum, J. S. & Schepartz, A. Bipartite
Tetracysteine Display Reveals Allosteric Control of Ligand-Specific EGFR
Activation. ACS Chem. Biol. 7, 1367–1376 (2012).
57.
Woo, D. D. et al. Differential phosphorylation of the progesterone receptor by
insulin, epidermal growth factor, and platelet-derived growth factor receptor
tyrosine protein kinases. The Journal of Biological Chemistry 261, 460–467
(1986).
58.
Lemmon, M. A. & Schlessinger, J. Cell signaling by receptor tyrosine kinases.
Cell 141, 1117–1134 (2010).
59.
Garrett, T. P. J. et al. Crystal structure of a truncated epidermal growth factor
receptor extracellular domain bound to transforming growth factor alpha. Cell
110, 763–773 (2002).
60.
Schlessinger, J. Ligand-induced, receptor-mediated dimerization and activation
of EGF receptor. Cell 110, 669–672 (2002).
61.
Burgess, A. W. et al. An open-and-shut case? Recent insights into the activation
of EGF/ErbB receptors. Molecular Cell 12, 541–552 (2003).
62.
Ferguson, K. M. et al. EGF activates its receptor by removing interactions that
autoinhibit ectodomain dimerization. Molecular Cell 11, 507–517 (2003).
63.
Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein
interaction network for the ErbB receptors using protein microarrays. Nature
439, 168–174 (2005).
64.
Schlessinger, J. et al. Regulation of cell proliferation by epidermal growth factor.
CRC Crit. Rev. Biochem. 14, 93–111 (1983).
65.
Sewell, J., Smyth, J. & Langdon, S. Role of TGFα stimulation of the
ERK, PI3 kinase and PLCγ pathways in ovarian cancer growth and migration.
Experimental Cell Research 304, 305–316 (2005).
66.
McClintock, J. L. & Ceresa, B. P. Transforming Growth Factor-α Enhances
Corneal Epithelial Cell Migration by Promoting EGFR Recycling. Investigative
Ophthalmology & Visual Science 51, 3455–3461 (2010).
67.
Joslin, E. J., Opresko, L. K., Wells, A., Wiley, H. S. & Lauffenburger, D. A.
EGF-receptor-mediated mammary epithelial cell migration is driven by sustained
ERK signaling from autocrine stimulation. Journal of Cell Science 120, 3688–
3699 (2007).
68.
Mukhopadhyay, C., Zhao, X., Maroni, D., Band, V. & Naramura, M. Distinct
Effects of EGFR Ligands on Human Mammary Epithelial Cell Differentiation.
PLoS ONE 8, e75907 (2013).
69.
Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting
of the Receptor. Traffic 10, 1115–1127 (2009).
70.
Wilson, K. J., Gilmore, J. L., Foley, J., Lemmon, M. A. & Riese, D. J. Functional
selectivity of EGF family peptide growth factors: implications for cancer.
Pharmacol. Ther. 122, 1–8 (2009).
71.
Harris, R. C., Chung, E. & Coffey, R. J. EGF receptor ligands. Experimental Cell
Research 284, 2–13 (2003).
72.
Iwamoto, R. & Mekada, E. Heparin-binding EGF-like growth factor: a juxtacrine
growth factor. Cytokine Growth Factor Rev. 11, 335–344 (2000).
73.
Sahin, U. et al. Distinct roles for ADAM10 and ADAM17 in ectodomain
shedding of six EGFR ligands. J. Cell Biol. 164, 769–779 (2004).
74.
Lu, X. et al. ADAMTS1 and MMP1 proteolytically engage EGF-like ligands in
an osteolytic signaling cascade for bone metastasis. Genes Dev. 23, 1882–1894
(2009).
75.
Chokki, M., Eguchi, H., Hamamura, I., Mitsuhashi, H. & Kamimura, T. Human
airway trypsin-like protease induces amphiregulin release through a mechanism
involving protease-activated receptor-2-mediated ERK activation and TNF
alpha-converting enzyme activity in airway epithelial cells. FEBS J. 272, 6387–
6399 (2005).
76.
Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of
paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
77.
Grøvdal, L. M., Stang, E., Sorkin, A. & Madshus, I. H. Direct interaction of Cbl
with pTyr 1045 of the EGF receptor (EGFR) is required to sort the EGFR to
lysosomes for degradation. Experimental Cell Research 300, 388–395 (2004).
78.
Chung, E., Graves-Deal, R., Franklin, J. L. & Coffey, R. J. Differential effects of
amphiregulin and TGF-α on the morphology of MDCK cells. Experimental Cell
Research 309, 149–160 (2005).
79.
Citri, A. & Yarden, Y. EGF–ERBB signalling: towards the systems level. Nat
Rev Mol Cell Biol 7, 505–516 (2006).
80.
Pawson, T. Protein modules and signalling networks. Nature 373, 573–580
(1995).
81.
Nickerson, N. K. et al. in (Gunduz, M. & Gunduz, E.) 1–31 (InTech, 2014).
82.
Witsch, E., Sela, M. & Yarden, Y. Roles for growth factors in cancer
progression. Physiology (Bethesda) 25, 85–101 (2010).
83.
Yarden, Y. & Sliwkowski, M. X. Untangling the ErbB signalling network. Nat
Rev Mol Cell Biol 2, 127–137 (2001).
84.
Avraham, R. & Yarden, Y. Feedback regulation of EGFR signalling: decision
making by early and delayed loops. Nat Rev Mol Cell Biol 12, 104–117 (2011).
85.
Todaro, G. J., Fryling, C. & De Larco, J. E. Transforming growth factors
produced by certain human tumor cells: polypeptides that interact with epidermal
growth factor receptors. Proc. Natl. Acad. Sci. U.S.A. 77, 5258–5262 (1980).
86.
Yasui, W. et al. Expression of transforming growth factor alpha in human
tissues: immunohistochemical study and northern blot analysis. Virchows Arch A
Pathol Anat Histopathol 421, 513–519 (1992).
87.
Jones, J. T., Akita, R. W. & Sliwkowski, M. X. Binding specificities and
affinities of egf domains for ErbB receptors. FEBS Letters 447, 227–231 (1999).
88.
Winkler, M. E., O'Connor, L., Winget, M. & Fendly, B. Epidermal growth factor
and transforming growth factor alpha bind differently to the epidermal growth
factor receptor. Biochemistry 28, 6373–6378 (1989).
89.
Ebner, R. & Derynck, R. Epidermal growth factor and transforming growth
factor-alpha: differential intracellular routing and processing of ligand-receptor
complexes. Cell Regul. 2, 599–612 (1991).
90.
French, A. R., Tadaki, D. K., Niyogi, S. K. & Lauffenburger, D. A. Intracellular
trafficking of epidermal growth factor family ligands is directly influenced by the
pH sensitivity of the receptor/ligand interaction. The Journal of Biological
Chemistry 270, 4334–4340 (1995).
91.
Waterman, H. Alternative Intracellular Routing of ErbB Receptors May
Determine Signaling Potency. Journal of Biological Chemistry 273, 13819–
13827 (1998).
92.
Reddy, C. C., Niyogi, S. K., Wiley, H. S. & Lauffenburger, D. A. Engineering
epidermal growth factor for enhanced mitogenic potency. Nature Biotechnology
14, 1696–1700 (1996).
93.
Shoyab, M., Plowman, G. D., McDonald, V. L., Bradley, J. G. & Todaro, G. J.
Structure and function of human amphiregulin: a member of the epidermal
growth factor family. Science 243, 1074–1076 (1989).
94.
Shoyab, M., McDonald, V. L., Bradley, J. G. & Todaro, G. J. Amphiregulin: A
bifunctional growth-modulating glycoprotein produced by the phorbol 12-myristate 13-acetate-treated human breast adenocarcinoma cell line MCF-7.
PNAS 85, 6528–6532 (1988).
95.
Barnard, J. A. et al. Auto- and Cross-induction within the Mammalian
Epidermal Growth Factor-related Peptide Family. The Journal of Biological
Chemistry 269, 22817–22822 (1994).
96.
Cook, P. W. et al. A heparin sulfate-regulated human keratinocyte autocrine
factor is similar or identical to amphiregulin. 11, 2547–2557 (2003).
97.
Neelam, B. et al. Structure-Function Studies of Ligand-Induced Epidermal
Growth Factor Receptor Dimerization. Biochemistry 37, 4884–4891 (1998).
98.
Johnson, G. R. & Wong, L. Heparan Sulfate Is Essential to Amphiregulin-induced Mitogenic Signaling by the Epidermal Growth Factor Receptor. The
Journal of Biological Chemistry 269, 27149–27154 (1994).
99.
Willmarth, N. E. & Ethier, S. P. Amphiregulin as a Novel Target for Breast
Cancer Therapy. J Mammary Gland Biol Neoplasia 13, 171–179 (2008).
100.
Willmarth, N. E. et al. Cellular Signalling. Cellular Signalling 21, 212–219
(2009).
101.
Stern, K. A., Place, T. L. & Lill, N. L. EGF and amphiregulin differentially
regulate Cbl recruitment to endosomes and EGF receptor fate. Biochem. J. 410,
585–594 (2008).
102.
Wortzel, I. & Seger, R. The ERK Cascade: Distinct Functions within Various
Subcellular Organelles. Genes Cancer 2, 195–209 (2011).
103.
Roskoski, R. RAF protein-serine/threonine kinases: structure and regulation.
Biochem. Biophys. Res. Commun. 399, 313–317 (2010).
104.
Roskoski, R. MEK1/2 dual-specificity protein kinases: structure and regulation.
Biochem. Biophys. Res. Commun. 417, 5–10 (2012).
105.
Fujioka, A. et al. Dynamics of the Ras/ERK MAPK cascade as monitored by
fluorescent probes. The Journal of Biological Chemistry 281, 8917–8926 (2006).
106.
Yoon, S. & Seger, R. The extracellular signal-regulated kinase: multiple
substrates regulate diverse cellular functions. Growth Factors 24, 21–44 (2006).
107.
Kumar, A., Rajendran, V., Sethumadhavan, R. & Purohit, R. AKT kinase
pathway: a leading target in cancer research. ScientificWorldJournal 2013,
756134 (2013).
108.
Bardwell, L. & Shah, K. Analysis of mitogen-activated protein kinase activation
and interactions with regulators and substrates. Methods 40, 213–223 (2006).
109.
Lefloch, R., Pouysségur, J. & Lenormand, P. Total ERK1/2 activity regulates cell
proliferation. Cell Cycle (Georgetown, Tex.) 8, 705–711 (2009).
110.
Bos, J. L. ras oncogenes in human cancer: a review. Cancer Research 49, 4682–
4689 (1989).
111.
Davies, H. et al. Mutations of the BRAF gene in human cancer. Nature 417,
949–954 (2002).
112.
Vakiani, E. & Solit, D. B. KRAS and BRAF: drug targets and predictive
biomarkers. J. Pathol. 223, 220–230 (2010).
113.
Garnett, M. J. & Marais, R. Guilty as charged: B-RAF is a human oncogene.
Cancer Cell 6, 313–319 (2004).
114.
Lloyd, A. C. Distinct functions for ERKs? J. Biol. 5, 13 (2006).
115.
Lefloch, R., Pouysségur, J. & Lenormand, P. Single and combined silencing of
ERK1 and ERK2 reveals their positive contribution to growth signaling
depending on their expression levels. Mol. Cell. Biol. 28, 511–527 (2008).
116.
Adams, J. A. Activation Loop Phosphorylation and Catalysis in Protein
Kinases: Is There Functional Evidence for the Autoinhibitor Model?
Biochemistry 42, 601–607 (2003).
117.
Haystead, T. A., Dent, P., Wu, J., Haystead, C. M. & Sturgill, T. W. Ordered
phosphorylation of p42mapk by MAP kinase kinase. FEBS Letters 306, 17–22
(1992).
118.
Burack, W. R. & Sturgill, T. W. The activating dual phosphorylation of MAPK
by MEK is nonprocessive. Biochemistry 36, 5929–5933 (1997).
119.
Ferrell, J. E. & Bhatt, R. R. Mechanistic studies of the dual phosphorylation of
mitogen-activated protein kinase. The Journal of Biological Chemistry 272,
19008–19016 (1997).
120.
Prowse, C. N. & Lew, J. Mechanism of activation of ERK2 by dual
phosphorylation. The Journal of Biological Chemistry 276, 99–103 (2001).
121.
Coffer, P. J. & Woodgett, J. R. Molecular cloning and characterisation of a novel
putative protein-serine kinase related to the cAMP-dependent and protein kinase
C families. Eur. J. Biochem. 201, 475–481 (1991).
122.
Jones, P. F., Jakubowicz, T., Pitossi, F. J., Maurer, F. & Hemmings, B. A.
Molecular cloning and identification of a serine/threonine protein kinase of the
second-messenger subfamily. Proc. Natl. Acad. Sci. U.S.A. 88, 4171–4175
(1991).
123.
Cheng, J. Q. et al. AKT2, a putative oncogene encoding a member of a
subfamily of protein-serine/threonine kinases, is amplified in human ovarian
carcinomas. Proc. Natl. Acad. Sci. U.S.A. 89, 9267–9271 (1992).
124.
Konishi, H. et al. Molecular cloning and characterization of a new member of the
RAC protein kinase family: association of the pleckstrin homology domain of
three types of RAC protein kinase with protein kinase C subspecies and beta
gamma subunits of G proteins. Biochem. Biophys. Res. Commun. 216, 526–534
(1995).
125.
Bellacosa, A. et al. Structure, expression and chromosomal mapping of c-akt:
relationship to v-akt and its implications. Oncogene 8, 745–754 (1993).
126.
Altomare, D. A. et al. Cloning, chromosomal localization and expression
analysis of the mouse Akt2 oncogene. Oncogene 11, 1055–1060 (1995).
127.
Altomare, D. A., Lyons, G. E., Mitsuuchi, Y., Cheng, J. Q. & Testa, J. R. Akt2
mRNA is highly expressed in embryonic brown fat and the AKT2 kinase is
activated by insulin. Oncogene 16, 2407–2411 (1998).
128.
Franke, T. F., Tartof, K. D. & Tsichlis, P. N. The SH2-like Akt homology (AH)
domain of c-akt is present in multiple copies in the genome of vertebrate and
invertebrate eucaryotes. Cloning and characterization of the Drosophila
melanogaster c-akt homolog Dakt1. Oncogene 9, 141–148 (1994).
129.
Paradis, S. & Ruvkun, G. Caenorhabditis elegans Akt/PKB transduces insulin
receptor-like signals from AGE-1 PI3 kinase to the DAF-16 transcription factor.
Genes Dev. 12, 2488–2498 (1998).
130.
Datta, S. R., Brunet, A. & Greenberg, M. E. Cellular survival: a play in three
Akts. Genes Dev. 13, 2905–2927 (1999).
131.
Bellacosa, A., Testa, J. R., Staal, S. P. & Tsichlis, P. N. A retroviral oncogene,
akt, encoding a serine-threonine kinase containing an SH2-like region. Science
254, 274–277 (1991).
132.
Alessi, D. R. et al. Mechanism of activation of protein kinase B by insulin and
IGF-1. EMBO J. 15, 6541–6551 (1996).
133.
Nishida, K., Kaziro, Y. & Satoh, T. Anti-apoptotic function of Rac in
hematopoietic cells. Oncogene 18, 407–415 (1999).
134.
Rameh, L. E. & Cantley, L. C. The role of phosphoinositide 3-kinase lipid
products in cell function. The Journal of Biological Chemistry 274, 8347–8350
(1999).
135.
Frech, M. et al. High affinity binding of inositol phosphates and
phosphoinositides to the pleckstrin homology domain of RAC/protein kinase B
and their influence on kinase activity. The Journal of Biological Chemistry 272,
8474–8481 (1997).
136.
Franke, T. F., Kaplan, D. R., Cantley, L. C. & Toker, A. Direct regulation of the
Akt proto-oncogene product by phosphatidylinositol-3,4-bisphosphate. Science
275, 665–668 (1997).
137.
Franke, T. F. et al. The protein kinase encoded by the Akt proto-oncogene is a
target of the PDGF-activated phosphatidylinositol 3-kinase. Cell 81, 727–736
(1995).
138.
Alessi, D. R. et al. Characterization of a 3-phosphoinositide-dependent protein
kinase which phosphorylates and activates protein kinase Balpha. Curr. Biol. 7,
261–269 (1997).
139.
Andjelkovic, M. et al. Role of translocation in the activation and function of
protein kinase B. The Journal of Biological Chemistry 272, 31515–31524 (1997).
140.
Datta, K. et al. AH/PH domain-mediated interaction between Akt molecules and
its potential role in Akt regulation. Mol. Cell. Biol. 15, 2304–2310 (1995).
141.
Kohn, A. D., Takeuchi, F. & Roth, R. A. Akt, a pleckstrin homology domain
containing kinase, is activated primarily by phosphorylation. The Journal of
Biological Chemistry 271, 21920–21926 (1996).
142.
Staal, S. P. Molecular cloning of the akt oncogene and its human homologues
AKT1 and AKT2: amplification of AKT1 in a primary human gastric
adenocarcinoma. Proc. Natl. Acad. Sci. U.S.A. 84, 5034–5037 (1987).
143.
Staal, S. P., Hartley, J. W. & Rowe, W. P. Isolation of transforming murine
leukemia viruses from mice with a high incidence of spontaneous lymphoma.
Proc. Natl. Acad. Sci. U.S.A. 74, 3065–3067 (1977).
144.
Burgering, B. M. & Coffer, P. J. Protein kinase B (c-Akt) in
phosphatidylinositol-3-OH kinase signal transduction. Nature 376, 599–602
(1995).
145.
Dudek, H. et al. Regulation of neuronal survival by the serine-threonine protein
kinase Akt. Science 275, 661–665 (1997).
146.
Coffer, P. J., Jin, J. & Woodgett, J. R. Protein kinase B (c-Akt): a multifunctional
mediator of phosphatidylinositol 3-kinase activation. Biochem. J. 335 ( Pt 1), 1–
13 (1998).
147.
Pignataro, O. P. & Ascoli, M. Epidermal growth factor increases the labeling of
phosphatidylinositol 3,4-bisphosphate in MA-10 Leydig tumor cells. The Journal
of Biological Chemistry 265, 1718–1723 (1990).
148.
Cantley, L. C. The phosphoinositide 3-kinase pathway. Science 296, 1655–1657
(2002).
149.
Yuan, T. L. & Cantley, L. C. PI3K pathway alterations in cancer: variations on a
theme. Oncogene 27, 5497–5510 (2008).
150.
Engelman, J. A., Luo, J. & Cantley, L. C. The evolution of phosphatidylinositol
3-kinases as regulators of growth and metabolism. Nat. Rev. Genet. 7, 606–619
(2006).
151.
Crabbe, T., Welham, M. J. & Ward, S. G. The PI3K inhibitor arsenal: choose
your weapon! Trends Biochem. Sci. 32, 450–456 (2007).
152.
Wymann, M. P. et al. Wortmannin inactivates phosphoinositide 3-kinase by
covalent modification of Lys-802, a residue involved in the phosphate transfer
reaction. Mol. Cell. Biol. 16, 1722–1733 (1996).
153.
Walker, E. H. et al. Structural determinants of phosphoinositide 3-kinase
inhibition by wortmannin, LY294002, quercetin, myricetin, and staurosporine.
Molecular Cell 6, 909–919 (2000).
154.
Pelicci, G. et al. A novel transforming protein (SHC) with an SH2 domain is
implicated in mitogenic signal transduction. Cell 70, 93–104 (1992).
155.
Wills, M. K. B. & Jones, N. Teaching an old dogma new tricks: twenty years of
Shc adaptor signalling. Biochem. J. 447, 1–16 (2012).
156.
Ventura, A., Luzi, L., Pacini, S., Baldari, C. T. & Pelicci, P. G. The p66Shc
longevity gene is silenced through epigenetic modifications of an alternative
promoter. The Journal of Biological Chemistry 277, 22370–22376 (2002).
157.
Lai, K. M. & Pawson, T. The ShcA phosphotyrosine docking protein sensitizes
cardiovascular signaling in the mouse embryo. Genes Dev. 14, 1132–1145
(2000).
158.
Migliaccio, E. et al. Opposite effects of the p52shc/p46shc and p66shc splicing
isoforms on the EGF receptor-MAP kinase-fos signalling pathway. EMBO J. 16,
706–716 (1997).
159.
Ravichandran, K. S. Signaling via Shc family adapter proteins. Oncogene 20,
6322–6330 (2001).
160.
Bhattacharyya, R. P., Reményi, A., Yeh, B. J. & Lim, W. A. Domains, motifs,
and scaffolds: the role of modular interactions in the evolution and wiring of cell
signaling circuits. Annu. Rev. Biochem. 75, 655–680 (2006).
161.
Luzi, L., Confalonieri, S., Di Fiore, P. P. & Pelicci, P. G. Evolution of Shc
functions from nematode to human. Curr. Opin. Genet. Dev. 10, 668–674 (2000).
162.
van der Geer, P., Wiley, S., Gish, G. D. & Pawson, T. The Shc adaptor protein is
highly phosphorylated at conserved, twin tyrosine residues (Y239/240) that
mediate protein-protein interactions. Curr. Biol. 6, 1435–1444 (1996).
163.
Charest, A., Wagner, J., Jacob, S., McGlade, C. J. & Tremblay, M. L.
Phosphotyrosine-independent binding of SHC to the NPLH sequence of murine
protein-tyrosine phosphatase-PEST. Evidence for extended phosphotyrosine
binding/phosphotyrosine interaction domain recognition specificity. The Journal
of Biological Chemistry 271, 8424–8429 (1996).
164.
Habib, T., Herrera, R. & Decker, S. J. Activators of protein kinase C stimulate
association of Shc and the PEST tyrosine phosphatase. The Journal of Biological
Chemistry 269, 25243–25246 (1994).
165.
Gu, J. et al. Shc and FAK differentially regulate cell motility and directionality
modulated by PTEN. J. Cell Biol. 146, 389–403 (1999).
166.
Schneider, E. et al. Migration of renal tumor cells depends on dephosphorylation
of Shc by PTEN. Int. J. Oncol. 38, 823–831 (2011).
167.
Ugi, S., Imamura, T., Ricketts, W. & Olefsky, J. M. Protein phosphatase 2A
forms a molecular complex with Shc and regulates Shc tyrosine phosphorylation
and downstream mitogenic signaling. Mol. Cell. Biol. 22, 2375–2387 (2002).
168.
Kraut-Cohen, J., Muller, W. J. & Elson, A. Protein-tyrosine phosphatase epsilon
regulates Shc signaling in a kinase-specific manner: increasing coherence in
tyrosine phosphatase signaling. The Journal of Biological Chemistry 283, 4612–
4621 (2008).
169.
Zhou, M. M. et al. Structure and ligand recognition of the phosphotyrosine
binding domain of Shc. Nature 378, 584–592 (1995).
170.
Ravichandran, K. S. et al. Evidence for a requirement for both phospholipid and
phosphotyrosine binding via the Shc phosphotyrosine-binding domain in vivo.
Mol. Cell. Biol. 17, 5540–5549 (1997).
171.
Dengjel, J. et al. Quantitative proteomic assessment of very early cellular
signaling events. Nat. Biotechnol. 25, 566–568 (2007).
172.
Dankort, D. L., Wang, Z., Blackmore, V., Moran, M. F. & Muller, W. J. Distinct
tyrosine autophosphorylation sites negatively and positively modulate neu-mediated transformation. Mol. Cell. Biol. 17, 5410–5425 (1997).
173.
Dankort, D. et al. Grb2 and Shc Adapter Proteins Play Distinct Roles in Neu
(ErbB-2)-Induced Mammary Tumorigenesis: Implications for Human Breast
Cancer. Mol. Cell. Biol. 21, 1540–1551 (2001).
174.
Ursini-Siegel, J. et al. ShcA signalling is essential for tumour progression in
mouse models of human breast cancer. EMBO J. 27, 910–920 (2008).
175.
Oksvold, M. P., Skarpen, E., Lindeman, B., Roos, N. & Huitfeldt, H. S.
Immunocytochemical localization of Shc and activated EGF receptor in early
endosomes after EGF stimulation of HeLa cells. J. Histochem. Cytochem. 48,
21–33 (2000).
176.
Sakaguchi, K., Okabayashi, Y. & Kasuga, M. Shc mediates ligand-induced
internalization of epidermal growth factor receptors. Biochem. Biophys. Res.
Commun. 282, 1154–1160 (2001).
177.
Sato, K. et al. Adaptor protein Shc undergoes translocation and mediates upregulation of the tyrosine kinase c-Src in EGF-stimulated A431 cells. Genes
Cells 5, 749–764 (2000).
178.
Suenaga, A. et al. Tyr-317 phosphorylation increases Shc structural rigidity and
reduces coupling of domain motions remote from the phosphorylation site as
revealed by molecular dynamics simulations. The Journal of Biological
Chemistry 279, 4657–4662 (2004).
179.
Suenaga, A. et al. Molecular dynamics simulations reveal that Tyr-317
phosphorylation reduces Shc binding affinity for phosphotyrosyl residues of
epidermal growth factor receptor. Biophys. J. 96, 2278–2288 (2009).
180.
Okabayashi, Y. et al. Tyrosines 1148 and 1173 of activated human epidermal
growth factor receptors are binding sites of Shc in intact cells. The Journal of
Biological Chemistry 269, 18674–18678 (1994).
181.
Gotoh, N. et al. Epidermal growth factor-receptor mutant lacking the
autophosphorylation sites induces phosphorylation of Shc protein and Shc-Grb2/ASH association and retains mitogenic activity. Proc. Natl. Acad. Sci.
U.S.A. 91, 167–171 (1994).
182.
Maegawa, M. et al. Epidermal growth factor receptor lacking C-terminal
autophosphorylation sites retains signal transduction and high sensitivity to
epidermal growth factor receptor tyrosine kinase inhibitor. Cancer Sci. 100, 552–
557 (2009).
183.
Batzer, A. G., Rotin, D., Ureña, J. M., Skolnik, E. Y. & Schlessinger, J.
Hierarchy of binding sites for Grb2 and Shc on the epidermal growth factor
receptor. Mol. Cell. Biol. 14, 5192–5201 (1994).
184.
Sakaguchi, K. et al. Shc phosphotyrosine-binding domain dominantly interacts
with epidermal growth factor receptors and mediates Ras activation in intact
cells. Mol. Endocrinol. 12, 536–543 (1998).
185.
Batzer, A. G., Blaikie, P., Nelson, K., Schlessinger, J. & Margolis, B. The
phosphotyrosine interaction domain of Shc binds an LXNPXY motif on the
epidermal growth factor receptor. Mol. Cell. Biol. 15, 4403–4409 (1995).
186.
Soler, C., Beguinot, L. & Carpenter, G. Individual epidermal growth factor
receptor autophosphorylation sites do not stringently define association motifs
for several SH2-containing proteins. The Journal of Biological Chemistry 269,
12320–12324 (1994).
187.
Thomas, D. & Bradshaw, R. A. Differential utilization of ShcA tyrosine residues
and functional domains in the transduction of epidermal growth factor-induced
mitogen-activated protein kinase activation in 293T cells and nerve growth
factor-induced neurite outgrowth in PC12 cells. Identification of a new
Grb2.Sos1 binding site. The Journal of Biological Chemistry 272, 22293–22299
(1997).
188.
Gong, Y. & Zhao, X. Shc-dependent pathway is redundant but dominant in
MAPK cascade activation by EGF receptors: a modeling inference. FEBS Letters
554, 467–472 (2003).
189.
Holgado-Madruga, M., Emlet, D. R., Moscatello, D. K., Godwin, A. K. & Wong,
A. J. A Grb2-associated docking protein in EGF- and insulin-receptor signaling.
Nature 378, 560–565 (1996).
190.
Gu, H. & Neel, B. G. The ‘Gab’ in signal transduction. Trends Cell Biol. 13,
122–130 (2003).
191.
Schaeper, U. et al. Coupling of Gab1 to c-Met, Grb2, and Shp2 Mediates
Biological Responses. J. Cell Biol. 149, 1419–1432 (2000).
192.
Lock, L. S., Royal, I., Naujokas, M. A. & Park, M. Identification of an atypical
Grb2 carboxyl-terminal SH3 domain binding site in Gab docking proteins reveals
Grb2-dependent and -independent recruitment of Gab1 to receptor tyrosine
kinases. The Journal of Biological Chemistry 275, 31536–31545 (2000).
193.
Nguyen, L. et al. Association of the Multisubstrate Docking Protein Gab1 with
the Hepatocyte Growth Factor Receptor Requires a Functional Grb2 Binding Site
Involving Tyrosine 1356. Journal of Biological Chemistry 272, 20811–20819
(1997).
194.
Bardelli, A., Longati, P., Gramaglia, D., Stella, M. C. & Comoglio, P. M. Gab1
coupling to the HGF/Met receptor multifunctional docking site requires binding
of Grb2 and correlates with the transforming potential. Oncogene 15, 3103–3111
(1997).
195.
Hibi, M. & Hirano, T. Gab-family adapter molecules in signal transduction of
cytokine and growth factor receptors, and T and B cell antigen receptors. Leuk.
Lymphoma 37, 299–307 (2000).
196.
Liu, Y. & Rohrschneider, L. R. The gift of Gab. FEBS Letters 515, 1–7 (2002).
197.
Schutzman, J. L. et al. The Caenorhabditis elegans EGL-15 Signaling Pathway
Implicates a DOS-Like Multisubstrate Adaptor Protein in Fibroblast Growth
Factor Signal Transduction. Mol. Cell. Biol. 21, 8104–8116 (2001).
198.
Abbeyquaye, T., Riesgo-Escovar, J., Raabe, T. & Thackeray, J. R. Evolution of
Gab family adaptor proteins. Gene 311, 43–50 (2003).
199.
Nishida, K. et al. Gab-Family Adaptor Proteins Act Downstream of Cytokine
and Growth Factor Receptors and T- and B-Cell Antigen Receptors. Blood 93,
1809–1816 (1999).
200.
Garcia-Guzman, M., Dolfi, F., Zeh, K. & Vuori, K. Met-induced JNK activation
is mediated by the adapter protein Crk and correlates with the Gab1–Crk
signaling complex formation. Oncogene 7775–7786 (1999).
201.
Crouin, C., Arnaud, M., Gesbert, F., Camonis, J. & Bertoglio, J. A yeast two-hybrid study of human p97/Gab2 interactions with its SH2 domain-containing
binding partners. FEBS Letters 148–153 (2001).
202.
Sakkab, D. et al. Signaling of Hepatocyte Growth Factor/Scatter Factor (HGF) to
the Small GTPase Rap1 via the Large Docking Protein Gab1 and the Adapter
Protein CRKL. Journal of Biological Chemistry 275, 10772–10778 (2000).
203.
Lecoq-Lafon, C. et al. Erythropoietin Induces the Tyrosine Phosphorylation of
GAB1 and Its Association With SHC, SHP2, SHIP, and Phosphatidylinositol 3-Kinase. Blood 2578–2585 (1999).
204.
Lee, A. W. M. & States, D. J. Both Src-Dependent and -Independent
Mechanisms Mediate Phosphatidylinositol 3-Kinase Regulation of Colony-Stimulating Factor 1-Activated Mitogen-Activated Protein Kinases in Myeloid
Progenitors. Mol. Cell. Biol. 20, 6779–6798 (2000).
205.
Yamasaki, S. Docking Protein Gab2 Is Phosphorylated by ZAP-70 and
Negatively Regulates T Cell Receptor Signaling by Recruitment of Inhibitory
Molecules. Journal of Biological Chemistry 276, 45175–45183 (2001).
206.
Guo, L. et al. Studies of ligand-induced site-specific phosphorylation of
epidermal growth factor receptor. J. Am. Soc. Mass Spectrom. 14, 1022–1031
(2003).
207.
Nishida, K. & Hirano, T. The role of Gab family scaffolding adapter proteins in
the signal transduction of cytokine and growth factor receptors. Cancer Sci. 94,
1029–1033 (2003).
208.
Lowenstein, E. J. et al. The SH2 and SH3 Domain-Containing Protein GRB2
Links Receptor Tyrosine Kinases to ras Signaling. Cell 70, 431–442 (1992).
209.
Buday, L. & Downward, J. Epidermal Growth Factor Regulates p21ras through
the Formation of a Complex of Receptor, Grb2 Adapter Protein. Cell 73, 611–620
(1993).
210.
Gale, N. W., Kaplan, S., Lowenstein, E. J., Schlessinger, J. & Bar-Sagi, D. Grb2
mediates the EGF-dependent activation of guanine nucleotide exchange on Ras.
Nature 363, 88–92 (1993).
211.
Simon, M. A., Bowtell, D. D., Dodson, G. S., Laverty, T. R. & Rubin, G. M.
Ras1 and a Putative Guanine Nucleotide Exchange Factor Perform Crucial Steps
in Signaling by the Sevenless Protein Tyrosine Kinase. Cell 67, 701–716 (1991).
212.
Li, N. et al. Guanine-nucleotide-releasing factor hSos1 binds to Grb2 and links
receptor tyrosine kinases to Ras signalling. Nature 363, 85–87 (1993).
213.
Agazie, Y. M., Movilla, N., Ischenko, I. & Hayman, M. J. The phosphotyrosine
phosphatase SHP2 is a critical mediator of transformation induced by the
oncogenic fibroblast growth factor receptor 3. Oncogene 22, 6909–6918 (2003).
214.
Skolnik, E. Y. et al. The SH2/SH3 domain-containing protein. EMBO J. 12,
1929–1936 (1993).
215.
Lin, C.-C. et al. Inhibition of Basal FGF Receptor Signaling by Dimeric Grb2.
Cell 149, 1514–1524 (2012).
216.
Belov, A. A. & Mohammadi, M. Grb2, a Double-Edged Sword of Receptor
Tyrosine Kinase Signaling. Sci Signal 5, pe49–pe49 (2012).
217.
Sun, X. J. et al. Structure of the insulin receptor substrate IRS-1 defines a unique
signal transduction protein. Nature 352, 73–77 (1991).
218.
Sun, X. J. et al. Role of IRS-2 in insulin and cytokine signalling. Nature 377,
173–177 (1995).
219.
Gibson, S. L., Ma, Z. & Shaw, L. M. Divergent Roles for IRS-1 and IRS-2 in
Breast Cancer Metastasis. Cell Cycle (Georgetown, Tex.) 6, 631–637 (2007).
220.
White, M. F. The IRS-signalling system: a network of docking proteins that
mediate insulin action. Mol. Cell. Biochem. 182, 3–11 (1998).
221.
White, M. F. The IRS-signalling system in insulin and cytokine action. Philos.
Trans. R. Soc. Lond., B, Biol. Sci. 351, 181–189 (1996).
222.
Gustafson, T. A., He, W., Craparo, A., Schaub, C. D. & O'Neill, T. J.
Phosphotyrosine-dependent interaction of SHC and insulin receptor substrate 1
with the NPEY motif of the insulin receptor via a novel non-SH2 domain. Mol.
Cell. Biol. 15, 2500–2508 (1995).
223.
Shaw, L. M. Identification of insulin receptor substrate 1 (IRS-1) and IRS-2 as
signaling intermediates in the alpha6beta4 integrin-dependent activation of
phosphoinositide 3-OH kinase and promotion of invasion. Mol. Cell. Biol. 21,
5082–5093 (2001).
224.
Myers, M. G. et al. IRS-1 activates phosphatidylinositol 3'-kinase by associating
with src homology 2 domains of p85. Proc. Natl. Acad. Sci. U.S.A. 89, 10350–
10354 (1992).
225.
Myers, M. G. et al. Role of IRS-1-GRB-2 complexes in insulin signaling. Mol.
Cell. Biol. 14, 3577–3587 (1994).
226.
Zhang, X. et al. Multiple signaling pathways are activated during insulin-like
growth factor-I (IGF-I) stimulated breast cancer cell migration. Breast Cancer
Res. Treat. 93, 159–168 (2005).
227.
Withers, D. J. et al. Disruption of IRS-2 causes type 2 diabetes in mice. Nature
391, 900–904 (1998).
228.
Schubert, M. et al. Insulin receptor substrate-2 deficiency impairs brain growth
and promotes tau phosphorylation. J. Neurosci. 23, 7084–7092 (2003).
229.
Withers, D. J. et al. Irs-2 coordinates Igf-1 receptor-mediated beta-cell
development and peripheral insulin signalling. Nat. Genet. 23, 32–40 (1999).
230.
Burks, D. J. et al. IRS-2 pathways integrate female reproduction and energy
homeostasis. Nature 407, 377–382 (2000).
231.
Nagle, J. A., Ma, Z., Byrne, M. A., White, M. F. & Shaw, L. M. Involvement of
insulin receptor substrate 2 in mammary tumor metastasis. Mol. Cell. Biol. 24,
9726–9735 (2004).
232.
Ma, Z. et al. Suppression of insulin receptor substrate 1 (IRS-1) promotes
mammary tumor metastasis. Mol. Cell. Biol. 26, 9338–9351 (2006).
233.
Freeman, R. M., Plutzky, J. & Neel, B. G. Identification of a human src
homology 2-containing protein-tyrosine-phosphatase: a putative homolog of
Drosophila corkscrew. Proc. Natl. Acad. Sci. U.S.A. 89, 11239–11243 (1992).
234.
Neel, B. G., Gu, H. & Pao, L. The ‘Shp’ing news: SH2 domain-containing
tyrosine phosphatases in cell signaling. Trends Biochem. Sci. 28, 284–293
(2003).
235.
Craggs, G. & Kellie, S. A functional nuclear localization sequence in the C-terminal domain of SHP-1. The Journal of Biological Chemistry 276, 23719–
23725 (2001).
236.
Yang, W., Tabrizi, M. & Yi, T. A bipartite NLS at the SHP-1 C-terminus
mediates cytokine-induced SHP-1 nuclear localization in cell growth control.
Blood Cells Mol. Dis. 28, 63–74 (2002).
237.
Qu, C. K. The SHP-2 tyrosine phosphatase: signaling mechanisms and biological
functions. Cell Res. 10, 279–288 (2000).
238.
Li, W. et al. A new function for a phosphotyrosine phosphatase: linking GRB2-Sos to a receptor tyrosine kinase. Mol. Cell. Biol. 14, 509–517 (1994).
239.
Xiao, S. et al. Syp (SH-PTP2) is a positive mediator of growth factor-stimulated
mitogenic signal transduction. The Journal of Biological Chemistry 269, 21244–
21248 (1994).
240.
Van Vactor, D., O'Reilly, A. M. & Neel, B. G. Genetic analysis of protein
tyrosine phosphatases. Curr. Opin. Genet. Dev. 8, 112–126 (1998).
241.
Feng, G. S. Shp-2 tyrosine phosphatase: signaling one cell or many.
Experimental Cell Research 253, 47–54 (1999).
242.
Barford, D. & Neel, B. G. Revealing mechanisms for SH2 domain mediated
regulation of the protein tyrosine phosphatase SHP-2. Structure 6, 249–254
(1998).
243.
Hof, P., Pluskey, S., Dhe-Paganon, S., Eck, M. J. & Shoelson, S. E. Crystal
structure of the tyrosine phosphatase SHP-2. Cell 92, 441–450 (1998).
244.
Langdon, W. Y., Hartley, J. W., Klinken, S. P., Ruscetti, S. K. & Morse, H. C. v-cbl, an oncogene from a dual-recombinant murine retrovirus that induces early
B-lineage lymphomas. Proc. Natl. Acad. Sci. U.S.A. 86, 1168–1172 (1989).
245.
Joazeiro, C. A. et al. The tyrosine kinase negative regulator c-Cbl as a RING-type, E2-dependent ubiquitin-protein ligase. Science 286, 309–312 (1999).
246.
Waterman, H., Levkowitz, G., Alroy, I. & Yarden, Y. The RING finger of c-Cbl
mediates desensitization of the epidermal growth factor receptor. The Journal of
Biological Chemistry 274, 22151–22154 (1999).
247.
Swaminathan, G. & Tsygankov, A. Y. The Cbl family proteins: ring leaders in
regulation of cell signaling. J. Cell. Physiol. 209, 21–43 (2006).
248.
Feshchenko, E. A. et al. TULA: an SH3- and UBA-containing protein that binds
to c-Cbl and ubiquitin. Oncogene 23, 4690–4706 (2004).
249.
Kowanetz, K. et al. Suppressors of T-cell receptor signaling Sts-1 and Sts-2 bind
to Cbl and inhibit endocytosis of receptor tyrosine kinases. The Journal of
Biological Chemistry 279, 32786–32795 (2004).
250.
Lupher, M. L. et al. Cbl-mediated negative regulation of the Syk tyrosine kinase.
A critical role for Cbl phosphotyrosine-binding domain binding to Syk
phosphotyrosine 323. The Journal of Biological Chemistry 273, 35273–35281
(1998).
251.
Meng, W., Sawasdikosol, S., Burakoff, S. J. & Eck, M. J. Structure of the amino-terminal domain of Cbl complexed to its binding site on ZAP-70 kinase. Nature
398, 84–90 (1999).
252.
Peschard, P., Ishiyama, N., Lin, T., Lipkowitz, S. & Park, M. A conserved DpYR
motif in the juxtamembrane domain of the Met receptor family forms an atypical
c-Cbl/Cbl-b tyrosine kinase binding domain binding site required for suppression
of oncogenic activation. The Journal of Biological Chemistry 279, 29565–29571
(2004).
253.
Levkowitz, G. et al. Ubiquitin ligase activity and tyrosine phosphorylation
underlie suppression of growth factor signaling by c-Cbl/Sli-1. Molecular Cell 4,
1029–1040 (1999).
254.
Thien, C. B., Walker, F. & Langdon, W. Y. RING finger mutations that abolish
c-Cbl-directed polyubiquitination and downregulation of the EGF receptor are
insufficient for cell transformation. Molecular Cell 7, 355–365 (2001).
255.
Keane, M. M., Rivero-Lezcano, O. M., Mitchell, J. A., Robbins, K. C. &
Lipkowitz, S. Cloning and characterization of cbl-b: a SH3 binding protein with
homology to the c-cbl proto-oncogene. Oncogene 10, 2367–2377 (1995).
256.
Andoniou, C. E., Thien, C. B. & Langdon, W. Y. The two major sites of cbl
tyrosine phosphorylation in abl-transformed cells select the crkL SH2 domain.
Oncogene 12, 1981–1989 (1996).
257.
Hunter, S., Burton, E. A., Wu, S. C. & Anderson, S. M. Fyn associates with Cbl
and phosphorylates tyrosine 731 in Cbl, a binding site for phosphatidylinositol 3-kinase. The Journal of Biological Chemistry 274, 2097–2106 (1999).
258.
Grossmann, A. H. et al. Catalytic domains of tyrosine kinases determine the
phosphorylation sites within c-Cbl. FEBS Letters 577, 555–562 (2004).
259.
Kassenbrock, C. K. & Anderson, S. M. Regulation of ubiquitin protein ligase
activity in c-Cbl by phosphorylation-induced conformational change and
constitutive activation by tyrosine to glutamate point mutations. The Journal of
Biological Chemistry 279, 28017–28027 (2004).
260.
Levkowitz, G. et al. c-Cbl/Sli-1 regulates endocytic sorting and ubiquitination of
the epidermal growth factor receptor. Genes Dev. 12, 3663–3674 (1998).
261.
Haglund, K. et al. Multiple monoubiquitination of RTKs is sufficient for their
endocytosis and degradation. Nat. Cell Biol. 5, 461–466 (2003).
262.
de Melker, A. A., van der Horst, G., Calafat, J., Jansen, H. & Borst, J. c-Cbl
ubiquitinates the EGF receptor at the plasma membrane and remains receptor
associated throughout the endocytic route. Journal of Cell Science 114, 2167–
2178 (2001).
263.
de Melker, A. A., van der Horst, G. & Borst, J. Ubiquitin ligase activity of c-Cbl
guides the epidermal growth factor receptor into clathrin-coated pits by two
distinct modes of Eps15 recruitment. The Journal of Biological Chemistry 279,
55465–55473 (2004).
264.
de Melker, A. A., van der Horst, G. & Borst, J. c-Cbl directs EGF receptors into
an endocytic pathway that involves the ubiquitin-interacting motif of Eps15.
Journal of Cell Science 117, 5001–5012 (2004).
265.
Katzmann, D. J., Odorizzi, G. & Emr, S. D. Receptor downregulation and
multivesicular-body sorting. Nat Rev Mol Cell Biol 3, 893–905 (2002).
266.
Longva, K. E. et al. Ubiquitination and proteasomal activity is required for
transport of the EGF receptor to inner membranes of multivesicular bodies. J.
Cell Biol. 156, 843–854 (2002).
267.
Waterman, H. et al. A mutant EGF-receptor defective in ubiquitylation and
endocytosis unveils a role for Grb2 in negative signaling. EMBO J. 21, 303–313
(2002).
268.
Jiang, X. & Sorkin, A. Epidermal growth factor receptor internalization through
clathrin-coated pits requires Cbl RING finger and proline-rich domains but not
receptor polyubiquitylation. Traffic 4, 529–543 (2003).
269.
Jiang, X., Huang, F., Marusyk, A. & Sorkin, A. Grb2 regulates internalization of
EGF receptors through clathrin-coated pits. Mol. Biol. Cell 14, 858–870 (2003).
270.
Grishin, A. et al. Involvement of Shc and Cbl-PI 3-kinase in Lyn-dependent
proliferative signaling pathways for G-CSF. Oncogene 19, 97–105 (2000).
271.
Nita-Lazar, A., Saito-Benz, H. & White, F. M. Quantitative phosphoproteomics
by mass spectrometry: past, present, and future. Proteomics 8, 4433–4443
(2008).
272.
Hunt, D. F., Yates, J. R., Shabanowitz, J., Winston, S. & Hauer, C. R. Protein
sequencing by tandem mass spectrometry. Proc. Natl. Acad. Sci. U.S.A. 83,
6233–6237 (1986).
273.
Bell, A. W. et al. A HUPO test sample study reveals common problems in mass
spectrometry–based proteomics. Nat Meth 6, 423–430 (2009).
274.
Siuzdak, G. Mass Spectrometry for Biotechnology. (Academic Press, 1996).
275.
Makarov, A. Electrostatic Axially Harmonic Orbital Trapping: A High-Performance Technique of Mass Analysis. Anal. Chem. 72, 1156–1162 (2000).
276.
Hu, Q. et al. The Orbitrap: a new mass spectrometer. J. Mass Spectrom. 40, 430–
443 (2005).
277.
Zubarev, R. A. & Makarov, A. Orbitrap Mass Spectrometry. Anal. Chem. 85,
5288–5296 (2013).
278.
Liebler, D. C. & Zimmerman, L. J. Targeted quantification of proteins by mass
spectrometry. Biochemistry 52, 3797–3806 (2013).
279.
de Hoffman, E. & Stroobant, V. Mass Spectrometry. (John Wiley & Sons, Inc.).
280.
Mann, M., Kulak, N. A., Nagaraj, N. & Cox, J. The Coming Age of Complete,
Accurate, and Ubiquitous Proteomes. Molecular Cell 49, 583–590 (2013).
281.
Robles, M. S., Cox, J. & Mann, M. In-Vivo Quantitative Proteomics Reveals a
Key Contribution of Post-Transcriptional Mechanisms to the Circadian
Regulation of Liver Metabolism. PLoS Genet 10, e1004047 (2014).
282.
White, F. M. The potential cost of high-throughput proteomics. Sci Signal 4, pe8
(2011).
283.
Wolf-Yadlin, A., Hautaniemi, S., Lauffenburger, D. A. & White, F. M. Multiple
reaction monitoring for robust quantitative proteomic analysis of cellular
signaling networks. Proc. Natl. Acad. Sci. U.S.A. 104, 5860–5865 (2007).
284.
Klammer, A. A., Reynolds, S. M., Bilmes, J. A., MacCoss, M. J. & Noble, W. S.
Modeling peptide fragmentation with dynamic Bayesian networks for peptide
identification. Bioinformatics 24, i348–56 (2008).
285.
Elias, J. E., Haas, W., Faherty, B. K. & Gygi, S. P. Comparative evaluation of
mass spectrometry platforms used in large-scale proteomics investigations. Nat
Meth 2, 667–675 (2005).
286.
Käll, L., Storey, J. D., MacCoss, M. J. & Noble, W. S. Assigning significance to
peptides identified by tandem mass spectrometry using decoy databases. J.
Proteome Res. 7, 29–34 (2008).
287.
Inge de Dobbeleer, S. E. J. G. H. J. H. A. M. D. C. T. F. S. D. G. The
Determination of Anabolic Steroids in Human Urine using the TSQ Quantum
XLS. 1–4 (2012).
288.
Lange, V., Picotti, P., Domon, B. & Aebersold, R. Selected reaction monitoring
for quantitative proteomics: a tutorial. Mol Syst Biol 4, (2008).
289.
Ross, P. L. Multiplexed Protein Quantification in Saccharomyces cerevisiae
Using Amine-reactive Isobaric Tagging Reagents. Molecular & Cellular
Proteomics 3, 1154–1169 (2004).
290.
Thompson, A. et al. Tandem Mass Tags: A Novel Quantification Strategy for
Comparative Analysis of Complex Protein Mixtures by MS/MS. Anal. Chem. 75,
1895–1904 (2003).
291.
Dayon, L. & Sanchez, J.-C. in Methods in Molecular Biology 893, 115–127
(Humana Press, 2012).
292.
Ow, S. Y., Salim, M., Noirel, J., Evans, C. & Wright, P. C. Minimising iTRAQ
ratio compression through understanding LC-MS elution dependence and high-resolution HILIC fractionation. Proteomics 11, 2341–2346 (2011).
293.
Mann, M. Functional and quantitative proteomics using SILAC. Nat Rev Mol
Cell Biol 7, 952–958 (2006).
294.
Wang, W. et al. Quantification of proteins and metabolites by mass spectrometry
without isotopic labeling or spiked standards. Anal. Chem. 75, 4818–4826
(2003).
2 Computer Assisted Manual Validation (CAMV)
2.1 Introduction
Recent advances in mass spectrometry technologies have ushered in a new era of high-content, high-resolution proteomic datasets. Acquisition of hundreds of thousands of
tandem mass spectra (MS/MS spectra) in a single analysis is now routine, and by
coupling high-speed data acquisition to sample pre-fractionation, millions of MS/MS
spectra can be generated from the analysis of a single biological sample. Despite the
technological advances that have enabled acquisition of these massive datasets, tools to
accurately identify the peptides and post-translational modification (PTM) sites defined
by these tandem mass spectra have evolved less rapidly.
In a typical workflow, MS/MS spectra are searched against protein databases with database
search algorithms such as Sequest1, MASCOT2, XTandem3, or Andromeda4 to
generate putative peptide and/or PTM identifications for each MS/MS spectrum. Each of
these algorithms relies on a scoring system which weights a variety of parameters,
including the mass accuracy of the precursor ion m/z and the percentage and/or sequence
of fragment ions matching to the theoretical mass and fragmentation pattern of the
putative peptide identification. Due to a variety of factors, including the intensity,
peptide sequence (including PTMs), complexity of the sample, and fragmentation
method, the MS/MS spectra vary greatly in terms of their quality, as defined by their
signal-to-noise and complexity. This variation in quality leads to a wide difference in the
searching algorithm scores for putative peptide matches. Currently, there are no set
‘thresholds’ or ‘rules’ for determining whether a particular peptide identification is
correct, and each database search algorithm weights aspects of the identification
differently. With potentially millions of MS/MS spectra per sample, the challenge of
sorting through the putative identifications to determine the accuracy of each assignment
is monumental, yet is of utmost importance for correct determination of the components
within the biological sample.
Figure 2-1: Sample complexity prevents identification of peptides from complex samples based solely on accurate precursor mass measurements. Precursor masses from all tryptic fragments from the Human 2009 database with charge states +2 to +4 that fall between m/z 350 and 1500 were considered. Precursor masses were binned into windows generated at four different resolutions. As the number of peptides per bin increases the percentage of total peptides accounted for increases. At 10 ppm a vanishingly small percentage of peptides are the sole occupant of their bin, making it nearly impossible to accurately identify peptides based on their precursor mass alone. At 1 ppm this figure is improved to roughly 5%. This problem is exacerbated when post-translational modifications and missed or non-tryptic cleavages are included.
The difficulty of accurately identifying a peptide defined by a given tandem mass
spectrum can be exemplified when one considers how much weight should be given to
mass accuracy of the measured precursor ion m/z compared to the theoretical m/z of the
putative peptide. As the accuracy of the measured mass improves, the number of
potential peptides from a given database matching to that mass decreases significantly.
However, mass accuracy alone is typically not sufficient for identification. Figure 2-1
illustrates the issue of relying solely on peptide precursor mass to confirm identification.
All proteins in the Human 2009 proteome database were digested in silico with trypsin, and
the resulting peptides were binned by precursor m/z at different mass accuracies. At 10 ppm, very few peptides are the
sole occupant of their m/z bin. At 1 ppm, roughly 5% of peptides are uniquely
identifiable from their precursor m/z. The problem becomes more daunting when the
database increases in complexity to reflect the complexity found in biological samples:
the database should contain missed cleavages, non-tryptic cleavages, and at least the most
common dozen of the several hundred potential post-translational or chemical
modifications, as all of these are realistic possibilities for any given peptide. Searches
performed against a database of this size and complexity would require massive
computational resources. Searching algorithms fight an uphill battle against
combinatorial explosion as they seek to balance runtime considerations against erroneous
exclusion of relevant peptides. Unfortunately, searching against an incomplete database
can lead to false positive identifications, simply because the true assignments are not
contained within the search space, and therefore the next best match will automatically be
reported. Distinguishing between high scoring false positives associated with the ‘next
best match’ and true positives is critical, especially given the ultimate goal of utilizing
these peptide and PTM assignments to inform biological experimental design5.
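The binning analysis described above can be illustrated with a minimal sketch. It is not the code used to produce Figure 2-1: the function names, the log-spaced binning scheme, and the absence of missed cleavages and modifications are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(protein):
    """Cleave C-terminal to K or R unless followed by P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def precursor_mz(peptide, charge):
    """m/z of the protonated, unmodified peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def sole_occupant_fraction(peptides, ppm, charges=(2, 3, 4), mz_range=(350.0, 1500.0)):
    """Fraction of precursor ions that are alone in their ppm-wide m/z bin."""
    width = math.log1p(ppm * 1e-6)  # log-spaced bins keep the relative width constant
    bins = Counter()
    for peptide in set(peptides):
        for z in charges:
            mz = precursor_mz(peptide, z)
            if mz_range[0] <= mz <= mz_range[1]:
                bins[math.floor(math.log(mz) / width)] += 1
    total = sum(bins.values())
    return sum(1 for n in bins.values() if n == 1) / total if total else 0.0
```

Applied to every protein in a sequence database, `sole_occupant_fraction` gives the kind of per-tolerance statistic discussed above. Peptides with identical composition (for example AGK and GAK) always share a bin regardless of tolerance, which is one reason precursor mass alone cannot resolve them.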
2.1.1 Statistical Approaches to Assessing Quality
Currently, the most common approaches to assessing the validity of a given set of peptide
assignments are based on statistical analyses of the likelihood of incorrect assignments,
calculated as either a false-positive or false-discovery rate (FDR). In this approach, the
MS/MS spectra are searched against a forward (target) database and also against a decoy
(reversed or scrambled) protein database6. The score threshold that separates accepted from
rejected peptide assignments is then adjusted to achieve a pre-determined false discovery
rate, defined as the ratio of the number of matches in the decoy database
to the number of matches in the target database. While this approach can be used to
rapidly assess the quality of the overall set of peptide assignments, there are several
factors that need to be considered to accurately calculate the FDR. For instance,
construction of an appropriate decoy database is crucial, as the distribution of peptide
lengths, amino acid composition, and motif prevalence must be tuned to match that of the
species database and the specific enrichment experiment performed. In addition, replicate
identifications can artificially deflate the FDR if they are considered as independent tests.
Several dozen MS/MS spectra might be generated for an abundant species eluting from
the chromatography column (these replicate spectra are the basis for the label-free
spectral counting quantitative approach); it is expected that each of these spectra will
match to the same peptide sequence in the forward database. Since these MS/MS spectra
are effectively replicates of the same MS/MS spectrum, if one of the spectra does not
match to the decoy database, then it is likely that none of the spectra will match to the
decoy database. Instead of considering these as independent tests with several dozen target hits
and no decoy hits, corresponding to a low FDR, this set of data should be considered as
one hit with no decoy hit. Considering these factors should significantly improve the
accuracy of the FDR-based statistical estimate of global data quality.
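The effect of replicate identifications on the decoy-based estimate can be sketched as follows. This is an illustrative calculation only: the tuple layout and the function name `decoy_fdr` are assumptions, and replicates are collapsed simply by peptide sequence.

```python
def decoy_fdr(psms, threshold, collapse_replicates=True):
    """Estimate FDR as (# decoy matches) / (# target matches) at a score threshold.

    psms: iterable of (peptide_sequence, score, is_decoy) tuples.
    collapse_replicates: count each distinct peptide once instead of once per spectrum.
    """
    passing = [(seq, is_decoy) for seq, score, is_decoy in psms if score >= threshold]
    if collapse_replicates:
        passing = set(passing)  # replicate spectra of one peptide become a single test
    n_decoy = sum(1 for _, is_decoy in passing if is_decoy)
    n_target = len(passing) - n_decoy
    return n_decoy / n_target if n_target else 0.0


# Thirty replicate spectra of one abundant peptide, one other target hit, one decoy hit.
psms = [("PEPTIDEK", 40.0, False)] * 30 + [("AAAK", 30.0, False), ("KAAA", 28.0, True)]
naive = decoy_fdr(psms, 25.0, collapse_replicates=False)  # 1 decoy / 31 targets
collapsed = decoy_fdr(psms, 25.0)                         # 1 decoy / 2 targets
```

In this toy example the naive estimate is about 3%, whereas counting the replicates as one test gives 50%, which shows how strongly replicate spectra can deflate the apparent FDR.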
It is worth noting that the FDR-based statistical approach only provides a global quality
metric and fails to identify which assignments are true vs. false-positives, leaving doubt
about the validity of any given peptide assignment in the dataset. In fact, it is only on
further manual inspection that the quality of each peptide assignment becomes evident,
even for MS/MS spectra with similar scores for given assignments. For instance, two
peptide identifications with similar MASCOT scores are presented in Figure 2-2. The
confidence in the accuracy of the assignment is much higher for the spectrum in Figure 2-2a, as almost all fragment ions match to the expected theoretical fragment ions from the
assigned peptide. By comparison, in Figure 2-2b there are multiple intense ions that do
not correspond to any of the typical theoretical fragment ions for the given peptide
assignment, and therefore this MS/MS spectrum is likely to represent either an incorrect
assignment or potentially a ‘contaminated’ spectrum resulting from the simultaneous
isolation and fragmentation of multiple ions. In most cases, the database score reflects
the number of matched fragment ions, but does not consider the number of unmatched
abundant ions, as can be seen by the varied percentage of unmatched ions for similar
database scores in Figure 2-3. Based on this analysis, it appears that any global score
threshold will automatically include low-confidence identifications, defined as spectra
with a fair number of unassigned abundant fragment ions.
Figure 2-2: MASCOT score only provides a rough estimate of the quality of the spectral assignment. Two scans with comparable MASCOT scores were selected: (a) MASCOT score 27.12 and (b) MASCOT score 26.7. Note the significant disparity in the number of unassigned peaks in the two spectra.
Figure 2-3: Variation in percentage of unassigned peaks for a given MASCOT score can also be visualized at the dataset level. For this analysis, all peptide matches with MASCOT score above 25 from a representative phosphotyrosine enriched analysis were included. In a given spectrum, all of the peaks above ten percent of maximum intensity or above 2.5 percent in a sparsely populated region within a +/- 25 m/z window were included for consideration.
An alternative to the FDR approach is to use machine learning-based techniques to
automate the validation process. A decision tree validation scheme has been shown to
reduce the FDR, yet still relies on searches against a general decoy database7. More
recently, a hybrid Support Vector Machine (SVM)/Dynamic Bayesian Network (DBN)
approach was used to classify MS/MS data, and was shown to increase positive
identifications in 1% FDR search results8. To circumvent the need for large amounts of
training data for classification methods, another approach is to create a rule-based
framework where prominent fragments are predicted based on expert criteria9, although
codifying experiential human knowledge still limits results to those peptides that match
prescribed criteria, therefore hindering generalization to peptides with different PTMs.
The “expect” score in MASCOT provides another, peptide sequence specific, alternative
to the FDR. This score reflects the probability that a peptide assignment with a given
MASCOT score would occur by chance, taking into account the length of the peptide
along with the sequences of other peptides in the database to judge the likelihood of
proper assignment.
Figure 2-4: Examples of spectra that were ultimately accepted into the data set. a) Exemplar green identification where the user and the computer picked the same assignment. b) Exemplar blue identification where the user rescued an identification initially excluded by the computer for having a Mascot score below 25.
Figure 2-5: Exemplar yellow identification where the user changed the identification made by the computer (a) to (b). The decision was made based on prior knowledge of the sample preparation conditions, which included a pY immunoprecipitation.
Figure 2-6: Exemplar red identification where the user rejected an identification initially included by the computer.
A number of algorithmic approaches have been proposed to automate the proper
localization of PTMs, one of the key factors in the quality of an assignment. Among the
most widely used of these algorithms, ASCORE defines the PTM site(s) by assignment
likelihood based on key fragment ion peaks that differentiate multiple putative
localizations10. PhosphoRS uses a similar scheme based on more of the ions in the
spectrum and has been applied to data sets from a more varied selection of instruments11.
Mascot Delta Score (MDScore) leverages the probabilistic calculations included in the
Mascot score2 and bases localization confidence on the difference between the Mascot
scores for the leading putative assignments. MDScore has been shown to achieve results
comparable to ASCORE for localizing tyrosine phosphorylation12. Each of these
algorithms is based on the principle that the PTM site assignment that matches the most
peaks is correct; however, localization to one out of several closely spaced residues is
often difficult, regardless of the algorithm. In these cases, the user can leverage prior
knowledge of a site’s biological relevance, a protein’s sequence homology, and the
purification procedures employed during sample preparation to assist in defining the most
likely assignment of a given site (Figure 2-5).
From a larger perspective, all of these automated approaches face the difficult task of
limiting false positives while also limiting false negatives, thereby yielding the largest
dataset with the fewest incorrect assignments. The balance between false positives and false
negatives must be carefully considered, as the potential cost of false positives can be very
significant: false positives may distract research efforts and mislead experimental design,
potentially costing years of wasted effort5.
2.2 Manual Validation of MS/MS Data
Despite the dilemma presented by the specter of false positives and the need to rapidly
validate large numbers of peptide assignments, the tools available to more rigorously
analyze a dataset have been limited. The gold standard for MS/MS peptide identification
verification is manual validation, where precursor and fragment mass observations are
manually evaluated against a theoretical fragment ion spectrum from the search algorithm
assignment13. Further confirmation can be achieved with synthesis of the putative peptide
and chromatography coelution experiments, although these additional steps incur
significantly more cost and effort and are therefore typically reserved for the most
interesting peptide matches14. Through manual validation, the intensity of particular
fragment ions, coupled with the presence, absence, and missed assignment of other
fragment ions, can be evaluated by mass spectrometry ‘experts’ to assess the strength of
the assignment. While there may be some variation in the particular implementation of
the manual validation process, efforts have been made to codify the process to ensure
consistency. Key objectives for the validation process (which peaks must be attributable
to the sequence, expected relative intensity of peaks for key fragments, PTM localization,
and criteria for exclusion) have been nicely summarized in previous works13. Correct
peptide sequences typically have all fragment ions over 10% of the base peak intensity
assigned to a specific fragmentation site in the peptide. As seen in Figure 2-2b, unassigned
fragment ions in the MS/MS spectrum suggest either an incorrect sequence or a mixed
spectrum, both of which result in decreased confidence in the accuracy of the
assignment.
Manual validation can be tedious and time-consuming, as tens of significant peaks from
each MS/MS spectrum must be individually aligned and matched to the mass of a known
fragment derived from the proposed peptide. When applied to a large-scale dataset, this
approach can take weeks, if not months, to individually assess each of the thousands of
MS/MS spectra, with most of the time spent meticulously comparing lists of numbers and
relatively little time required for the final decision as to whether one should include the
assignment in the dataset. There is a clear disconnect between the speed with which millions of
spectra can be acquired and the time required for spectral validation. How, then, to
increase the speed of the manual validation process without sacrificing the level of rigor
and confidence in each peptide identification? As it turns out, many of the tasks
associated with manual validation can be assisted by computer automation, leaving the
task of approval of a particular peptide assignment to a human decision while expediting
many of the tedious tasks of collecting relevant MS/MS spectra from a particular analysis
and calculating the predicted fragment ions of a particular peptide. Here, we describe a
Computer-Aided Manual Validation (CAMV) package that mitigates the time-consuming
portions of the validation task and presents the relevant information in a streamlined
format, allowing the user to rapidly judge the accuracy and quality of database
identifications. Application of CAMV to a given LC-MS/MS analysis leads to a high-confidence set of peptide identifications and a concise summary of any quantitative
information from iTRAQ or SILAC, all of which can be accomplished in hours instead of
days or weeks.
2.3 Computer Aided Manual Validation Package
We have produced a computer-aided validation pipeline that expedites the validation
process without removing human judgment, helping to address the disconnect between
manual validation and high-throughput data generation while still maintaining data
quality. CAMV loads the MS/MS scans along with the putative assignment from the
search engine. Fragment ions are automatically labeled based on the sequence
assignment and according to a scoring rubric, which has been developed and tested to
assign the most likely labels to each fragment. To assist the user in assessing the quality
of the peptide assignment, additional information is provided in the output of CAMV,
including color-coded peak labeling and magnified views of the MS scan around the
precursor ion m/z and of the MS/MS scan around the iTRAQ marker ion
mass range (Figure 2-7). Color-coded peak labels allow the user to rapidly identify
unlabeled or mislabeled peaks, both of which are particularly valuable when confirming
peptide sequence assignment or comparing multiple PTM localizations within a given
peptide sequence. Through this overall design, the various aspects of manual validation
are apportioned to the most qualified entity: CAMV performs the most tedious
and time-consuming tasks associated with peak labeling, and the user is presented with
the most relevant information to quickly make the correct decision. Once an analysis has
been user-verified, publication-ready figures and spreadsheets containing the appropriate
quantitative data can be generated from accepted assignments.
Figure 2-7: The Graphical User Interface (GUI) for the MATLAB implementation of CAMV. The left tree contains a searchable list of protein hits in order of decreasing MASCOT score, each peptide assignment made to that protein, and all possible combinations of PTMs for each scan. The middle panel contains the MS2 scan data with pre-labeled peaks. On the right is a survey of the precursor window of the MS1 scan and a view of any quantification information associated with the assignment. The peptide ladder at the top summarizes the sequencing information: red for an identified fragment, black for missing. A peak color-coding scheme allows for rapid surveillance of the quality of the match: within tolerance (green), within 1.5x tolerance (magenta), unmatched peak (red), and isotopic peak (yellow). Peptides which are pre-emptively excluded by the software appear in red on the left-hand tree. For these peptides, the reason for pre-emptive exclusion is displayed at the top left of the MS2 panel in place of the sequence ladder, along with an option to proceed with processing.
2.3.1.1 Data Preprocessing
The typical workflow for CAMV analysis is represented schematically in Figure 2-8. Here
we describe the current configuration, although in most cases the specific software tool
embedded in the package that has been utilized for a given task could be replaced with an
alternate option. Initially, the raw mass spectrometry data file is converted into Mascot
Generic Format (MGF) through DTA Supercharge, which de-isotopes the MS data files,
converting precursor masses to the mono-isotopic masses. The MGF file is then searched
with MASCOT against the appropriate database, with the associated post-translational
modification options. The results of the MASCOT search are harvested as an XML file
with the input query data included, enabling one to match the MGF file query number to
the raw file scan number. To access the scan information in the MS raw data file, an
mzXML file is produced using ‘msconvert’ from the ProteoWizard package15.
[Workflow diagram: Thermo RAW file → DTA Supercharge → MASCOT Generic File (MGF) → MASCOT → MASCOT Search Results (XML) → Validation Software; in parallel, Thermo RAW file → Proteowizard → mzXML → Validation Software; outputs: Validated Spectra and iTRAQ Quantification]
Figure 2-8: Peptide Sequence Identification and Validation Workflow. Raw data from the instrument is converted with DTA Supercharge into an MGF file that is searchable in MASCOT2. In parallel, the Proteowizard15 package is used to convert the RAW file into a format where the scan information is readily accessible. These two paths converge on the Validation Software, which matches the search results with the relevant scan data in preparation for user verification. Once complete, figures of validated scans are exported as PDFs and quantification information as a spreadsheet.
2.3.1.2 PTM Identification
Search algorithm (e.g. MASCOT) results are retrieved as an XML file containing global
information about fixed and variable modifications used in the search. Fixed
modifications are applied to all instances of a given residue while variable modifications
may or may not be present on any instance of a given residue. Scan-specific
modifications to be applied to a particular MASCOT identification are included in the
XML file and are taken into account when generating fragment masses; terminal fixed
modifications are applied to every peptide.
2.3.1.3 Modified Sequence Generation
Most of the current searching algorithms perform well at assigning the base peptide
sequence and the appropriate number of modifications, but perform poorly at determining
the correct site-specific localization of each modification. This poor performance is due
to the number and similarity of the theoretical fragmentation patterns: for each
amino acid sequence and set of variable modifications, several permutations may exist,
and two permutations are typically distinguished by differences in only 2 fragment ions. To
facilitate this PTM localization task, CAMV generates all permissible combinations with
a recursive search. Although the system is easily modifiable to include additional
modifications, in the current configuration, CAMV handles the following modifications:
Fixed:
• iTRAQ labeling
• Cysteine Carbamidomethylation
Variable:
• Serine, Threonine, and Tyrosine Phosphorylation
• Lysine Acetylation (concurrently with iTRAQ)
• Methionine Oxidation
• Light, Medium, and Heavy SILAC Labeling of Arginine and Lysine (not concurrently with Lysine Acetylation or iTRAQ)
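A recursive enumeration of modification placements of the kind described above might look like the following sketch. It handles a single modification type (phosphorylation on S, T, or Y by default); the function names and the lowercase "p" prefix notation are assumptions for illustration and are not taken from the CAMV source.

```python
def place_mods(sequence, n_mods, allowed="STY", start=0):
    """Recursively enumerate every way to place n_mods modifications on allowed residues.

    Returns a list of tuples of 0-based residue positions, in increasing order.
    """
    if n_mods == 0:
        return [()]  # nothing left to place: one (empty) placement
    placements = []
    for i in range(start, len(sequence)):
        if sequence[i] in allowed:
            # Fix a modification at position i, then place the rest to its right.
            for rest in place_mods(sequence, n_mods - 1, allowed, i + 1):
                placements.append((i,) + rest)
    return placements


def annotate(sequence, positions, tag="p"):
    """Render a placement as a string, e.g. positions (1,) on 'ASTYK' -> 'ApSTYK'."""
    return "".join(tag + aa if i in positions else aa for i, aa in enumerate(sequence))


# All singly and doubly phosphorylated forms of a peptide with three candidate sites.
singles = [annotate("ASTYK", p) for p in place_mods("ASTYK", 1)]
doubles = [annotate("ASTYK", p) for p in place_mods("ASTYK", 2)]
```

Here `singles` contains the three singly phosphorylated forms and `doubles` the three doubly phosphorylated forms. Requesting more modifications than there are candidate sites returns an empty list.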
2.3.1.4 Theoretical Fragment Ion Generation
For each candidate permutation of variable modifications, the full set of theoretical
fragment ion masses is generated. To calculate these masses, the N-terminal mass is
determined based on the type of iTRAQ (e.g. none, 4-plex, 8-plex) applied to the sample.
Residues are then added in sequence order, starting from the N-terminus, with all resulting b-ion
masses recorded. With the full sequence, the precursor ion m/z values of charge states 1-5
are calculated and stored. Next, all permissible combinations of neutral losses from each
b-ion are calculated and stored. Permissible losses include H2O, NH3, H3PO4 (from pSer
or pThr), HPO3 and HPO3+H2O (from pTyr), and SOCH3 (from carbamidomethyl C).
This list can be expanded as needed, but it is important not to include unlikely
losses, which can lead to over-fitting of the data (i.e. when unlikely fragment ions are
assigned to the peaks, it becomes more difficult to differentiate false positives from true positives).
Each b-ion and loss is calculated for charge states up to and including the charge state of
the precursor. Each b-ion and loss also produces an a-ion fragment with the loss of CO.
The process is repeated from the C-terminus of the peptide to produce y-ions and the
corresponding losses. Additional frequently observed fragment ions are included: 216.04
(for pTyr) and bn+86 (for 8plex iTRAQ).
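The fragment mass calculation can be sketched in a few lines. This is a simplified illustration rather than the CAMV implementation: neutral losses are left out, a-ions are computed as the corresponding b-ion minus CO (27.995 Da), and the function name, the label format, and the `mods` and `n_term` parameters are assumptions.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER, CO = 1.00728, 18.01056, 27.99491
PHOSPHO = 79.96633  # HPO3 added to S, T, or Y


def fragment_ions(sequence, mods=None, max_charge=1, n_term=0.0):
    """Return {label: m/z} for the b-, a-, and y-ion series of a peptide.

    mods:   {0-based position: mass delta} for modified residues (hypothetical layout).
    n_term: mass added to the N-terminus, e.g. an isobaric tag (0.0 if unlabeled).
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0) for i, aa in enumerate(sequence)]
    ions = {}
    n = len(sequence)
    for k in range(1, n):
        b_neutral = n_term + sum(masses[:k])     # first k residues from the N-terminus
        y_neutral = sum(masses[n - k:]) + WATER  # last k residues plus H2O
        for z in range(1, max_charge + 1):
            charge = "+" * z
            ions[f"b{k}{charge}"] = (b_neutral + z * PROTON) / z
            ions[f"a{k}{charge}"] = (b_neutral - CO + z * PROTON) / z
            ions[f"y{k}{charge}"] = (y_neutral + z * PROTON) / z
    return ions


plain = fragment_ions("PEPTIDE")
phospho = fragment_ions("PEPTIDE", mods={3: PHOSPHO})  # phosphothreonine at residue 4
```

Comparing `plain` and `phospho` shows which fragments carry the modification: every b-ion from b4 onward and every y-ion from y4 onward shifts by the phosphate mass, while the remaining fragments are unchanged. These shifted ions are the ones that discriminate between alternative site assignments.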
2.3.1.5 Fragment Alignment
CAMV attempts to assign a label to all peaks in the MS/MS spectra whose intensity is
greater than 10% of the base peak intensity. To probe further into sparse regions of the
MS/MS spectrum, label assignments are attempted on all peaks whose intensity exceeds
2.5% of the base peak intensity if they are a local maximum in a sparsely populated
region (e.g. +/- 25 m/z). The fragment ion matching tolerances can be
adjusted for high- vs. low-resolution fragment ion spectra. The current configuration is
set for validation of low-resolution, linear ion trap CID spectra. With this setting,
predicted fragment masses that are within 1000 ppm (0.1% of the m/z) of the observed
peak are candidate matches that are denoted with a label and a green asterisk. Predicted
fragment ion masses that are within 1500 ppm (0.15% of the m/z) of the observed peak
may also be labeled and accompanied by a magenta asterisk to indicate the lower
confidence assignment. Isotope peaks of an identified peak are labeled with a yellow
asterisk. It is worth noting that it is straightforward to change the thresholds for these
matches to accommodate high mass accuracy, high resolution MS/MS data from a time-of-flight or Orbitrap mass analyzer.
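The peak selection and tolerance rules above can be expressed compactly, as in the sketch below. It reflects one reading of the text: a peak in a "sparsely populated region" is taken to be the most intense peak within +/- 25 m/z, and the ppm error is computed relative to the observed m/z. Function and parameter names are hypothetical.

```python
def select_peaks(peaks, main_frac=0.10, sparse_frac=0.025, window=25.0):
    """Return the (mz, intensity) peaks that should receive a label attempt."""
    base = max(intensity for _, intensity in peaks)
    selected = []
    for mz, intensity in peaks:
        if intensity > main_frac * base:
            selected.append((mz, intensity))
        elif intensity > sparse_frac * base:
            # Keep weaker peaks only if they are the local maximum within +/- window m/z.
            neighbours = [i2 for mz2, i2 in peaks if abs(mz2 - mz) <= window]
            if intensity >= max(neighbours):
                selected.append((mz, intensity))
    return selected


def classify(observed_mz, predicted_mz, tol_ppm=1000.0, loose_factor=1.5):
    """Color code for one observed/predicted pair using the low-resolution settings."""
    error_ppm = abs(observed_mz - predicted_mz) / observed_mz * 1e6
    if error_ppm <= tol_ppm:
        return "green"      # within tolerance
    if error_ppm <= tol_ppm * loose_factor:
        return "magenta"    # within 1.5x tolerance, lower confidence
    return "unmatched"      # shown in red if no predicted fragment is closer


spectrum = [(200.0, 100.0), (300.0, 5.0), (302.0, 3.0), (600.0, 12.0), (610.0, 2.0)]
to_label = select_peaks(spectrum)
```

For high-resolution MS/MS data, only `tol_ppm` would need to change (for example to a value in the tens of ppm); the selection logic is unaffected.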
Peaks in the MS/MS spectrum may match to multiple different theoretical fragment ions.
To highlight the most likely match, we have developed a scoring rubric; the optimal label
for a given peak is assigned based on the highest score in the following rubric:
• +12 for precursor fragments
• +10 for b- or y-series ions
• -1 for each neutral loss
• +0.5 for closest match to the observed mass
With this rubric, we have tried to capture the most commonly occurring fragment ions
associated with correct peptide identifications. Depending on the collision energy used to
drive fragmentation, the most abundant ions in the MS/MS spectrum are typically those
associated with fragmentation of the peptide backbone (e.g. y- and b-type ions) and
neutral losses from the precursor ion. These candidates are therefore given the highest
scores in the above rubric. Each additional neutral loss from these main fragmentation events
is less likely, so fewer points are awarded to these candidate fragment ion labels.
Effectively, if the peptide assignment is correct, then an abundant ion in the MS/MS
spectrum is more likely to be a b- or y-type ion than a b- or y-type ion
that has undergone 4 or 5 neutral losses. In the current configuration of CAMV, the total
points associated with a spectrum are not considered in the final decision as to the quality
of the peptide assignment. However, it is conceivable that the number of high-scoring vs.
low-scoring fragment ion assignments could be reported as another metric to assist the
user in their evaluation of each spectrum. Further color-coding to separate high- and
low-scoring peaks is another option that could be considered in future versions of
CAMV. From an extensive amount of manual and computational analysis of true- vs.
false-positives, the number of unlabeled abundant fragment ions is often critical in
determining false-positives16. Therefore unlabeled abundant fragment ions, as defined by
observed peaks that do not have any predicted fragments within 1500 ppm, are labeled
with a red circle.
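The rubric and the unlabeled-peak test can be illustrated with a minimal Python sketch. This is not the CAMV implementation: the Candidate class, the function names, and the decision to sum the rubric terms additively are assumptions made for illustration. Only the point values and the 1500 ppm window are taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # label text, e.g. "y7" (illustrative only)
    kind: str            # "precursor", "b", "y", or "other"
    neutral_losses: int  # number of neutral losses in this candidate
    mz: float            # theoretical m/z


def score_candidates(observed_mz, candidates):
    """Score every candidate label for one observed peak with the rubric."""
    if not candidates:
        return []
    closest = min(candidates, key=lambda c: abs(c.mz - observed_mz))
    scored = []
    for c in candidates:
        score = 0.0
        if c.kind == "precursor":
            score += 12.0                # precursor fragments
        elif c.kind in ("b", "y"):
            score += 10.0                # b- or y-series ions
        score -= 1.0 * c.neutral_losses  # -1 for each neutral loss
        if c is closest:
            score += 0.5                 # closest match to the observed mass
        scored.append((score, c))
    return scored


def best_label(observed_mz, candidates):
    """Return the highest-scoring candidate, or None if there are none."""
    scored = score_candidates(observed_mz, candidates)
    return max(scored, key=lambda pair: pair[0])[1] if scored else None


def is_unlabeled(observed_mz, predicted_mzs, tol_ppm=1500.0):
    """True if no predicted fragment lies within tol_ppm of the peak."""
    return all(abs(observed_mz - mz) / mz * 1e6 > tol_ppm
               for mz in predicted_mzs)
```

For example, for a peak observed at m/z 800.385 with candidates y7 (800.40) and a b9 ion with two neutral losses (800.38), the b9 candidate is closer in mass but scores 10 - 2 + 0.5 = 8.5, so the y7 label, scoring 10, is chosen.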
2.3.1.6 Label Renaming
Since the scoring rubric does not consider all factors in assigning a label to a given peak
in the MS/MS spectrum, the user may change the assigned label. Clicking on an assigned
label will display a list of predicted fragment ions that match within tolerance, allowing
another label to be chosen. It is also possible to select a label outside of the tolerance
window, or outside of the user-defined fragmentation options listed above. This
user-defined label will be applied to the peak, but no mass will be calculated and it will be
shown with a black asterisk.
2.3.1.7 Quantitative Accuracy and MS Scan windows
Importantly, CAMV also allows the user to assess the quantitative accuracy of isotopic
labeling strategies. For instance, by providing an image of the MS spectrum in the m/z
region of the precursor ion isolation window (right side, Figure 2-7), users can determine
whether SILAC peaks might have overlapping contaminant peaks, as can occur in
complex mixtures, and then select whether the quantification of this pair should be
included in the final dataset. On a similar note, for iTRAQ quantification, the user can
determine whether another ion above a given intensity threshold was present in the
isolation window and may therefore have altered the iTRAQ marker quantification
values. The size of the isolation window highlighted by the gray box in this image can be
adjusted by the user according to the settings used for data collection. The image of the
MS spectrum in the m/z region of the precursor ion isolation window also allows the user
to evaluate whether the charge state of the precursor ion was assigned correctly, an
important consideration for low-level precursor ions in complex mixtures.
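The isolation-window check described above can be sketched as follows. This is an illustrative Python sketch, not CAMV code: the function name, the default window width, the relative intensity threshold, and the m/z tolerance are hypothetical values, and the precursor's own isotope peaks are recognized with a simple spacing rule based on the assigned charge state.

```python
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference (Da)


def contaminating_peaks(peaks, precursor_mz, charge, window_width=2.0,
                        min_rel_intensity=0.1, mz_tol=0.02):
    """Return MS1 peaks inside the isolation window that are not part of
    the precursor isotope envelope and exceed a relative intensity cutoff.

    peaks is a list of (m/z, intensity) tuples from the MS scan.
    All default parameter values are placeholders for illustration.
    """
    half = window_width / 2.0
    in_window = [(mz, inten) for mz, inten in peaks
                 if abs(mz - precursor_mz) <= half]
    if not in_window:
        return []
    # Use the peak nearest the precursor m/z as the precursor signal.
    _, precursor_intensity = min(
        in_window, key=lambda p: abs(p[0] - precursor_mz))
    flagged = []
    for mz, inten in in_window:
        # Distance from the precursor in units of the isotope spacing.
        steps = (mz - precursor_mz) * charge / ISOTOPE_SPACING
        offset = abs(steps - round(steps)) * ISOTOPE_SPACING / charge
        is_isotope = offset <= mz_tol
        if not is_isotope and inten >= min_rel_intensity * precursor_intensity:
            flagged.append((mz, inten))
    return flagged
```

A non-empty result suggests that the iTRAQ or SILAC quantification for that scan deserves a closer look. A co-isolated ion that happens to fall on the precursor's isotope spacing would be missed by this simple rule.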
2.3.1.8 Validating Spectra
The tree on the left-hand side of the GUI contains proteins rank-ordered by their
respective MASCOT scores and peptides for each protein listed alphabetically (first) and
numerically (second) in order of the MS/MS spectrum match (Figure 2-7). For each
peptide matched to an MS/MS spectrum, nested under each entry is an assignment with
the appropriate number of modifications, initially marked by a filled gray circle. After
evaluating the peptide assignment based on all of the above criteria, the user may select
from one of three options located on the bottom right of the MS2 window: “Accept”,
“Maybe”, and “Reject”. The choice is recorded, so that the peptide assignments can be
sorted into lists for future reference, and displayed graphically by changing the color of
the filled circle in front of the assignment, allowing the user to keep track of their
progress. A search feature is included to locate peptide identifications based on scan
number or protein name.
2.3.1.9 Exporting Validated Spectra
Two buttons on the bottom right of the GUI export PDF figures. “Print Accept List” will
print all accepted assignments to the “output\run_name\accept\” folder. Similarly, “Print
Maybe List” does the same for all assignments marked “Maybe”.
2.3.1.10 Exporting Quantification
iTRAQ or SILAC quantification for all peptides marked “Accept” will be added to an
Excel spreadsheet in the “output\run_name\” folder simultaneously when the “Print
Accept List” button is pressed.
2.3.1.11 Preemptive Exclusion
To facilitate manual validation, many spectra are pre-emptively removed from peak
assignment to reduce the time spent on enumerating and preprocessing large numbers of
sequences that will never be accepted. The criteria for pre-emptive exclusion include:
• Excessively long peptide sequence
• Low database score
• Excessive number of PTM permutations
• Excessive number of peaks to identify
• Incomplete SILAC labeling
• Poor MS1 data
• Contaminated precursor window (user defined)
The heuristics for most of these criteria are user-definable in CAMV, and can be altered
depending on the experimental conditions. For instance, longer peptides are produced by
proteolytic digestion with some enzymes, and therefore the ‘excessively long peptide
sequence’ and ‘excessive number of peaks to identify’ settings might need to be adjusted.
An important feature of CAMV is that these pre-emptively removed spectra are still
listed in the output (see Figure 2-7), along with the reason(s) why the spectra were not
processed. For each of these spectra, the user can click the link, evaluate the reason for
removal, and request processing of the spectra. The spectra and peak labels are then
displayed for manual validation.
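A pre-emptive exclusion step of this kind could be sketched as below. This Python sketch is illustrative only: the class names, field names, and threshold values are hypothetical rather than CAMV's settings, and only four of the seven criteria are shown (the SILAC, MS1, and precursor-window checks need spectral data not modeled here). Recording every reason for exclusion mirrors the behavior described above, where excluded spectra remain listed and can be requested for processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExclusionLimits:
    # All defaults are placeholders for illustration, not CAMV's settings.
    max_sequence_length: int = 40
    min_database_score: float = 25.0
    max_ptm_permutations: int = 1000
    max_peaks: int = 500


@dataclass
class ScanSummary:
    scan_number: int
    sequence: str
    database_score: float
    ptm_permutations: int
    n_peaks: int


def exclusion_reasons(scan, limits=ExclusionLimits()):
    """Return every reason a scan would be skipped (empty list if none)."""
    reasons = []
    if len(scan.sequence) > limits.max_sequence_length:
        reasons.append("excessively long peptide sequence")
    if scan.database_score < limits.min_database_score:
        reasons.append("low database score")
    if scan.ptm_permutations > limits.max_ptm_permutations:
        reasons.append("excessive number of PTM permutations")
    if scan.n_peaks > limits.max_peaks:
        reasons.append("excessive number of peaks to identify")
    return reasons


def triage(scans, limits=ExclusionLimits()):
    """Split scans into those to process and those excluded with reasons."""
    to_process, excluded = [], {}
    for scan in scans:
        reasons = exclusion_reasons(scan, limits)
        if reasons:
            excluded[scan.scan_number] = reasons  # kept for later rescue
        else:
            to_process.append(scan)
    return to_process, excluded
```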
2.3.2 Exemplary Application to Tyrosine Phosphorylation Analysis
To evaluate the application of CAMV to facilitate manual validation of MS/MS spectral
assignments, we selected an example data set that had already been manually curated,
thereby providing a benchmark in terms of MS/MS spectral assignment and speed of
analysis. The particular mass spectrometry dataset chosen was a quantitative analysis of
tyrosine phosphorylation in glioblastoma patient tumor xenografts, where accuracy in
terms of phosphorylation site identification and quantification was critical for
determining the signaling pathways in these tumors17. Since the sample preparation
involves a 2-stage enrichment for tyrosine phosphorylated peptides from a small amount
of starting material, the sample complexity is fairly low, and the total number of MS/MS
spectral assignments above a given threshold is less than one thousand.
As expected, CAMV radically accelerated the process while minimally impacting
the quality of the final dataset. In fact, the main benefit to the user is a drastic reduction
in the time necessary to validate an analysis. Previously, the manual validation process
would require days to weeks depending on the complexity of the sample. With CAMV,
the preprocessing step (not including database search) takes less than an hour; often, it is
only a matter of minutes for phosphotyrosine-enriched iTRAQ-labeled or SILAC-encoded samples. The user-friendly interactive interface with color-coded peak labels
makes decision-making on the various spectra fairly straightforward, and the total time
required for the analysis was reduced from days/weeks to hours. Although there is still
the need for hours of user time to validate the data, the result is a high confidence dataset
with minimal false positives, and virtually all of the user interaction time is focused on
decision-making rather than tedious table comparisons and arithmetic calculation.
Figure 2-9: All peptide matches following user validation of the dataset. Color code: agreement between algorithm and user, 201/317 (green); alternate identification chosen by user, 4/317 (yellow); additional assignment included by user, 79/317 (blue); and assignment rejected by user, 33/317 (red).
Figure 2-10: Distribution of user and CAMV decisions versus MASCOT score. Note that in many cases the user rescued the spectra that were pre-emptively excluded by the software, while in other cases the user rejected the spectral assignment based on the poor quality of the match.
The statistics from the final data set resulting from user validation assisted by CAMV
are highlighted in Figure 2-9. The final set contained 284 scans, 201 of which
passed all of the internal criteria and were also selected as high confidence assignments
by the user (Figure 2-9 green, also see Figure 2-4a). Of the 284 spectra in the final data
set, the algorithm pre-emptively excluded 79 scans based on one or more of the criteria
listed above (Figure 2-10 blue, also see Figure 2-4b). For many of these 79 spectra, the
individual scan Mascot score was below 25, or there were too many peaks to be
identified. The heuristics underlying the pre-emptive exclusion decision are user-definable and can be altered in the future to tailor the number of exclusions based on
experimental conditions and dataset quality. Because the GUI lists both accepted and
pre-emptively excluded assignments, it was fairly straightforward to rapidly check the
quality of the excluded spectra and rescue those that were high-confidence matches.
These false negatives were reintroduced to the dataset following user validation. Four
scans (Figure 2-9 yellow, also see Figure 2-5) were assigned to an alternate PTM state by
the user; although technically not false-positives, the site of modification was incorrectly
assigned by our scoring metric. Additionally, there were 33 false positive identifications
(Figure 2-9 red, also see Figure 2-6) where the user did not find a sufficient match to the
sequence identified by MASCOT and therefore removed the spectra from the final
dataset. The breakdown of accepted and rejected MS/MS spectra versus MASCOT score
is shown in Figure 2-10 for both the user's manual validation and the CAMV software. Note that in
each bin there were some spectra that were initially rejected that were rescued by the user
and some spectra that would have been accepted based on a threshold scoring that were
rejected by the user during the manual validation process. At the end of the analysis,
despite a fairly rigorous initial threshold, CAMV required less than a day and removed
approximately 10% false positives, based on low confidence in the MASCOT-identified
spectral assignments.
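The counts reported in Figure 2-9 can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers quoted above; the dictionary keys and function name are illustrative.

```python
def summarize(counts):
    """Return total assignments, final dataset size, and percent rejected."""
    total = sum(counts.values())
    final = total - counts["rejected"]
    percent_rejected = 100.0 * counts["rejected"] / total
    return total, final, percent_rejected


# Counts quoted in the Figure 2-9 caption.
figure_2_9 = {
    "agreement": 201,   # algorithm and user agree (green)
    "alternate": 4,     # alternate identification chosen by user (yellow)
    "rescued": 79,      # pre-emptively excluded, then included by user (blue)
    "rejected": 33,     # rejected by user (red)
}

total, final, percent_rejected = summarize(figure_2_9)
```

The four categories sum to 317 assignments, 284 of which remain in the final dataset, and the 33 rejected assignments correspond to about 10.4% of the total, consistent with the approximately 10% stated above.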
2.4 Conclusion
CAMV is a software package to aid the manual validation process of MS/MS peptide
identification data. This software package has drastically improved a tedious and time-consuming task that vastly exceeded the sample analysis time, creating a backlog in the
workflow of many projects. By partially automating this process, we have shifted the
human focus to the decision-making portion of the task, allowing user judgment to be
rapidly applied. We hope that CAMV will provide researchers with a streamlined way to
perform their manual validation without having to rely solely upon false discovery rates
when analyzing and reporting their data.
2.5 Appendices
2.5.1 Package Availability
The CAMV Software Package is freely available at http://web.mit.edu/icbp/data/
2.5.2 Version Information
All analyses were performed with the following software versions:
Proteowizard: Release 2.0.1749 (used for included data), 3.0.4323 (update included with
CAMV distribution)
Thermo XCalibur: 2.1.0 SP1.1162
DTA Supercharge: 2.0b1 (part of MSQUANT 2.0b1)
MATLAB: 7.13.0.564 (R2011b)
MATLAB MCR: 7.16
Mascot: Release 2.1.03, additional support for 2.4.1
2.5.3 Mascot Search Parameters
Database: Human-2009 (37743 sequences; 17175626 residues)
Enzyme: Trypsin
Fixed modifications: iTRAQ8plex (K), iTRAQ8plex (N-term), Carbamidomethyl (C)
Variable modifications: Oxidation (M), Phospho (ST), Phospho (Y)
Mass values: Monoisotopic
Protein Mass: Unrestricted
Peptide Mass Tolerance: ± 10 ppm
Fragment Mass Tolerance: ± 0.8 Da
Max Missed Cleavages: 2
Instrument type: Josh-a-ion-LTQ
Number of queries: 14406
2.6 Bibliography
1. Yates, J. R., Eng, J. K., McCormack, A. L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal. Chem. 67, 1426–1436 (1995).
2. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).
3. Craig, R., Cortens, J. C., Fenyo, D. & Beavis, R. C. Using annotated peptide mass spectrum libraries for protein identification. J. Proteome Res. 5, 1843–1849 (2006).
4. Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011).
5. White, F. M. The potential cost of high-throughput proteomics. Sci Signal 4, pe8 (2011).
6. Käll, L., Storey, J. D., MacCoss, M. J. & Noble, W. S. Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J. Proteome Res. 7, 29–34 (2008).
7. Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Meth 4, 207–214 (2007).
8. Klammer, A. A., Reynolds, S. M., Bilmes, J. A., MacCoss, M. J. & Noble, W. S. Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification. Bioinformatics 24, i348–56 (2008).
9. Martin, D. M. A. et al. Prophossi: automating expert validation of phosphopeptide-spectrum matches from tandem mass spectrometry. Bioinformatics 26, 2153–2159 (2010).
10. Beausoleil, S. A., Villén, J., Gerber, S. A., Rush, J. & Gygi, S. P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat. Biotechnol. 24, 1285–1292 (2006).
11. Taus, T. et al. Universal and confident phosphorylation site localization using phosphoRS. J. Proteome Res. 10, 5354–5362 (2011).
12. Savitski, M. M. et al. Confident phosphorylation site localization using the Mascot Delta Score. Molecular & Cellular Proteomics 10, M110.003830 (2011).
13. Nichols, A. M. & White, F. M. in Methods In Molecular Biology 492, 143–160 (Humana Press, 2009).
14. Zhang, J. et al. MS/MS/MS reveals false positive identification of histone serine methylation. J. Proteome Res. 9, 585–594 (2010).
15. Kessner, D., Chambers, M., Burke, R., Agus, D. & Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 24, 2534–2536 (2008).
16. Lahesmaa-Korpinen, A.-M., Carlson, S. M., White, F. M. & Hautaniemi, S. Integrated data management and validation platform for phosphorylated tandem mass spectrometry data. Proteomics 10, 3515–3524 (2010).
17. Johnson, H. et al. Molecular characterization of EGFR and EGFRvIII signaling networks in human glioblastoma tumor xenografts. Molecular & Cellular Proteomics 11, 1724–1740 (2012).
3 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
3.1 Introduction
Innovations over the past decade in methodology and instrumentation have greatly
increased the capability of mass spectrometry to identify a significant fraction of the
proteome, including thousands of post-translational modifications (PTMs)1. Through the
use of quantitative strategies, including label-free proteomics, metabolic stable isotope
labeling, and chemical stable isotope labeling, it is possible to analyze two or more
biological samples and generate relative quantification data detailing differences in
protein expression or modification across different cellular conditions2,3. Multiplex
labeling has effectively improved the throughput of the approach, enabling the
comparison of multiple different biological samples in a single analysis, with
quantification typically determined relative to a basal condition or pooled standard.
Multiplexed relative quantification has been useful for elucidating temporal dynamics of
phosphorylation signaling following growth factor stimulation, with changes in protein
PTM levels in stimulated vs. non-stimulated conditions highlighting pathways involved
in signal processing and cellular response4. When combined with bioinformatic
algorithms, such as clustering, relative quantification of PTM level has enabled
researchers to predict the function of poorly characterized phosphorylation sites5.
Through quantification of downstream biological response and statistical analysis, key
sites have been identified as regulators of observed cellular behavior, providing further
functional and phenotypic association for novel phosphorylation sites6.
Despite the ability to generate biological insight with a combination of relative
quantification of phosphorylation dynamics and statistical modeling or bioinformatics,
additional information is encoded in the absolute levels of site-specific phosphorylation.
For instance, relative fold change of a signal carries limited information about the
network’s basal state or response to stimuli, as a two-fold change from a high basal state
(e.g. 30% to 60% activation) can lead to a very different biological response from a two-fold change from a very low basal state (e.g. 1% to 2% activation). Absolute
quantification also enables comparison between multiple phosphorylation sites on a given
protein under different conditions. Since each phosphorylation site might recruit adaptor
proteins associated with particular signaling pathways, absolute quantification of
phosphorylation can provide critical data relating receptor stimulation and pathway
activation.
Absolute quantification data is not typically available from standard mass spectrometry,
western-blot, or reverse-phase protein array experiments, as the calibration curve is
unique to each phosphorylated peptide due to different response profiles (e.g. ionization
potential or antibody binding affinity). Therefore multiple methods for estimating
absolute quantification have been established. Among these, one of the most common
MS-based methods is isotope dilution, or AQUA, in which a synthetic isotope-encoded
peptide is added to the sample pre-analysis, and quantification is based on the relative
peak heights or area under the chromatographic elution curve for the endogenous and
synthetic peptide standard7. Although this technique is fairly straightforward, it relies on
single point calibration and may provide erroneous estimates if the endogenous peptides
span a large dynamic range across multiple conditions, or if the concentrations of these
peptides fall outside of the linear response range of the instrument. Alternate methods for
absolute quantification include the generation of standard curves through separate
analyses of standard peptides at known concentration7. Since the complex background of
the biological context can significantly alter the signal intensity of the endogenous
peptide due to competition for charge in the ionization process in MS, comparison to a
standard curve generated in a neat background can lead to significant estimation errors.
One solution would be to add different concentrations of a standard peptide to the sample
so that the standard and endogenous peptides experience the same local sample context,
but then one is faced with the issue of differentiating the standards from the endogenous
peptides. Ideally, one would like to combine the multiplexed capabilities of chemical
labeling with standard curves internal to the sample, thus allowing the accurate, absolute
quantification of given peptides across multiple biological conditions within a single
analysis.
With these goals in mind, we have developed Multiplex Absolute Regressed
Quantification of Internal Standards (MARQUIS), an MS-based technique to measure the
absolute amount of modification on many post-translational sites of interest
simultaneously from multiple samples within a single analysis. The method utilizes
multiple doses of internal standards to bracket a broad abundance range allowing
simultaneous analysis of treatment conditions expected to produce widely varied
endogenous signal magnitudes. We applied MARQUIS to quantify the absolute amount
of phosphorylation on several sites in the Epidermal growth factor receptor (EGFR)
network at different time points following stimulation of EGFR with different ligands.
Absolute quantification of phosphorylation dynamics in this system highlights novel
wiring within the network, insight that was not previously available from relative
quantification data. MARQUIS enables the acquisition of accurate absolute quantification
data, is applicable across multiple instrument platforms, and is equally applicable to
protein expression profiling and PTM quantification.
3.2 Methods
Sample Preparation: MCF10A cells were serum-starved 24 hours prior to stimulation
with 20nM EGF (Peprotech) for 0.5, 1, 2, 3, 5, 10 or 30 minutes. Media was aspirated
and cells were lysed with 2ml of cold 8M urea with 1mM activated sodium
orthovanadate. Lysate was frozen at -80 °C until further processing. Protein yield was
quantified by BCA assay (Pierce) in order to ensure equal loading. Standard peptides
were added to samples as in Supplementary Table 1. Samples were reduced with 40µl
10mM DTT in ammonium acetate pH 8.9 for 1 hour at 56 °C. Samples were alkylated
with 55mM iodoacetamide in ammonium acetate pH 8.9 for 1 hour at room temperature.
8mL ammonium acetate and 40µg of sequencing grade trypsin were added prior to 16
hours incubation at room temperature. Lysates were acidified with 1 mL glacial acetic
acid and protein was purified with Waters Sep-Pak columns. Samples were lyophilized
and subsequently labeled with iTRAQ 8plex (AbSciex) per manufacturer’s directions.
Standard Peptide Preparation and Quality Assurance: Standard peptides were
synthesized at the Koch Institute Biopolymers facility and amino acid analysis was
performed to measure concentration by AAA Service Laboratories (Damascus, OR).
Synthetic peptides were analyzed by LC MS/MS to confirm sequence and purity.
3.2.1 Multiple Reaction Monitoring (MRM):
Immunoprecipitation: 70µl protein G agarose beads (Calbiochem IP08) were rinsed in
400µl IP Buffer (100mM Tris, 0.3% NP-40, pH 7.4) prior to an 8 hour incubation with a
cocktail of three phosphotyrosine-specific antibodies (12µg 4G10 (Millipore), 12µg PT66
(Sigma), and 12µg PY100 (CST)) in 200ml IP Buffer. Antibody cocktail was removed
and beads were rinsed with 400µl of IP Buffer. Labeled samples were resuspended in
150µl iTRAQ IP Buffer (100mM Tris, 1% NP-40, pH 7.4) + 300µl milliQ water and pH
was adjusted to 7.4 (with 0.5M Tris HCl pH 8.5) prior to addition to prepared beads and
16 hour incubation. Supernatant was removed and beads were rinsed 3 times with 400µl
Rinse Buffer (100mM Tris HCl, pH 7.4). Peptides were eluted in 70µl of Elution Buffer
(100mM glycine, pH 2) for 30 minutes at room temperature.
Immobilized Metal Affinity Chromatography (IMAC) Purification: A fused silica
capillary (FSC) column (200µm ID x 10cm length) was packed with POROS 20MC
beads (Applied Biosystems cat.no. 1-5429-06). Column was rinsed with 100 mM EDTA
pH 8.9 in ultrapure water to remove residual iron and then rinsed with ultrapure water to
remove EDTA before fresh iron was deposited by rinsing with 100 mM FeCl3 in
ultrapure water. Excess iron was removed with 0.1% acetic acid prior to loading the IP
elution. The column was washed with 25% MeCN, 1% HOAc, 100 mM NaCl in
ultrapure water to remove non-specifically bound peptides and then rinsed with 0.1%
HOAc. Peptides were eluted with 250mM NaH2PO4 in ultrapure water and collected on a
pre-column (100µm ID x 10cm packed with 10µm C18 beads (YMC gel, ODS-A, 12nm,
S-10 µm, AA12S11)). This pre-column was rinsed with 0.2M acetic acid prior to LCMS
analysis.
Liquid Chromatography Mass Spectrometry: The pre-column was connected in-line
with an analytical column (50µm ID x 10cm packed with 5µm beads (YMC gel, ODS-AQ, 12nm, S-5 µm, AQ12S05)) with integrated electrospray emitter tip [Martin 2000].
A 140 minute gradient from aqueous (0.2M acetic acid) to organic phase (0.2M HOAc,
70% MeCN) was used to separate peptides.
Absolute Quantification: Analyses were performed on a Thermo TSQ Quantum in
MRM mode. Endogenous and heavy-labeled versions of each peptide of interest were
tracked throughout the experiment with two sequence specific transitions
(precursor/prominent b or y ion) and eight iTRAQ transitions (precursor/label peak).
Elution time was determined based on coelution of iTRAQ transitions with sequence
specific transitions for both standard and endogenous peptides of the MARQUIS pair. Area
Under the Curve (AUC) was collected for each transition. Data were corrected by iTRAQ
isotope purity and synthetic peptide loading control. A calibration curve was created for
each peptide to relate the AUC of the iTRAQ transition to the amount of standard
included. The MATLAB function lscov was used for the regression, with the intensity of the signal also serving
to appropriately scale the variance. This calibration curve was then used to calculate the
amount of peptide present in each of the endogenous conditions based on the
corresponding iTRAQ transition AUC.
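The calibration step can be illustrated with a weighted least-squares fit written in plain Python. This is a sketch under stated assumptions, not the analysis code used here: it assumes a straight line with an intercept, and it assumes that "scaling the variance by signal intensity" corresponds to weights of 1/AUC, which is one common reading of an lscov-style weighted fit. The function names are illustrative.

```python
def weighted_linear_fit(x, y, w):
    """Fit y = slope * x + intercept, minimizing sum(w * residual**2)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(standard_amounts_fmol, standard_auc):
    """Build the calibration curve from the heavy-labeled standard channels.

    Assumption: variance grows with signal, so each point is weighted by
    the reciprocal of its AUC.
    """
    weights = [1.0 / auc for auc in standard_auc]
    return weighted_linear_fit(standard_amounts_fmol, standard_auc, weights)


def estimate_amount(endogenous_auc, slope, intercept):
    """Invert the calibration curve to get an amount from an iTRAQ AUC."""
    return (endogenous_auc - intercept) / slope
```

In use, calibrate would be called once per peptide with the spiked-in standard amounts and the corresponding standard iTRAQ transition AUCs, and estimate_amount once per endogenous iTRAQ channel. Estimates that fall outside the range of the standards are extrapolations and should be treated with caution.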
3.2.2 Parallel Reaction Monitoring (PRM):
Immunoprecipitation: 20 µl protein G agarose beads (Calbiochem, IP04-1.5ML) were
pre-incubated with 20µl of PT66 and 20µl of pY100, prior to adding sample
(reconstituted in 70µl HEPES buffer adjusted with 5M NaOH). After overnight
incubation, beads were rinsed and peptides were eluted with 20µL of 0.1% TFA, 10%
MeCN, 250ng/µL HeLa Digest (phospho depleted).
NTA Enrichment: The elution was incubated with Fe(III)-NTA magnetic beads for 15 minutes,
rinsed and eluted with 10µL of NH4OH (1:30), 5mM EDTA, 5% MeCN, 1ng/µl HeLa
digest. Elution was transferred to an autosampler vial and pH was adjusted with 5µl of
10% TFA.
Liquid Chromatography Mass Spectrometry: LC-MS was performed using a 25cm,
75µm ID, EASY-Spray column (ThermoFisher Scientific, ES802) with a 120 minute
gradient and a flow rate of 300µl/min.
Absolute Quantification: Analyses were performed on a Thermo Q-Exactive.
Endogenous and heavy-labeled versions of each peptide were targeted for MS2 analysis
during their expected elution times. In the MS1 scan, the total amount of endogenous
peptide was calculated based on the fraction of MS1 integrated intensity when compared
to the standard peptide. This amount was then apportioned across the contributing
samples based on relative iTRAQ intensity. The iTRAQ intensity ratios were determined
from a single MS2 scan, chosen at the point of maximal MS1 intensity to ensure
maximal signal-to-noise ratio.
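The two-step PRM calculation can be sketched in Python as follows. The function names are illustrative, and the sketch assumes that the standard amount used in the MS1 comparison is the total amount of heavy-labeled standard across all iTRAQ channels, which is our reading of the description above rather than a stated detail.

```python
def total_endogenous_amount(endogenous_ms1, standard_ms1, standard_total_fmol):
    """Scale the known standard amount by the MS1 intensity ratio."""
    return standard_total_fmol * endogenous_ms1 / standard_ms1


def apportion(total_amount, itraq_intensities):
    """Split a total amount across samples by relative iTRAQ intensity."""
    intensity_sum = sum(itraq_intensities)
    return [total_amount * i / intensity_sum for i in itraq_intensities]


# Illustrative numbers only (not measured data).
total = total_endogenous_amount(endogenous_ms1=2.0e6,
                                standard_ms1=4.0e6,
                                standard_total_fmol=100.0)
per_sample = apportion(total, [1, 1, 2, 2, 4, 4, 3, 3])
```

With these illustrative inputs the endogenous total is half of the standard total, and the per-sample amounts sum back to that total by construction.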
3.3 Results
3.3.1 Assessing the requirement for standard peptides for absolute quantification of tyrosine phosphorylation
Theoretically, absolute quantification of protein expression or phosphorylation could be
calculated directly from MS data without internal standards by quantifying the signal
amplitude for the peptide measured in the full scan mass spectrum or from fragment ion
transitions. To assess the need for internal standards, we selected several tyrosine
phosphorylation sites in the EGFR signaling network, generated synthetic phosphorylated
peptides encompassing these sites, and added these peptides at defined concentrations to
cell lysate of MCF10A cells. Quantification of the signal intensity for each peptide across
multiple different analyses demonstrated a significant amount of run-to-run variation
(three representative examples are presented in Figure 3-1), likely associated with small
flow rate fluctuations combined with differences in temporal sampling across the
chromatographic elution profile.
[Figure 3-1: three panels, a) to c), plotting Signal against Amount (fmol, 1.E+00 to 1.E+04) for EGFR pY1173, Src pY418, and Shc pY317.]
Figure 3-1: Variability of Signal Intensity from technical replicates. The signal intensity vs. amount of peptide is plotted for three representative peptides across five technical replicate analyses.
To quantify this data across multiple concentrations of each peptide, we calculated the
response ratio, the ratio between signal amplitude and peptide concentration, for 13
peptides in the EGFR signaling network (Figure 3-2). As the data demonstrate, for a
given concentration, the response ratio can vary by two to three orders of magnitude. This
variation is likely associated with differences in ionization potential of the peptides and
their respective co-eluting species, due to competition for charge in electrospray
ionization. Together, the broad range in peptide-specific response ratio and the technical
replicate variation make it difficult to directly relate signal intensity and abundance,
thereby highlighting the need for an improved method to generate accurate absolute
quantification data.
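The response ratio and its spread across peptides are simple to compute. The Python sketch below is illustrative: the function names are ours and the example signals are hypothetical values, not measurements from this work.

```python
import math


def response_ratio(signal, amount_fmol):
    """Signal amplitude per unit amount of peptide."""
    return signal / amount_fmol


def spread_in_orders_of_magnitude(ratios):
    """log10 of the largest response ratio divided by the smallest."""
    return math.log10(max(ratios) / min(ratios))


# Hypothetical signals for three peptides, each loaded at 100 fmol.
signals = [6.0e6, 6.0e4, 3.0e5]
ratios = [response_ratio(s, 100.0) for s in signals]
spread = spread_in_orders_of_magnitude(ratios)
```

For these hypothetical values the spread is two orders of magnitude, which is why a single signal-to-amount conversion factor cannot be shared between peptides.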
Figure 3-2: Peptide Specific Signal Intensity. The average integrated intensity vs. amount of all heavy-labeled standard peptides. The large range is due to sequence specific effects compiled throughout sample processing, liquid chromatography, and ionization.
3.3.2 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
To quantify the absolute levels of endogenous peptides across a broad dynamic range, we
have developed the MARQUIS (Multiplex Absolute Regressed Quantification of Internal
Standards) method. Absolute quantification in MARQUIS is based on comparison of the
endogenous peptides to paired heavy-isotope labeled internal standards added at specific
concentrations into multiple biological samples (Figure 3-3). Briefly, synthetic peptides
containing a heavy-isotope labeled amino acid were generated for each phosphorylated
peptide of interest and stock concentrations were established by amino-acid analysis
(AAA). A standard peptide cocktail was then added to cell lysates; one concentration per
lysate. These lysates were purified and digested prior to stable isotope labeling (iTRAQ)
and combination. The mix of endogenous and synthetic peptides was subjected to
targeted tandem mass spectrometry for iTRAQ quantification and peptide sequence
confirmation. Fragmentation of the iTRAQ-labeled synthetic, heavy-labeled
phosphopeptide produced an internal standard curve with each calibration point
corresponding to the unique amount of synthetic peptide added to the corresponding
biological sample. Similarly, fragmentation of the iTRAQ-labeled endogenous peptide
also produced iTRAQ marker ions, with the intensity of each ion corresponding to the
amount of endogenous peptide in each biological sample. Calibration of the iTRAQ
marker ion intensities from the endogenous peptide against the standard curve provided
absolute quantification for each biological condition. The internal standard curve
establishes the linear dynamic range around the endogenous peptide concentration,
providing an accurate intrinsic quality assessment of the peptide quantification.
Since synthetic peptides are added early in the processing workflow, losses are identical
for the endogenous and synthetic peptides as both experience identical environments
through all stages of preparation. Moreover, since the endogenous and heavy-labeled
peptides co-elute, the peptides experience identical chromatographic and ionization
contexts, removing an additional potential source of quantification error.
[Figure 3-3: workflow schematic with stages labeled pY IP, LC-MS, MS, and MS/MS.]
Figure 3-3: MARQUIS Method for Absolute Quantification. Synthetic isotope-labeled phosphorylated peptides are added during cell lysis. Following sample processing and digestion, all peptides, synthetic and endogenous, are iTRAQ labeled. For this particular application of the method, tyrosine phosphorylated peptides are enriched by peptide immunoprecipitation followed by immobilized metal affinity chromatography (IMAC) and analyzed by LC-MS/MS. Heavy-labeled standards provide an internal calibration curve, enabling absolute quantification of endogenous peptides.
3.3.3 Performance of the method
To assess the performance of the MARQUIS method, heavy-labeled synthetic
phosphorylated peptides were generated for selected tyrosine phosphorylation sites in the
EGFR signaling network. Synthetic peptides were added to EGF-stimulated MCF10A
cell lysate, thereby providing a complex, biologically-relevant background for testing the
response profile, linear dynamic range, and sensitivity of this method. MARQUIS was
initially evaluated on a triple quadrupole instrument operating in multiple reaction
monitoring (MRM) mode, with 10 precursor-fragment ion transitions per peptide,
including 2 sequence-specific and 8 iTRAQ marker ion transitions (Figure 3-4 and
Appendix Table 3-1). The number of phosphorylation sites that can be monitored
with this approach is inherently limited by the number of transitions.
With 10 transitions per peptide and 2 peptides (synthetic and endogenous) per
phosphorylation site, the total number of transitions increases rapidly as more sites in the
network are monitored. Increasing the number of transitions translates to fewer
measurements per peptide across the chromatographic elution profile, leading to suboptimal quantitative data.
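
The scaling described above can be made concrete with a short calculation. This is a minimal sketch: the dwell time and chromatographic peak width are hypothetical values chosen for illustration, and the model assumes unscheduled acquisition in which every transition is cycled continuously.

```python
def total_transitions(n_sites, transitions_per_peptide=10, peptides_per_site=2):
    """Transitions needed when every site has a synthetic and an endogenous peptide."""
    return n_sites * transitions_per_peptide * peptides_per_site


def points_across_peak(n_sites, dwell_ms, peak_width_s):
    """Approximate number of measurements of each transition across one elution peak.

    Assumes all transitions are cycled continuously (no retention-time scheduling)
    and ignores inter-transition overhead.
    """
    cycle_s = total_transitions(n_sites) * dwell_ms / 1000.0
    return peak_width_s / cycle_s


# dwell_ms=10 and peak_width_s=20 are hypothetical values, not taken from the text.
for n_sites in (5, 13, 30):
    print(n_sites, total_transitions(n_sites),
          round(points_across_peak(n_sites, 10, 20), 1))
```

Under these assumed settings, the 13 sites used in this work require 260 transitions, and the number of measurements per peak falls in inverse proportion to the number of sites.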

Figure 3-4: MRM Quantification of endogenous tyrosine phosphorylated peptides. Raw iTRAQ intensities for endogenous peptides (columns E-L) and corresponding heavy-labeled standard peptides (columns M-T) across technical replicate analysis of EGF stimulated MCF10A cells. Calculated amounts of endogenous peptides based on regression to the standard curve are contained in columns U-AB.

To address this issue, we also evaluated MARQUIS on an Orbitrap-based instrument
(Q Exactive) using targeted tandem mass spectrometry, also known as parallel reaction
monitoring (PRM). In this approach, a full scan, high-resolution MS spectrum is
acquired followed by full scan, high-resolution MS/MS spectra containing both iTRAQ
marker ions and sequence-specific peptide fragment ions for each peptide (Figure 3-5).
Since the endogenous and heavy-labeled peptides are both detected in the MS scan,
quantitative comparison of the precursor ion intensities provides an estimate of the total
amount of endogenous peptide present in the sample (Appendix Table 3-2). The relative
iTRAQ marker ion intensities from the MS/MS spectrum for the endogenous peptide then
apportion this total amount across the biological samples. iTRAQ marker ion intensities
from the MS/MS of the standard peptide serve as an important control by providing a
peptide-specific estimate of the accuracy and linear dynamic range. For both the MRM
and PRM methods, we selected 13 phosphorylation sites to assess the performance of
MARQUIS quantification at different peptide concentrations (Appendix Table 3-3).
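
The two-step PRM calculation described above (a total amount from the precursor ratio, then apportioning by iTRAQ marker ions) might be sketched as follows. This is an illustration under stated assumptions rather than the processing code used in this work: it assumes the heavy precursor signal reflects the pooled standard from all eight channels, that light and heavy precursors respond equally, and all function names and intensities are hypothetical.

```python
# Total standard pooled across the eight iTRAQ channels (fmol): 3000 + 1000 + ... + 1.
STANDARD_TOTAL_FMOL = sum([3000, 1000, 300, 100, 30, 10, 3, 1])  # 4444


def endogenous_total_fmol(endo_precursor, std_precursor,
                          std_total_fmol=STANDARD_TOTAL_FMOL):
    """Estimate total endogenous peptide from the MS1 precursor intensity ratio."""
    return std_total_fmol * endo_precursor / std_precursor


def apportion_by_reporters(total_fmol, reporter_intensity):
    """Split the total across samples in proportion to iTRAQ marker ion intensity."""
    signal = sum(reporter_intensity.values())
    return {ch: total_fmol * v / signal for ch, v in reporter_intensity.items()}


# Hypothetical intensities for one endogenous peptide.
total = endogenous_total_fmol(endo_precursor=2.0e6, std_precursor=8.0e6)
per_sample = apportion_by_reporters(
    total,
    {113: 1.0e4, 114: 6.0e4, 115: 9.0e4, 116: 8.0e4,
     117: 7.0e4, 118: 5.0e4, 119: 3.0e4, 121: 1.0e4},
)
```

The marker ions of the standard peptide are not used in this calculation; as noted in the text, they serve as a control on accuracy and linear dynamic range.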

Figure 3-5: Details of PRM Data Collection Method. a) Precursor masses for the endogenous (red) and standard (blue) Shc pY317 peptide can be individually isolated for subsequent fragmentation. b) The full scan MS2 result of individually isolated and fragmented peptides; endogenous (red), standard (blue). The iTRAQ reporter ion region has been magnified to illustrate the relative quantification data available from the full MS2 scan.

Figure 3-6a-b show representative data extracted for two standard peptides using MRM
and PRM. At the high end of the curve, both methods perform identically. For EGFR
pY1173 (Figure 3-6a), the linear range for both MRM and PRM extends from 3000 fmol
to 100 fmol; between 100 fmol and 3 fmol the response remains linear but deviates from
the theoretical response due to additional noise in the experimental measurement. For
Shc pY317 (Figure 3-6b), the PRM method performs well, with a dynamic range of
approximately 1000-fold (3000 fmol to 3 fmol). The MRM method for this site did not
perform as well, with suspected contamination on the iTRAQ-118 (10 fmol) channel
boosting the recorded intensity significantly. This contamination does not appear to affect
the iTRAQ-119 (3 fmol) or iTRAQ-121 (1 fmol) channels as both show measurably
lower values than iTRAQ-117 (30 fmol). Contamination may appear in the MRM
method and not the PRM method due to the ability of the high resolution PRM method to
distinguish contaminant ions from iTRAQ marker ions, whereas the MRM measurement
detects the number of ions passing a set bandwidth in the third quadrupole. It is worth
noting that altering the dwell time per transition for MRM or the ion target value and
maximum fill time for the full scan PRM method can affect the sensitivity of the analysis
and thereby improve the linear dynamic range, but increasing these values comes at the
expense of cycle time.
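
The expected response against which the MRM and PRM data are compared can be computed directly from the nominal standard amounts. The sketch below is a minimal example; the helper names are hypothetical, and expressing the deviation as a log10 ratio is an assumption made for illustration.

```python
import math

# Nominal standard amounts per iTRAQ channel (fmol), highest to lowest.
STANDARD_FMOL = [3000, 1000, 300, 100, 30, 10, 3, 1]


def percent_of_total(values):
    """Express each channel as a percentage of the summed signal."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def log10_deviation(observed, expected):
    """Log10 ratio of observed to expected percentage; 0 means perfect agreement."""
    return [math.log10(o / e) for o, e in zip(observed, expected)]


expected = percent_of_total(STANDARD_FMOL)
# The 3000 fmol channel is expected to carry about two thirds of the total signal,
# while the 1 fmol channel is expected to carry only a few hundredths of a percent.
```

Applying `percent_of_total` to measured marker ion intensities and passing the result to `log10_deviation` shows where the observed response departs from the theoretical one, for example in a channel inflated by contamination.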
To evaluate the reproducibility of quantification using the MARQUIS method, we
performed technical replicates of each experiment on the MRM and PRM platforms and
plotted the concentration vs. coefficient of variation (CV) for each measurement in the
data set (Figure 3-6c). Although both methods provide high quality data (i.e., most
measurements have CV <0.15) across the entire range, the most reproducible
measurements are obtained above 100 fmol, with CV values <0.1. At the low end of the
concentration range, some peptides display CV >0.5. For MRM, the early time points of
ERK1 and ERK2 signals exhibited the highest CV, likely due to the higher standard
doses included in the corresponding channels. The highest CVs for the PRM technique
were those of the dual phosphorylated EGFR peptide pY1045/pY1068. This peptide has a
lower response ratio (Figure 3-2) and elutes very late in the chromatographic program,
making it more susceptible to contamination from background chemical noise. Overall,
the two techniques were comparable in the fraction of measurements meeting a CV
threshold of 0.20 (Figure 3-6d).
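
The reproducibility metrics used here can be sketched in a few lines. This is an illustrative example only: the use of the sample standard deviation is an assumption (the text does not state which estimator was used), and the replicate values are hypothetical.

```python
import statistics


def coefficient_of_variation(replicates):
    """CV = sample standard deviation divided by the mean of the replicates."""
    return statistics.stdev(replicates) / statistics.mean(replicates)


def fraction_below(cvs, threshold):
    """Fraction of measurements whose CV is at or below the threshold."""
    return sum(cv <= threshold for cv in cvs) / len(cvs)


# Hypothetical technical replicates (fmol) for three measurements.
measurements = [
    [950.0, 1000.0, 1050.0],   # abundant peptide, tight replicates
    [9.0, 10.0, 11.0],         # mid-range peptide
    [0.5, 1.0, 1.5],           # low-abundance peptide, noisy replicates
]
cvs = [coefficient_of_variation(m) for m in measurements]
curve = {t: fraction_below(cvs, t) for t in (0.05, 0.10, 0.20, 0.50)}
```

Evaluating `fraction_below` over a range of thresholds gives the kind of cumulative curve used to compare the two platforms.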

Figure 3-6: Quality Comparisons. Observed vs. Expected total iTRAQ signal contribution for MRM and full-scan methods for two sites: EGFR pY1173 (a) and Shc pY317 (b). c) Coefficient of Variation vs. Amount for MRM and Full Scan QE methods. d) Fraction of measurements that meet Coefficient of Variation (CV) thresholds for both techniques. (Axes: percentage of total signal vs. amount in fmol for a and b; CV vs. amount in fmol for c; fraction below CV threshold vs. CV for d.)

3.3.4 Isolation Width Effects

In the MARQUIS method, both the endogenous and the heavy-labeled synthetic peptides
are iTRAQ labeled, allowing for direct comparison of iTRAQ marker ion intensities
resulting from fragmentation of the corresponding peptides. This approach has the
potential for error associated with contamination of the isolation window from other
endogenous or synthetic peptides, or potentially from co-isolation of the endogenous and
heavy-labeled peptides. As the charge state of the peptide increases, the separation, in
m/z, between the endogenous and heavy-labeled synthetic peptide decreases, thereby
increasing the likelihood of co-isolation. To assess the effect of the isolation width on
MARQUIS quantification accuracy for the MRM platform, we narrowed the isolation
width on the first quadrupole from 0.7 m/z to 0.1 m/z. Narrowing the MS1 isolation
window can decrease the level of contamination in the ions selected for fragmentation,
but also significantly decreases the transmission efficiency. To determine if this trade-off
was beneficial, we compared the measured signal intensities from all peptides from
analyses run with both isolation widths (Figure 3-7). Minimal effects of altered isolation
width were seen for the highest peptide levels, whereas only the 0.1 m/z isolation window
was able to clearly distinguish the median of the 1 fmol standard from the median of the
10 fmol standard. At the lowest peptide levels, the narrower isolation width was better
able to isolate precursor ions, ultimately providing a larger dynamic range for calibration.
However, for peptides with poor response ratios, the losses associated with the narrower
isolation width ultimately limit the sensitivity and therefore the dynamic range.
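
The dependence of precursor separation on charge state noted at the start of this section follows from dividing the label mass shift by the charge. The sketch below is a simplified illustration: the 6 Da mass shift is a hypothetical placeholder (the label mass is not stated in this section), the isolation window is treated as an ideal rectangle centered on the target precursor, and isotopic envelopes are ignored.

```python
def mz_separation(mass_shift_da, charge):
    """Spacing in m/z between light and heavy precursors at a given charge state."""
    return mass_shift_da / charge


def co_isolated(mass_shift_da, charge, isolation_width_mz):
    """True if the partner precursor falls inside a window centered on the target.

    Idealized model: rectangular window, monoisotopic peaks only.
    """
    return mz_separation(mass_shift_da, charge) < isolation_width_mz / 2.0


# Hypothetical 6 Da label: the spacing shrinks as the charge state increases.
spacing = {z: mz_separation(6.0, z) for z in (2, 3, 4)}
```

Under this idealized model a multi-dalton label stays well outside both the 0.7 m/z and 0.1 m/z windows, so any benefit of the narrower window would be expected to come mainly from excluding other near-isobaric background ions.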

Figure 3-7: Q1 Isolation Width Comparison. 0.7 m/z (red) and 0.1 m/z (blue). (Y axis: fraction of signal intensity, shown for the 1, 10, 100, and 1000 fmol standards.)

3.3.5 Absolute Quantification of Tyrosine Phosphorylation in the EGFR signaling network

Previous studies have described the temporal dynamics of individual tyrosine
phosphorylation sites following EGFR activation8. Although these studies have
highlighted many novel phosphorylation sites in the network and have led to hypotheses
regarding the function of selected sites, they have been inherently limited by
quantification relative to a selected time point. As such, it has been difficult to quantify
differences in phosphorylation levels across different sites on the receptor, or to
determine whether different ligands led to different phosphorylation stoichiometries. To
address this deficiency and gain insight into the temporal phosphorylation profiles of
multiple sites in the network, we used MARQUIS to quantify absolute levels of tyrosine
phosphorylation on EGFR and proximal adaptor proteins, at multiple time points
following EGF stimulation. Although EGFR pY1173 is often cited as a dominant
autophosphorylation site for the receptor9, and is often used as a measure of EGFR
activation for immunoblotting studies10,11, MARQUIS data demonstrate
that phosphorylation of Y1148 is approximately 4-5 fold greater than that of Y1173,
Y1068, or Y1045, all of which occur at similar levels (Figure 3-8a), a comparison that
would be missed by typical MS-based relative quantification6 or immunoblotting
techniques. Further downstream in the network, it is also possible to detect higher levels
of phosphorylation on ERK2 pY187 as compared to ERK1 pY204 (Figure 3-8b), an
observation that aligns with higher ERK2 expression in other systems12.

Figure 3-8: Stoichiometric comparison of site-specific phosphorylation. Data for a collection of EGFR and network phosphorylation sites were gathered using MARQUIS with Multiple Reaction Monitoring (MRM) on a triple quadrupole instrument. a) EGFR. b) ERK1/2. c) EGFR pY1148, pS1142/pY1148. d) EGFR pY1045, pY1068, and pY1045/pY1068. (Axes: amount in fmol vs. time in minutes for all panels.)

Absolute quantification enables the tracking of individual phosphorylation sites, even
when multiple sites are contained within the same tryptic peptide. For instance, the
peptide containing Y1148 also contains S1142, a serine residue that has previously been
found to be phosphorylated in these cells13. To identify whether pS1142 co-occurs with
pY1148 on the same EGFR protein, we used MARQUIS for absolute quantification of
the singly phosphorylated pY1148 peptide and the doubly phosphorylated
pS1142/pY1148 peptide. By quantifying the singly and doubly phosphorylated isoforms
of the peptide, we could more accurately account for the total amount of phosphorylation
on Y1148 (Figure 3-8c). This analysis led to the intriguing finding that pS1142/pY1148 is
present in unstimulated cells at approximately 30 fmol and, rather than increasing upon
stimulation, gradually decreases over time. The singly
phosphorylated pY1148 is also present in unstimulated cells, but increases strongly
following EGF stimulation. With absolute quantification, it is clear that the increase in
phosphorylation of the singly phosphorylated pY1148 is much greater than the decrease
in the doubly phosphorylated pS1142/pY1148 peptide. Therefore, the increase in singly
phosphorylated pY1148 is likely due to phosphorylation of EGFR in the absence of
pS1142, rather than de-phosphorylation of the pS1142/pY1148 species. It remains to be
determined whether pS1142 inhibits phosphorylation of Y1148, but our data would be
consistent with this hypothesis.
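
The mass-balance reasoning in this paragraph can be written out explicitly. The sketch below uses hypothetical amounts chosen only to mirror the qualitative behavior described (about 30 fmol of the doubly phosphorylated peptide at baseline, with a modest decline); the function names are illustrative and not part of the original analysis.

```python
def total_site(single_fmol, double_fmol):
    """Total phosphorylation on Y1148: pY1148 alone plus pS1142/pY1148."""
    return single_fmol + double_fmol


def unexplained_increase(single_t0, single_t, double_t0, double_t):
    """Increase in singly phosphorylated peptide that cannot be supplied by
    serine dephosphorylation of the doubly phosphorylated pool."""
    increase_single = single_t - single_t0
    max_from_double = max(double_t0 - double_t, 0.0)
    return increase_single - max_from_double


# Hypothetical amounts (fmol) before and after stimulation.
pY_0, pY_t = 50.0, 500.0
pSpY_0, pSpY_t = 30.0, 20.0

new_phosphorylation = unexplained_increase(pY_0, pY_t, pSpY_0, pSpY_t)  # 440.0
total_after = total_site(pY_t, pSpY_t)  # 520.0
```

Because the decline in the doubly phosphorylated form can account for at most a small part of the rise in the singly phosphorylated form, most of the increase must reflect new tyrosine phosphorylation of receptor lacking pS1142. Relative quantification alone cannot support this comparison, since the two peptides would each be scaled to their own reference.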
Additional insights into phosphorylation dynamics are available when multiple tyrosine
phosphorylation events occur on a given tryptic peptide. For instance, Y1045 and Y1068
fall within the same tryptic fragment, and therefore yield four possible phosphorylation
isoforms for the peptide (Y1045/Y1068, pY1045/Y1068, Y1045/pY1068, and
pY1045/pY1068), three of which were monitored by MARQUIS-based absolute
quantification (Figure 3-8d). Although all three isoforms are present at similar, low levels
in unstimulated MCF10A cells, following stimulation Y1045/pY1068 and doubly
phosphorylated pY1045/pY1068 both rise rapidly in the first minute. The
pY1045/Y1068 peptide shows a much more gradual increase in the first minute,
potentially indicating that this isoform is being converted to the pY1045/pY1068 form
through rapid phosphorylation of Y1068. Interestingly, over the next several time points
the singly phosphorylated Y1045/pY1068 isoform begins to decrease, while the
pY1045/Y1068 and pY1045/pY1068 isoforms both increase strongly, indicating
phosphorylation of Y1045 that is converting Y1045/pY1068 into the doubly
phosphorylated form, which is the most abundant of the three isoforms in this early time
period. Finally, by five minutes all three of the phospho-isoforms begin to decrease at
similar rates, potentially indicating the activity of a common phosphatase. The
quantitative interplay between these isoforms has not been previously documented due to
a lack of absolute quantification of phosphorylation level on each of these sites under
different stimulation conditions.
MARQUIS-based absolute quantification was also used to quantify EGFR
phosphorylation dynamics in response to different EGFR ligands. It has previously been
documented that stimulation with Epidermal Growth Factor (EGF), Transforming
Growth Factor α (TGFα), or Amphiregulin (AREG) leads to different downstream
biological effects14-16, potentially due to qualitatively or quantitatively different signaling
networks. To uncover the mechanism by which these different ligands activate different
phenotypes, we quantified the temporal dynamics of phosphorylation levels on multiple
sites on the receptor in response to different concentrations of each ligand (Appendix
Table 3-4). As can be seen in Figure 3-9a-c, each ligand stimulates a rapid increase in
receptor phosphorylation, albeit to different levels, with the Y1148 site consistently
phosphorylated to a greater extent compared to the other three phosphorylation sites on
the C-terminal tail of the receptor. Given that each ligand has different binding affinities,
the difference in the level of phosphorylation on the receptor may not be surprising.
Intriguingly, despite the quantitative difference in the level of phosphorylation due to
each ligand (Figure 3-9d), the pattern of phosphorylation on the receptor is effectively
invariant with regard to ligand identity, with pY1148>pY1068>pY1173>pY1045; this
similarity in receptor pattern can best be visualized by normalizing all sites for a given
ligand to the total amount of phosphorylation on the receptor (Figure 3-9e). The
consistent pattern of receptor phosphorylation raises the possibility that
biological response to different EGFR ligands may be encoded in the absolute level of
phosphorylation on each site, rather than through the phosphorylation of different sites on
the receptor (i.e., quantitative vs. qualitative control of RTK phosphorylation signaling
networks). It remains to be determined how quantitative control might affect downstream
signaling networks and therefore regulate cell biological response to ligand stimulation.
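
The conversion from measured femtomoles to copies per cell, and the normalization used to compare receptor phosphorylation patterns across ligands, can be sketched as follows. The cell number and the site amounts below are hypothetical, and normalizing to the summed signal across the four sites is one reading of "total amount of phosphorylation on the receptor"; the helper names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(fmol, n_cells):
    """Convert an absolute amount in fmol to molecules per cell."""
    return fmol * 1e-15 * AVOGADRO / n_cells


def normalize_to_total(site_levels):
    """Express each site as a fraction of the summed phosphorylation."""
    total = sum(site_levels.values())
    return {site: level / total for site, level in site_levels.items()}


def rank_order(site_levels):
    """Sites sorted from most to least phosphorylated."""
    return sorted(site_levels, key=site_levels.get, reverse=True)


# Hypothetical amounts (fmol) for two ligands that differ only in magnitude.
ligand_a = {"pY1045": 40.0, "pY1068": 90.0, "pY1148": 400.0, "pY1173": 70.0}
ligand_b = {site: 0.5 * level for site, level in ligand_a.items()}

pattern_a = normalize_to_total(ligand_a)
pattern_b = normalize_to_total(ligand_b)
copies = copies_per_cell(ligand_a["pY1148"], n_cells=1.0e7)  # n_cells is assumed
```

Two ligands that produce different absolute levels but the same normalized pattern, as in this toy example, correspond to the quantitative rather than qualitative control described above.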

Figure 3-9: Ligand Comparison. Absolute quantification for EGFR phosphorylation sites treated with EGF (a), TGFα (b), and AREG (c). Comparison of site-specific phosphorylation between ligands following five minutes of ligand stimulation for two biological replicates; raw (d) and normalized (e). (Axes: copies/cell (x10^3) vs. time in minutes for a-c; copies/cell (x10^3) per site for d; median normalized copies/cell per site for e. Ligands at 20 nM.)

3.4 Conclusion

MARQUIS is a novel multi-point internal standard calibration method that enables
absolute quantification of site-specific PTM levels or protein expression. The method
provides accurate quantification over a broad range of concentrations allowing
simultaneous analysis of stimulation conditions that lead to large changes in expression
or modification levels. With MARQUIS, absolute quantification data can be collected for
many PTM sites of interest in the network in a single analysis, using either MRM or
PRM-based targeted mass spectrometry. These data can then be used for intra- and inter-sample site-specific comparisons.

MARQUIS was designed for MRM-based analysis on a triple quadrupole, but the method
is equally applicable to full-scan targeted tandem mass spectrometry (PRM).
Overall, the two methods performed very well in terms of technical reproducibility with a
CV threshold of 0.15 able to capture the majority of measurements with both techniques.
Both instrument platforms offer selective advantages. For instance, the full-scan method
offers the ability to determine contamination by way of the complete MS/MS data at high
resolution that is not available in an MRM experiment. On the other hand, when using
MRM, the precursor (Q1) isolation window can be tightened to a greater extent compared
to the PRM method, thereby improving signal-to-noise by reducing contributions from
spurious co-eluting ions. Although a Q1 isolation width of 0.7 m/z was typically used for
the data acquired in this study, sufficient signal for quantification can be attained for
many peptides by using a window as small as 0.1 m/z. Such a narrow window would be
deleterious to the PRM method as the drop in ion flux may eliminate signal for most low
and medium intensity fragments. The MRM approach is inherently more flexible, as the
fragmentation energy and time spent per MRM transition can be altered for selected
transitions to improve the sensitivity. However, increasing the dwell time is only feasible
for a small number of transitions; otherwise the increase in cycle time becomes too great.
Recent structural insights into the cooperativity of ligand binding and EGFR dimerization
suggest that differential ligand affinities for the extracellular portion of the receptor
influence the occupancy and the receptor conformation and potentially signaling and
trafficking17. A previous mass spectrometry-based study of receptor phosphorylation
observed site-specific quantitative differences in receptor phosphorylation in response to
varied doses of TGFα and EGF; stoichiometry was estimated by measuring the ratio of
signal intensities between phosphorylated and non-phosphorylated versions of tyrosine
containing tryptic fragments18. Such comparisons are limited due to the differential
ionization potential between these species. Here we have shown large differences not
only between various ligand stimulation conditions for a single site, but also between
different EGFR sites not present on the same tryptic peptide.
The MARQUIS method enables quantification of EGFR phosphorylation stoichiometry,
information necessary to decipher how extracellular ligand concentration information is
transmitted across the cellular membrane. Unique sets of SH2- and PTB-containing
proteins bind to each receptor phosphorylation site19,20, suggesting the ability to activate
distinct pathways by phosphorylating different sites. Alternatively, differential pathway
activation could be controlled by altering the level of phosphorylation. While this option
might suggest that downstream pathways will be linearly proportional to receptor
phosphorylation levels, different pathways feature different amplification gains, and thus one
pathway might be saturated by a relatively small signal on the receptor, as has been
detected for the ERK MAPK pathway, while other pathways may be more analog in
nature. Here, through absolute quantification of specific phosphorylation sites on the
receptor, we determined that the EGFR phosphorylation response to stimulation with
different EGFR ligands is quantitatively regulated. Although phosphorylation levels are
ligand specific, the ratio between different receptor sites is constant across different
EGFR ligands. Thus, different biological responses may be regulated by the analog vs.
digital nature of the different downstream pathways, in a manner analogous to the
interleukin receptor system we have recently investigated21. Additional studies are
needed to confirm the effects of quantitative regulation of receptor phosphorylation on
downstream pathways, as well as to determine the cellular conditions that might lead to
altered phosphorylation patterns on the receptor: is this a mechanism by which the cell
differentiates normal vs. pathological signaling?
Absolute quantification of protein expression and PTM levels will be required to provide
insight into regulation of many biological systems. The MARQUIS method enables
facile acquisition of absolute quantification data across multiple biological conditions
simultaneously, with large dynamic range, and with internal standard curves to improve
the accuracy of the measurements. This method is broadly applicable to multiple mass
spectrometry platforms and to different biological systems.

3.5 Acknowledgments

We thank Andreas FR Huhmer for his support of the project by providing access to the
Thermo Scientific Q Exactive instrument and for the critical review of the manuscript.
The authors would like to thank members of the White lab for helpful discussion. This
work was supported in part by a grant from the US-Israel Binational Science
Foundation and by NIH grants U54 CA112967, R01 CA118705, and R01 CA096504.

3.6 Appendix Tables

Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
Shc
Shc
Shc
Shc
Shc
Shp2
Shp2
Shp2
Shp2
Shp2
Src
Src
Src
Src
Src
2.01E+04
5.65E+04
4.29E+04
7.40E+04
4.14E+04
1.14E+04
6.95E+03
1.97E+04
1.41E+04
1.25E+07
8.23E+06
3.63E+06
4.99E+06
3.01E+06
9.37E+05
1.70E+06
5.34E+06
3.42E+06
6.46E+05
1.98E+06
4.91E+05
3.10E+06
2.77E+06
1.63E+07
1.12E+07
4.63E+06
3.22E+06
1.43E+06
1.80E+06
1.15E+06
3.83E+05
6.45E+05
1.96E+06
1.22E+06
2.25E+05
7.95E+05
5.05E+05
3.28E+06
2.36E+06
1.35E+05
2.02E+05
5.06E+05
1.59E+05
6.06E+05
1.34E+05
5.50E+05
6.84E+05
3.86E+06
2.42E+06
9.92E+05
7.31E+05
3.31E+05
5.15E+05
3.34E+05
1.15E+05
1.71E+05
5.52E+05
3.56E+05
3.53E+05
2.25E+05
2.35E+05
9.40E+04
3.33E+05
2.54E+05
1.42E+06
1.22E+06
6.31E+04
9.66E+04
2.24E+05
8.27E+04
2.98E+05
6.69E+04
2.32E+05
2.39E+05
1.31E+06
8.78E+05
4.85E+05
3.16E+05
1.41E+05
2.79E+05
1.56E+05
5.72E+04
9.49E+04
2.89E+05
1.86E+05
2.94E+06
2.27E+06
7.01E+05
2.93E+05
3.52E+05
1.69E+05
6.01E+05
5.41E+05
2.49E+06
2.06E+06
2.88E+04
4.35E+04
1.09E+05
2.69E+04
1.21E+05
2.47E+04
3.64E+05
3.15E+05
1.81E+06
1.21E+06
2.45E+05
1.66E+05
7.00E+04
1.37E+05
7.95E+04
2.14E+04
3.91E+04
1.14E+05
6.67E+04
2.48E+06
1.87E+06
6.00E+05
1.69E+05
1.90E+05
1.31E+05
5.12E+05
3.77E+05
1.98E+06
1.57E+06
5.27E+04
8.65E+04
2.01E+05
4.53E+04
1.72E+05
3.96E+04
1.16E+05
9.06E+04
5.74E+05
3.68E+05
4.28E+05
2.97E+05
1.40E+05
1.19E+05
7.90E+04
2.41E+04
4.67E+04
1.10E+05
6.61E+04
9.36E+05
6.07E+05
2.31E+05
1.21E+05
1.28E+05
4.70E+04
1.55E+05
1.41E+05
6.89E+05
5.11E+05
1.57E+04
2.73E+04
8.01E+04
1.84E+04
6.28E+04
1.45E+04
3.97E+04
4.17E+04
2.20E+05
1.42E+05
1.35E+05
8.70E+04
3.51E+04
1.19E+05
6.92E+04
2.16E+04
4.44E+04
1.40E+05
7.35E+04
4.22E+06
3.21E+06
9.68E+05
3.31E+05
3.45E+05
2.35E+05
8.17E+05
6.11E+05
3.45E+06
2.47E+06
2.92E+04
5.23E+04
1.23E+05
1.45E+04
7.08E+04
1.56E+04
1.36E+05
1.31E+05
7.68E+05
4.59E+05
2.67E+05
1.64E+05
6.31E+04
9.01E+04
4.43E+04
1.92E+04
2.39E+04
7.90E+04
4.48E+04
2.39E+01
5.10E+01
2.80E+01
1.11E+01
4.70E+01
6.00E+01
3.64E+01
4.35E+01
4.49E+01
2.00E+01
2.58E+01
2.82E+01
2.32E+01
2.58E+01
2.16E+01
1.05E+02
1.01E+02
1.09E+02
8.71E+00
7.61E+00
6.80E+01
6.26E+01
6.61E+01
4.98E+01
4.79E+01
9.73E+01
9.57E+01
3.39E+01
3.18E+01
2.45E+01
3.73E+01
1.34E+01
1.38E+01
1.32E+01
1.28E+01
4.58E+01
4.18E+01
6.40E+01
2.23E+01
3.74E+01
5.38E+01
1.82E+01
1.25E+01
1.57E+01
3.14E+01
2.57E+01
5.29E+01
1.14E+01
1.00E+01
9.20E+02
9.48E+02
9.89E+02
1.07E+03
9.24E+02
2.33E+01
2.41E+01
2.37E+01
2.47E+01
2.16E+01
3.51E+01
9.19E+01
8.39E+01
8.76E+01
4.38E+00
5.91E+00
5.53E+01
4.21E+01
4.96E+01
4.15E+01
4.03E+01
4.95E+01
4.39E+01
2.14E+01
1.46E+02
1.49E+02
1.59E+02
8.25E+01
7.53E+01
8.01E+01
7.89E+01
3.26E+02
3.05E+02
2.76E+02
1.35E+02
1.31E+02
1.37E+02
6.02E+01
5.41E+01
4.68E+01
2.91E+01
2.90E+01
2.23E+01
1.99E+01
1.24E+01
7.75E+02
8.58E+02
8.91E+02
9.33E+02
7.66E+02
1.81E+01
2.20E+01
1.80E+01
1.87E+01
2.29E+01
3.15E+01
2.18E+02
2.04E+02
2.00E+02
7.33E+00
4.22E+00
9.93E+01
8.84E+01
7.22E+01
8.36E+01
8.82E+01
2.23E+01
2.39E+01
1.32E+01
1.22E+02
1.90E+02
1.55E+02
7.49E+01
6.49E+01
7.72E+01
7.29E+01
3.45E+02
3.23E+02
2.61E+02
1.10E+02
1.10E+02
1.31E+02
6.22E+01
6.41E+01
6.81E+01
3.95E+01
5.23E+01
4.02E+01
2.49E+01
2.16E+01
1.40E+03
1.49E+03
1.80E+03
1.53E+03
1.09E+03
2.54E+01
2.44E+01
2.72E+01
2.83E+01
2.17E+01
3.70E+01
5.89E+02
5.24E+02
5.10E+02
1.25E+01
7.83E+00
2.14E+02
2.08E+02
2.52E+02
2.39E+02
2.18E+02
4.74E+01
2.93E+01
3.38E+01
2.16E+02
2.85E+02
2.70E+02
1.06E+02
9.86E+01
9.90E+01
9.08E+01
5.05E+02
4.28E+02
4.38E+02
1.52E+02
1.53E+02
1.71E+02
1.66E+02
1.32E+02
1.05E+02
3.94E+01
3.44E+01
8.25E+01
9.73E+01
7.73E+01
9.30E+02
1.17E+03
1.13E+03
1.19E+03
8.26E+02
1.67E+01
1.81E+01
1.31E+01
1.87E+01
1.77E+01
3.33E+01
1.11E+03
1.04E+03
1.05E+03
1.78E+01
1.54E+01
3.22E+02
4.13E+02
3.68E+02
4.50E+02
4.25E+02
1.66E+01
2.14E+01
1.67E+01
1.48E+02
1.82E+02
1.75E+02
1.08E+02
9.64E+01
1.06E+02
1.01E+02
4.30E+02
4.22E+02
3.94E+02
9.30E+01
1.09E+02
1.29E+02
9.61E+01
9.64E+01
9.33E+01
3.48E+01
2.91E+01
1.68E+02
1.73E+02
1.45E+02
1.21E+03
1.23E+03
1.23E+03
1.30E+03
1.01E+03
2.14E+01
2.10E+01
1.95E+01
2.40E+01
2.27E+01
3.73E+01
1.67E+03
1.58E+03
1.61E+03
3.07E+01
2.32E+01
6.01E+02
6.15E+02
5.55E+02
6.82E+02
6.20E+02
3.10E+01
2.73E+01
2.14E+01
9.63E+01
1.43E+02
1.44E+02
1.10E+02
9.74E+01
1.12E+02
1.06E+02
4.46E+02
4.12E+02
4.11E+02
1.58E+02
1.60E+02
1.52E+02
6.66E+01
6.06E+01
5.89E+01
1.69E+01
1.77E+01
1.05E+02
1.29E+02
1.23E+02
9.63E+02
8.78E+02
8.74E+02
9.52E+02
7.73E+02
3.07E+01
2.50E+01
2.70E+01
3.05E+01
3.15E+01
1.95E+01
1.44E+03
1.36E+03
1.41E+03
2.20E+01
2.25E+01
5.54E+02
5.20E+02
5.41E+02
5.73E+02
5.24E+02
6.96E+00
1.62E+01
1.10E+01
7.82E+01
1.04E+02
7.16E+01
7.09E+01
7.07E+01
6.53E+01
6.27E+01
3.78E+02
3.84E+02
3.47E+02
1.62E+02
1.04E+02
1.33E+02
5.72E+01
5.01E+01
5.66E+01
1.86E+01
1.78E+01
1.41E+02
1.17E+02
1.06E+02
4.05E+02
3.75E+02
4.59E+02
4.76E+02
4.15E+02
2.24E+01
1.74E+01
2.27E+01
2.55E+01
2.16E+01
1.72E+01
1.10E+03
1.01E+03
1.00E+03
1.74E+01
1.51E+01
4.16E+02
4.02E+02
5.57E+02
4.73E+02
4.18E+02
4.87E+00
4.19E+00
3.98E+00
2.09E+01
2.58E+01
1.76E+01
1.82E+01
1.97E+01
1.83E+01
1.81E+01
1.72E+02
1.42E+02
1.49E+02
7.16E+01
6.46E+01
5.48E+01
1.64E+01
1.40E+01
1.53E+01
30 min (121)
2.09E+04
4.65E+04
9.10E+06
9.47E+06
5.70E+07
3.57E+07
5.32E+05
9.11E+05
2.27E+06
4.74E+05
5.77E+05
1.49E+06
1.17E+06
4.90E+04
3.58E+05
3.59E+05
6.11E+04
1.10E+05
2.50E+05
4.02E+05
2.06E+05
5.73E+04
7.56E+04
1.07E+05
2.61E+05
1.14E+05
3.76E+04
2.32E+01
10 min (119)
2.58E+04
6.85E+04
4.40E+04
1.34E+05
5.49E+04
1.75E+05
9.72E+04
1.76E+06
5.24E+06
1.33E+06
8.27E+05
2.62E+06
2.03E+06
1.06E+07
8.41E+06
9.04E+05
4.06E+04
5 min (118)
2.39E+04
8.18E+04
4.57E+04
4.54E+04
2.69E+05
1.71E+05
1.50E+06
2.29E+06
6.13E+06
1.44E+06
1.79E+06
3.72E+06
2.80E+06
2.80E+04
1.41E+05
1.27E+05
4.38E+04
9.07E+04
2.01E+05
3.40E+05
2.03E+05
1.07E+06
9.97E+05
2.60E+06
4.12E+06
1.81E+06
6.46E+03
7.80E+04
3 min (117)
4.84E+04
1.46E+05
9.42E+04
1.31E+05
8.36E+04
3.06E+05
2.16E+05
6.13E+03
2.42E+04
4.25E+03
2.31E+06
7.49E+06
5.70E+06
3.25E+07
2.46E+07
3.07E+06
9.54E+04
2 min (116)
2.80E+04
9.69E+04
1.35E+05
1.35E+05
7.67E+05
4.91E+05
8.19E+02
9.70E+02
2.60E+03
4.79E+06
5.48E+06
1.23E+07
9.95E+06
3.96E+04
2.12E+05
2.18E+05
2.84E+05
4.56E+05
1.01E+06
1.67E+06
9.45E+05
3.57E+05
3.12E+05
8.36E+05
1.43E+06
6.21E+05
3.28E+04
1.73E+05
1 min (115)
3.81E+04
1.03E+05
5.74E+04
1.03E+05
7.67E+04
3.61E+05
2.29E+05
1.90E+04
7.77E+04
1.43E+04
3.16E+05
1.05E+06
9.90E+05
5.07E+06
3.47E+06
8.61E+06
7.46E+04
30 sec (114)
1.96E+04
1.28E+05
2.09E+05
2.06E+05
1.31E+06
8.20E+05
8.85E+02
3.12E+03
5.76E+03
2.85E+04
2.90E+04
3.74E+07
2.90E+07
6.77E+04
4.04E+05
3.88E+05
1.71E+05
2.56E+05
5.45E+05
9.60E+05
5.36E+05
2.06E+05
1.78E+05
4.70E+05
8.13E+05
3.40E+05
1.01E+05
1.78E+05
Amount
0 min (113)
2.15E+04
6.32E+04
4.33E+04
8.68E+04
5.85E+04
4.69E+05
3.17E+05
2.59E+04
1.07E+05
2.84E+04
3.10E+05
1.03E+06
7.97E+05
4.91E+06
3.59E+06
3.06E+06
2.62E+05
1 fmol (121)
1.89E+04
7.56E+04
2.82E+05
2.65E+05
1.67E+06
1.06E+06
3.94E+03
5.78E+03
1.12E+04
2.87E+04
3.57E+04
1.38E+07
1.00E+07
4.50E+04
2.86E+05
2.52E+05
1.91E+05
3.25E+05
6.53E+05
1.03E+06
6.15E+05
3.88E+05
3.76E+05
9.60E+05
1.52E+06
7.63E+05
7.52E+04
5.14E+05
3 fmol (119)
2.68E+04
7.18E+04
4.06E+04
7.92E+04
4.89E+04
2.82E+05
1.75E+05
5.17E+04
1.84E+05
4.66E+04
3.41E+05
1.22E+06
9.04E+05
5.84E+06
4.19E+06
3.25E+06
1.64E+05
10 fmol (118)
2.71E+04
5.63E+04
1.47E+05
1.24E+05
8.03E+05
5.14E+05
2.91E+03
5.90E+03
1.18E+04
4.01E+04
3.63E+04
1.44E+07
1.11E+07
1.23E+05
6.61E+05
7.17E+05
3.78E+05
6.47E+05
1.38E+06
2.19E+06
1.20E+06
1.10E+06
9.88E+05
2.62E+06
4.07E+06
1.98E+06
2.12E+05
3.08E+05
30 fmol (117)
8.10E+03
1.76E+04
1.41E+04
1.32E+05
7.94E+04
2.44E+05
1.53E+05
3.44E+04
1.48E+05
3.90E+04
2.90E+05
1.13E+06
7.79E+05
5.18E+06
3.88E+06
3.72E+06
5.50E+05
100 fmol (116)
1.05E+04
8.19E+04
1.31E+05
1.23E+05
7.93E+05
4.78E+05
4.41E+03
3.69E+03
1.22E+04
3.12E+04
3.25E+04
1.68E+07
1.27E+07
4.28E+05
2.11E+06
2.30E+06
1.69E+06
2.75E+06
5.67E+06
9.31E+06
5.45E+06
4.30E+06
4.17E+06
1.03E+07
1.50E+07
7.17E+06
8.22E+05
1.08E+06
300 fmol (115)
2.29E+04
2.33E+04
3.11E+05
1.97E+05
2.93E+04
1.24E+05
2.59E+04
8.48E+04
3.01E+05
2.44E+05
1.41E+06
1.08E+06
3.34E+06
2.20E+06
1 pmol (114)
1.87E+04
1.97E+05
1.83E+05
1.11E+06
7.07E+05
2.59E+03
4.53E+03
6.01E+03
1.13E+04
8.99E+03
1.49E+07
1.13E+07
1.25E+06
6.72E+06
6.95E+06
4.35E+06
7.25E+06
1.50E+07
2.57E+07
1.59E+07
1.13E+07
1.08E+07
2.70E+07
4.13E+07
1.96E+07
2.74E+06
4.59E+06
STD iTRAQ
3 pmol (113)
4.62E+04
2.79E+04
4.53E+04
1.32E+05
3.63E+04
5.45E+04
1.60E+05
1.05E+05
6.25E+05
5.07E+05
8.63E+05
7.40E+06
30 min (121)
3.17E+04
3.36E+04
1.93E+05
1.18E+05
7.92E+03
1.07E+04
1.32E+04
8.36E+03
5.61E+03
4.06E+06
3.09E+06
4.54E+04
2.15E+05
2.06E+05
2.31E+05
3.78E+05
8.54E+05
1.61E+06
8.79E+05
4.83E+04
3.81E+04
1.11E+05
2.03E+05
8.31E+04
1.25E+04
1.43E+07
10 min (119)
9.83E+03
2.29E+04
8.80E+03
3.67E+04
1.05E+05
9.28E+04
4.18E+05
3.16E+05
4.23E+05
2.80E+04
1.88E+05
1.97E+05
4.04E+05
6.69E+05
1.35E+06
2.56E+06
1.35E+06
4.87E+04
4.14E+04
1.10E+05
1.94E+05
9.98E+04
1.04E+04
3.33E+04
5 min (118)
1.54E+04
2.34E+04
2.21E+04
6.74E+03
1.07E+04
1.90E+06
1.40E+06
4.95E+04
2.52E+05
2.29E+05
5.18E+05
9.38E+05
2.10E+06
3.51E+06
1.73E+06
3.46E+04
3.48E+04
8.75E+04
1.52E+05
7.07E+04
2.02E+04
7.31E+04
3 min (117)
5.26E+04
1.54E+05
1.24E+05
5.31E+05
3.88E+05
2.54E+05
3.16E+04
1.91E+05
1.64E+05
6.31E+05
1.23E+06
2.51E+06
4.33E+06
1.92E+06
4.28E+04
4.14E+04
7.63E+04
1.59E+05
7.47E+04
2.88E+04
2.74E+04
2 min (116)
1.42E+04
1.43E+04
1.08E+06
7.84E+05
7.03E+03
2.51E+04
2.49E+04
4.18E+05
8.29E+05
1.82E+06
2.85E+06
1.38E+06
2.86E+04
2.96E+04
7.25E+04
1.24E+05
4.96E+04
1.40E+04
5.47E+04
1 min (115)
3.12E+05
5.87E+03
2.53E+04
1.66E+04
3.20E+05
5.99E+05
1.36E+06
2.20E+06
1.12E+06
2.83E+04
3.33E+04
7.25E+04
1.04E+05
6.08E+04
1.66E+04
4.97E+04
30 sec (114)
1.31E+06
9.78E+05
1.79E+04
1.97E+04
1.83E+04
4.59E+05
9.07E+05
1.94E+06
3.41E+06
1.85E+06
4.39E+04
5.02E+04
1.22E+05
1.84E+05
7.84E+04
2.23E+04
1.11E+05
ENDO iTRAQ
0 min (113)
1.73E+04
5.10E+04
2.09E+04
2.73E+04
5.68E+04
7.12E+04
1.46E+05
9.26E+04
4.40E+04
5.31E+04
1.46E+05
1.84E+05
9.66E+04
1.60E+04
7.63E+04
1.70E+05
5.30E+04
9.26E+04
4.44E+04
7.92E+04
5.04E+04
1.17E+05
4.70E+04
8.90E+04
Run
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
R5
R4
R3
R2
R1
Sequence
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQIsLDNPDyQQDFFPK
GSHQIsLDNPDyQQDFFPK
GSHQIsLDNPDyQQDFFPK
GSHQIsLDNPDyQQDFFPK
GSHQIsLDNPDyQQDFFPK
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLtEyVATR
IADPEHDHTGFLtEyVATR
IADPEHDHTGFLtEyVATR
IADPEHDHTGFLtEyVATR
IADPEHDHTGFLtEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLtEyVATR
VADPDHDHTGFLtEyVATR
VADPDHDHTGFLtEyVATR
VADPDHDHTGFLtEyVATR
VADPDHDHTGFLtEyVATR
ELFDDPSyVNVQNLDK
ELFDDPSyVNVQNLDK
ELFDDPSyVNVQNLDK
ELFDDPSyVNVQNLDK
ELFDDPSyVNVQNLDK
IQNTGDyYDLYGGEK
IQNTGDyYDLYGGEK
IQNTGDyYDLYGGEK
IQNTGDyYDLYGGEK
IQNTGDyYDLYGGEK
LIEDNEyTAR
LIEDNEyTAR
LIEDNEyTAR
LIEDNEyTAR
LIEDNEyTAR
Site
1045
1045
1045
1045
1045
1068
1068
1068
1068
1068
1148
1148
1148
1148
1148
1173
1173
1173
1173
1173
1045, 1068
1045, 1068
1045, 1068
1045, 1068
1045, 1068
1142, 1148
1142, 1148
1142, 1148
1142, 1148
1142, 1148
204
204
204
204
204
202, 204
202, 204
202, 204
202, 204
202, 204
187
187
187
187
187
185, 187
185, 187
185, 187
185, 187
185, 187
317
317
317
317
317
62
62
62
62
62
418
418
418
418
418
Table 3-1: MRM quantification of endogenous tyrosine phosphorylated peptides. Raw iTRAQ intensities for endogenous peptides (columns E-L) and corresponding heavy-labeled standard peptides (columns M-T) across technical replicate analyses of EGF-stimulated MCF10A cells. Calculated amounts of endogenous peptides, based on regression to the standard curve, are contained in columns U-AB.
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
Shc
Shc
Shc
Shp2
Shp2
Shp2
Src
Src
Src
Site
1045
1045
1045
1068
1068
1068
1148
1148
1148
1173
1173
1173
1045, 1068
1045, 1068
1045, 1068
1142, 1148
1142, 1148
1142, 1148
204
204
204
202, 204
202, 204
202, 204
187
187
187
185, 187
185, 187
185, 187
317
317
317
62
62
62
418
418
418
Quantification modes: Full Apex, Full Average, SIM Apex, SIM Average (each reported as ENDO STD Ratio)
Sequence | Run | ENDO STD Ratio (per quantification mode) | Average Ratio | Multiple | Total ENDO | Endo iTRAQ 113
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R1 | 1.73E+06 2.39E+07 7.24E-02 | 7.21E+05 9.87E+06 7.30E-02 | 3.13E+06 3.81E+07 8.22E-02 | 1.18E+06 1.46E+07 8.08E-02 | 7.71E-02 | 1.29E+00 | 4.44E+02 | 5.13E+03
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R2 | 7.22E+05 9.96E+06 7.25E-02 | 4.64E+05 6.37E+06 7.29E-02 | 8.02E+05 1.51E+07 5.30E-02 | 9.32E+05 1.40E+07 6.64E-02 | 6.62E-02 | 1.29E+00 | 3.81E+02 | 4.37E+03
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R3 | 7.59E+06 1.06E+08 7.15E-02 | 5.38E+06 8.16E+07 6.60E-02 | 7.80E+06 9.26E+07 8.42E-02 | 7.42E+06 8.46E+07 8.78E-02 | 7.73E-02 | 1.29E+00 | 4.45E+02 | 2.11E+04
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R1 | 1.04E+06 2.04E+07 5.10E-02 | 8.00E+05 1.81E+07 4.42E-02 | 1.99E+06 3.49E+07 5.70E-02 | 8.52E+05 1.67E+07 5.10E-02 | 5.08E-02 | 1.65E+00 | 3.73E+02 | 4.76E+03
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R2 | 5.35E+05 1.16E+07 4.62E-02 | 2.28E+05 4.91E+06 4.64E-02 | 1.26E+06 2.45E+07 5.16E-02 | 8.24E+05 1.63E+07 5.05E-02 | 4.87E-02 | 1.65E+00 | 3.57E+02 | 2.95E+03
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R3 | 5.09E+06 9.64E+07 5.28E-02 | 4.27E+06 8.68E+07 4.92E-02 | 6.02E+06 9.96E+07 6.05E-02 | 5.59E+06 9.25E+07 6.04E-02 | 5.57E-02 | 1.65E+00 | 4.09E+02 | 1.29E+04
GSHQISLDNPDyQQDFFPK | R1 | 2.18E+06 2.19E+07 9.95E-02 | 1.61E+06 1.62E+07 9.94E-02 | 2.53E+06 2.44E+07 1.04E-01 | 1.51E+06 1.49E+07 1.01E-01 | 1.01E-01 | 4.22E+00 | 1.89E+03 | 8.14E+03
GSHQISLDNPDyQQDFFPK | R2 | 9.55E+05 9.75E+06 9.79E-02 | 6.61E+05 6.54E+06 1.01E-01 | 1.91E+06 1.77E+07 1.08E-01 | 1.14E+06 1.06E+07 1.08E-01 | 1.04E-01 | 4.22E+00 | 1.94E+03 | 9.35E+03
GSHQISLDNPDyQQDFFPK | R3 | 1.25E+07 1.31E+08 9.54E-02 | 6.20E+06 6.93E+07 8.95E-02 | 1.25E+07 1.37E+08 9.12E-02 | 9.14E+06 9.36E+07 9.76E-02 | 9.34E-02 | 4.22E+00 | 1.75E+03 | 2.57E+04
GSTAENAEyLR | R1 | 7.53E+06 1.18E+08 6.38E-02 | 4.94E+06 8.06E+07 6.13E-02 | 5.54E+06 7.05E+07 7.86E-02 | 4.19E+06 5.44E+07 7.70E-02 | 7.02E-02 | 1.29E+00 | 4.01E+02 | 4.41E+04
GSTAENAEyLR | R2 | 4.07E+06 6.30E+07 6.46E-02 | 3.12E+06 5.01E+07 6.23E-02 | 6.34E-02 | 1.29E+00 | 3.62E+02 | 3.65E+04
GSTAENAEyLR | R3 | 3.80E+07 6.17E+08 6.16E-02 | 2.69E+07 4.37E+08 6.16E-02 | 6.16E-02 | 1.29E+00 | 3.52E+02 | 1.60E+05
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R1 | 7.90E+05 1.19E+07 6.64E-02 | 3.25E+05 4.71E+06 6.90E-02 | 1.07E+06 1.32E+07 8.11E-02 | 7.61E+05 1.15E+07 6.62E-02 | 7.07E-02 | 1.17E+00 | 3.66E+02 | 1.20E+04
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R2 | 3.99E+05 5.28E+06 7.56E-02 | 2.39E+05 3.30E+06 7.24E-02 | 9.99E+05 1.49E+07 6.69E-02 | 3.68E+05 5.52E+06 6.67E-02 | 7.04E-02 | 1.17E+00 | 3.65E+02 | 7.83E+03
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R3 | 4.06E+06 5.14E+07 7.91E-02 | 2.78E+06 3.74E+07 7.42E-02 | 5.85E+06 8.05E+07 7.26E-02 | 3.47E+06 4.74E+07 7.33E-02 | 7.48E-02 | 1.17E+00 | 3.88E+02 | 2.21E+04
GSHQIsLDNPDyQQDFFPK | R1 | 7.31E+04 1.63E+07 4.48E-03 | 4.60E+04 1.17E+07 3.93E-03 | 3.10E+04 2.64E+07 1.17E-03 | 3.12E+04 2.16E+07 1.44E-03 | 2.76E-03 | 3.20E+00 | 3.92E+01 | 7.11E+03
GSHQIsLDNPDyQQDFFPK | R2 | 3.74E+04 6.96E+06 5.38E-03 | 2.70E+04 6.66E+06 4.06E-03 | 2.40E+04 1.86E+07 1.29E-03 | 1.24E+04 1.11E+07 1.12E-03 | 2.96E-03 | 3.20E+00 | 4.21E+01 | 3.55E+04
GSHQIsLDNPDyQQDFFPK | R3 | 4.09E+05 1.18E+08 3.47E-03 | 3.03E+05 1.10E+08 2.75E-03 | 4.92E+05 1.83E+08 2.69E-03 | 3.01E+05 1.18E+08 2.55E-03 | 2.87E-03 | 3.20E+00 | 4.08E+01 | 4.33E+04
IADPEHDHTGFLTEyVATR | R1 | 1.12E+07 3.13E+07 3.58E-01 | 9.26E+06 2.59E+07 3.58E-01 | 1.40E+07 3.56E+07 3.93E-01 | 7.89E+06 2.08E+07 3.79E-01 | 3.72E-01 | 1.02E+00 | 1.68E+03 | 1.28E+04
IADPEHDHTGFLTEyVATR | R2 | 1.17E+07 3.04E+07 3.85E-01 | 5.69E+06 1.51E+07 3.78E-01 | 1.49E+07 3.83E+07 3.88E-01 | 9.44E+06 2.31E+07 4.09E-01 | 3.90E-01 | 1.02E+00 | 1.76E+03 | 1.90E+04
IADPEHDHTGFLTEyVATR | R3 | 4.97E+07 1.41E+08 3.52E-01 | 3.43E+07 9.77E+07 3.51E-01 | 3.95E+07 1.01E+08 3.91E-01 | 2.92E+07 7.97E+07 3.66E-01 | 3.65E-01 | 1.02E+00 | 1.65E+03 | 4.86E+04
IADPEHDHTGFLtEyVATR | R1 | 6.07E+05 3.88E+07 1.56E-02 | 6.58E+05 6.96E+07 9.45E-03 | 1.25E-02 | 1.24E+00 | 6.92E+01 | 3.24E+03
IADPEHDHTGFLtEyVATR | R2 | 4.04E+05 3.32E+07 1.21E-02 | 3.66E+05 3.00E+07 1.22E-02 | 1.22E-02 | 1.24E+00 | 6.71E+01 | 4.78E+03
IADPEHDHTGFLtEyVATR | R3 | 2.13E+06 1.53E+08 1.39E-02 | 1.73E+06 1.33E+08 1.30E-02 | 1.35E-02 | 1.24E+00 | 7.42E+01 | 8.45E+03
VADPDHDHTGFLTEyVATR | R1 | 5.69E+07 6.27E+07 9.07E-01 | 5.28E+07 5.56E+07 9.50E-01 | 4.86E+07 4.73E+07 1.03E+00 | 2.53E+07 2.93E+07 8.63E-01 | 9.37E-01 | 1.03E+00 | 4.28E+03 | 4.24E+04
VADPDHDHTGFLTEyVATR | R2 | 7.63E+07 7.47E+07 1.02E+00 | 4.45E+07 4.38E+07 1.02E+00 | 7.96E+07 7.16E+07 1.11E+00 | 3.78E+07 3.77E+07 1.00E+00 | 1.04E+00 | 1.03E+00 | 4.74E+03 | 7.04E+04
VADPDHDHTGFLTEyVATR | R3 | 2.24E+08 2.18E+08 1.03E+00 | 1.17E+08 1.14E+08 1.03E+00 | 1.54E+08 1.45E+08 1.06E+00 | 1.10E+08 1.04E+08 1.06E+00 | 1.04E+00 | 1.03E+00 | 4.77E+03 | 1.21E+05
VADPDHDHTGFLtEyVATR | R1 | 5.66E+06 8.23E+07 6.88E-02 | 2.36E+06 3.59E+07 6.57E-02 | 5.02E+06 6.68E+07 7.51E-02 | 4.27E+06 6.26E+07 6.82E-02 | 6.95E-02 | 9.84E-01 | 3.04E+02 | 8.96E+03
VADPDHDHTGFLtEyVATR | R2 | 2.19E+06 3.43E+07 6.37E-02 | 1.31E+06 2.10E+07 6.23E-02 | 3.54E+06 5.31E+07 6.67E-02 | 2.60E+06 3.68E+07 7.06E-02 | 6.58E-02 | 9.84E-01 | 2.88E+02 | 1.05E+04
VADPDHDHTGFLtEyVATR | R3 | 1.11E+07 1.65E+08 6.73E-02 | 7.54E+06 1.14E+08 6.61E-02 | 1.12E+07 1.67E+08 6.71E-02 | 9.93E+06 1.47E+08 6.76E-02 | 6.70E-02 | 9.84E-01 | 2.93E+02 | 2.10E+04
ELFDDPSyVNVQNLDK | R1 | 1.67E+07 4.00E+07 4.18E-01 | 1.34E+07 3.10E+07 4.32E-01 | 3.48E+07 8.02E+07 4.34E-01 | 1.50E+07 3.54E+07 4.24E-01 | 4.27E-01 | 2.60E+00 | 4.93E+03 | 1.51E+04
ELFDDPSyVNVQNLDK | R2 | 1.02E+07 2.28E+07 4.45E-01 | 7.25E+06 1.72E+07 4.21E-01 | 1.67E+07 4.15E+07 4.01E-01 | 1.38E+07 3.30E+07 4.19E-01 | 4.22E-01 | 2.60E+00 | 4.87E+03 | 1.52E+04
ELFDDPSyVNVQNLDK | R3 | 1.22E+08 2.74E+08 4.45E-01 | 8.57E+07 1.96E+08 4.37E-01 | 1.16E+08 3.60E+08 3.22E-01 | 9.42E+07 2.33E+08 4.04E-01 | 4.02E-01 | 2.60E+00 | 4.64E+03 | 5.97E+04
IQNTGDyYDLYGGEK | R1 | 1.88E+05 1.45E+07 1.30E-02 | 1.34E+05 1.14E+07 1.18E-02 | 2.59E+05 1.85E+07 1.40E-02 | 1.88E+05 1.28E+07 1.47E-02 | 1.34E-02 | 1.78E+00 | 1.05E+02 | 5.39E+03
IQNTGDyYDLYGGEK | R2 | 1.44E+05 1.18E+07 1.22E-02 | 8.04E+04 8.00E+06 1.00E-02 | 1.11E-02 | 1.78E+00 | 8.79E+01 | 6.13E+03
IQNTGDyYDLYGGEK | R3 | 6.28E+06 4.82E+08 1.30E-02 | 3.88E+06 2.91E+08 1.33E-02 | 3.72E+06 2.58E+08 1.44E-02 | 2.84E+06 1.72E+08 1.65E-02 | 1.43E-02 | 1.78E+00 | 1.13E+02 | 9.43E+04
LIEDNEyTAR | R1 | 2.52E+05 8.94E+06 2.82E-02 | 2.08E+05 7.84E+06 2.65E-02 | 2.74E-02 | 1.24E+00 | 1.51E+02 | 4.37E+05
LIEDNEyTAR | R2 | 1.53E+05 4.44E+06 3.44E-02 | 1.25E+05 3.59E+06 3.49E-02 | 1.23E+05 7.28E+06 1.70E-02 | 1.27E+05 7.08E+06 1.79E-02 | 2.60E-02 | 1.24E+00 | 1.44E+02 | 2.89E+05
LIEDNEyTAR | R3 | 2.48E+06 1.03E+08 2.41E-02 | 1.83E+06 7.31E+07 2.50E-02 | 2.46E-02 | 1.24E+00 | 1.36E+02 | 1.12E+06
114
4.59E+04
2.63E+04
1.45E+05
2.92E+04
1.18E+04
8.16E+04
3.93E+04
4.01E+04
1.45E+05
2.05E+04
2.02E+05
9.22E+05
1.98E+04
1.17E+04
7.14E+04
4.86E+03
2.50E+04
2.48E+04
1.30E+04
2.39E+04
4.86E+04
2.05E+03
3.66E+03
4.99E+03
4.69E+04
6.89E+04
1.48E+05
5.62E+03
7.40E+03
1.86E+04
5.81E+05
4.15E+05
1.81E+06
5.11E+03
5.02E+03
9.22E+04
1.70E+05
1.11E+05
5.22E+05
115
3.41E+04
1.74E+04
8.77E+04
2.22E+04
1.21E+04
6.22E+04
2.57E+04
2.79E+04
9.97E+04
1.39E+05
1.47E+05
7.05E+05
1.13E+04
9.36E+03
4.68E+04
2.83E+03
1.56E+04
1.15E+04
1.12E+04
2.10E+04
4.43E+04
1.59E+03
1.73E+03
2.42E+03
4.01E+04
6.08E+04
1.17E+05
4.90E+03
4.16E+03
9.10E+03
4.12E+05
2.74E+05
1.37E+06
4.00E+03
3.80E+03
6.33E+04
3.24E+04
2.38E+04
9.44E+04
116
4.02E+04
1.94E+04
9.61E+04
3.76E+04
1.83E+04
9.73E+04
3.16E+04
3.44E+04
1.25E+05
1.40E+05
1.51E+05
7.19E+05
1.10E+04
9.21E+03
4.74E+04
3.98E+03
1.62E+04
1.16E+04
4.09E+04
6.13E+04
1.38E+05
2.32E+03
2.77E+03
4.66E+03
1.65E+05
2.40E+05
4.25E+05
1.10E+04
9.18E+03
1.82E+04
5.32E+05
3.58E+05
1.76E+06
4.57E+03
3.77E+03
6.77E+04
1.29E+04
8.83E+03
3.65E+04
117
4.80E+04
2.63E+04
1.25E+05
4.90E+04
2.76E+04
1.42E+05
4.93E+04
5.56E+04
1.92E+05
3.06E+05
3.13E+05
1.49E+06
1.38E+04
1.35E+04
6.13E+04
5.62E+03
1.87E+04
2.14E+04
2.81E+05
4.21E+05
8.67E+05
2.66E+04
2.33E+04
4.45E+04
1.24E+06
1.75E+06
2.99E+06
1.30E+05
1.10E+05
2.09E+05
7.18E+05
4.97E+05
2.39E+06
4.60E+03
5.16E+03
8.32E+04
5.93E+03
4.38E+03
1.68E+04
118
4.88E+04
2.48E+04
1.34E+05
2.24E+04
1.21E+04
6.98E+04
3.52E+04
3.80E+04
1.43E+05
2.24E+05
2.35E+05
1.10E+06
1.49E+04
1.16E+04
6.23E+04
3.35E+03
1.53E+04
1.39E+04
2.71E+05
4.13E+05
8.61E+05
2.77E+04
2.47E+04
4.86E+04
1.25E+06
1.81E+06
3.05E+06
1.52E+05
1.29E+05
2.60E+05
4.93E+05
3.39E+05
1.61E+06
3.39E+03
3.42E+03
6.78E+04
1.62E+03
1.42E+03
4.83E+03
119
4.81E+04
2.59E+04
1.19E+05
1.74E+04
8.49E+03
4.62E+04
3.40E+04
3.45E+04
1.26E+05
1.46E+05
1.48E+05
6.91E+05
1.41E+04
9.98E+03
6.00E+04
2.87E+03
1.60E+04
6.95E+03
2.63E+05
3.93E+05
8.08E+05
2.51E+04
2.29E+04
4.42E+04
1.18E+06
1.69E+06
2.96E+06
1.39E+05
1.17E+05
2.15E+05
4.16E+05
2.84E+05
1.38E+06
4.75E+03
4.14E+03
8.09E+04
1.85E+03
6.52E+02
2.53E+03
Table 3-2: PRM quantification of endogenous tyrosine phosphorylated peptides. Raw MS1 intensities were collected for endogenous and heavy-labeled standard peptides in either full scan (columns E-J) or SIM mode (columns K-P), either from a single scan (E-G, K-M) or averaged (H-J, N-P) across the chromatographic elution profile. These estimates were averaged to produce the final ratio (column Q). Amino acid analysis was performed for each standard peptide to provide a correction factor for the concentration (column R). The total amount of standard after correction (column S) was multiplied by the ratio in column Q to produce the total amount of endogenous peptide (column T). iTRAQ intensities for endogenous peptides (columns U-AB) were used to calculate the fractional iTRAQ values (columns AC-AJ), which were then multiplied by the total endogenous peptide amount (column T) to produce the calculated amount of endogenous peptide in each sample (columns AK-AR). Table continued on next page.
121
2.49E+04
1.23E+04
6.33E+04
5.06E+03
2.74E+03
1.58E+04
1.61E+04
1.67E+04
6.27E+04
4.21E+04
4.80E+04
2.14E+05
8.97E+03
5.17E+03
3.11E+04
2.51E+03
1.74E+04
3.18E+03
2.00E+05
3.38E+05
6.64E+05
2.16E+04
2.03E+04
3.94E+04
8.73E+05
1.36E+06
2.30E+06
1.21E+05
1.05E+05
2.12E+05
2.07E+05
1.65E+05
6.59E+05
4.75E+03
4.74E+03
8.61E+04
3.77E+02
4.44E+02
1.65E+03
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK1
ERK1
ERK1
ERK1
ERK2
ERK2
ERK2
ERK2
ERK2
ERK2
Shc
Shc
Shc
Shp2
Shp2
Shp2
Src
Src
Src
Site
1045
1045
1045
1068
1068
1068
1148
1148
1148
1173
1173
1173
1045, 1068
1045, 1068
1045, 1068
1142, 1148
1142, 1148
1142, 1148
204
204
204
202, 204
202, 204
202, 204
187
187
187
185, 187
185, 187
185, 187
317
317
317
62
62
62
418
418
418
iTRAQ Fraction (113/Total to 121/Total) and Amount (fmol) for channel 113
Sequence | Run | 113/Total | 114/Total | 115/Total | 116/Total | 117/Total | 118/Total | 119/Total | 121/Total | Amount (fmol) 113
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R1 | 1.74E-02 | 1.56E-01 | 1.16E-01 | 1.36E-01 | 1.63E-01 | 1.65E-01 | 1.63E-01 | 8.44E-02 | 7.71E+00
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R2 | 2.79E-02 | 1.68E-01 | 1.11E-01 | 1.24E-01 | 1.68E-01 | 1.58E-01 | 1.65E-01 | 7.86E-02 | 1.06E+01
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | R3 | 2.68E-02 | 1.83E-01 | 1.11E-01 | 1.22E-01 | 1.58E-01 | 1.69E-01 | 1.50E-01 | 8.02E-02 | 1.19E+01
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R1 | 2.54E-02 | 1.56E-01 | 1.18E-01 | 2.00E-01 | 2.61E-01 | 1.19E-01 | 9.27E-02 | 2.70E-02 | 9.45E+00
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R2 | 3.08E-02 | 1.23E-01 | 1.26E-01 | 1.90E-01 | 2.87E-01 | 1.26E-01 | 8.84E-02 | 2.85E-02 | 1.10E+01
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | R3 | 2.45E-02 | 1.55E-01 | 1.18E-01 | 1.84E-01 | 2.69E-01 | 1.32E-01 | 8.74E-02 | 3.00E-02 | 1.00E+01
GSHQISLDNPDyQQDFFPK | R1 | 3.40E-02 | 1.64E-01 | 1.07E-01 | 1.32E-01 | 2.06E-01 | 1.47E-01 | 1.42E-01 | 6.73E-02 | 6.44E+01
GSHQISLDNPDyQQDFFPK | R2 | 3.65E-02 | 1.56E-01 | 1.09E-01 | 1.34E-01 | 2.17E-01 | 1.48E-01 | 1.34E-01 | 6.51E-02 | 7.09E+01
GSHQISLDNPDyQQDFFPK | R3 | 2.80E-02 | 1.58E-01 | 1.08E-01 | 1.36E-01 | 2.09E-01 | 1.56E-01 | 1.37E-01 | 6.82E-02 | 4.90E+01
GSTAENAEyLR | R1 | 4.15E-02 | 1.93E-02 | 1.31E-01 | 1.32E-01 | 2.88E-01 | 2.11E-01 | 1.38E-01 | 3.97E-02 | 1.66E+01
GSTAENAEyLR | R2 | 2.85E-02 | 1.58E-01 | 1.15E-01 | 1.18E-01 | 2.44E-01 | 1.83E-01 | 1.16E-01 | 3.75E-02 | 1.03E+01
GSTAENAEyLR | R3 | 2.68E-02 | 1.54E-01 | 1.17E-01 | 1.20E-01 | 2.48E-01 | 1.83E-01 | 1.15E-01 | 3.57E-02 | 9.41E+00
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R1 | 1.13E-01 | 1.87E-01 | 1.07E-01 | 1.04E-01 | 1.30E-01 | 1.41E-01 | 1.33E-01 | 8.47E-02 | 4.15E+01
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R2 | 1.00E-01 | 1.50E-01 | 1.19E-01 | 1.18E-01 | 1.72E-01 | 1.48E-01 | 1.27E-01 | 6.60E-02 | 3.65E+01
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | R3 | 5.50E-02 | 1.77E-01 | 1.16E-01 | 1.18E-01 | 1.52E-01 | 1.55E-01 | 1.49E-01 | 7.72E-02 | 2.13E+01
GSHQIsLDNPDyQQDFFPK | R1 | 2.15E-01 | 1.47E-01 | 8.54E-02 | 1.20E-01 | 1.70E-01 | 1.01E-01 | 8.66E-02 | 7.58E-02 | 8.42E+00
GSHQIsLDNPDyQQDFFPK | R2 | 2.22E-01 | 1.57E-01 | 9.77E-02 | 1.01E-01 | 1.17E-01 | 9.58E-02 | 1.00E-01 | 1.09E-01 | 9.35E+00
GSHQIsLDNPDyQQDFFPK | R3 | 3.17E-01 | 1.82E-01 | 8.44E-02 | 8.46E-02 | 1.57E-01 | 1.02E-01 | 5.08E-02 | 2.32E-02 | 1.29E+01
IADPEHDHTGFLTEyVATR | R1 | 1.17E-02 | 1.19E-02 | 1.02E-02 | 3.74E-02 | 2.57E-01 | 2.48E-01 | 2.41E-01 | 1.83E-01 | 1.97E+01
IADPEHDHTGFLTEyVATR | R2 | 1.12E-02 | 1.41E-02 | 1.24E-02 | 3.63E-02 | 2.49E-01 | 2.45E-01 | 2.32E-01 | 2.00E-01 | 1.97E+01
IADPEHDHTGFLTEyVATR | R3 | 1.40E-02 | 1.40E-02 | 1.27E-02 | 3.96E-02 | 2.49E-01 | 2.47E-01 | 2.32E-01 | 1.91E-01 | 2.30E+01
IADPEHDHTGFLtEyVATR | R1 | 2.94E-02 | 1.86E-02 | 1.44E-02 | 2.11E-02 | 2.41E-01 | 2.51E-01 | 2.28E-01 | 1.96E-01 | 2.03E+00
IADPEHDHTGFLtEyVATR | R2 | 4.60E-02 | 3.51E-02 | 1.67E-02 | 2.66E-02 | 2.24E-01 | 2.37E-01 | 2.20E-01 | 1.95E-01 | 3.08E+00
IADPEHDHTGFLtEyVATR | R3 | 4.29E-02 | 2.53E-02 | 1.23E-02 | 2.36E-02 | 2.26E-01 | 2.46E-01 | 2.24E-01 | 2.00E-01 | 3.18E+00
VADPDHDHTGFLTEyVATR | R1 | 8.77E-03 | 9.70E-03 | 8.29E-03 | 3.41E-02 | 2.56E-01 | 2.58E-01 | 2.44E-01 | 1.80E-01 | 3.75E+01
VADPDHDHTGFLTEyVATR | R2 | 9.98E-03 | 9.76E-03 | 8.62E-03 | 3.41E-02 | 2.48E-01 | 2.57E-01 | 2.40E-01 | 1.93E-01 | 4.74E+01
VADPDHDHTGFLTEyVATR | R3 | 9.98E-03 | 1.23E-02 | 9.64E-03 | 3.51E-02 | 2.47E-01 | 2.52E-01 | 2.45E-01 | 1.90E-01 | 4.76E+01
VADPDHDHTGFLtEyVATR | R1 | 1.57E-02 | 9.82E-03 | 8.56E-03 | 1.92E-02 | 2.27E-01 | 2.66E-01 | 2.43E-01 | 2.11E-01 | 4.75E+00
VADPDHDHTGFLtEyVATR | R2 | 2.14E-02 | 1.50E-02 | 8.45E-03 | 1.87E-02 | 2.23E-01 | 2.62E-01 | 2.38E-01 | 2.14E-01 | 6.16E+00
VADPDHDHTGFLtEyVATR | R3 | 2.18E-02 | 1.93E-02 | 9.45E-03 | 1.89E-02 | 2.17E-01 | 2.70E-01 | 2.23E-01 | 2.20E-01 | 6.38E+00
ELFDDPSyVNVQNLDK | R1 | 4.48E-03 | 1.72E-01 | 1.22E-01 | 1.58E-01 | 2.13E-01 | 1.46E-01 | 1.23E-01 | 6.13E-02 | 2.21E+01
ELFDDPSyVNVQNLDK | R2 | 6.46E-03 | 1.77E-01 | 1.17E-01 | 1.52E-01 | 2.12E-01 | 1.44E-01 | 1.21E-01 | 7.01E-02 | 3.15E+01
ELFDDPSyVNVQNLDK | R3 | 5.41E-03 | 1.64E-01 | 1.24E-01 | 1.59E-01 | 2.17E-01 | 1.46E-01 | 1.25E-01 | 5.97E-02 | 2.51E+01
IQNTGDyYDLYGGEK | R1 | 1.47E-01 | 1.40E-01 | 1.09E-01 | 1.25E-01 | 1.26E-01 | 9.27E-02 | 1.30E-01 | 1.30E-01 | 1.55E+01
IQNTGDyYDLYGGEK | R2 | 1.69E-01 | 1.39E-01 | 1.05E-01 | 1.04E-01 | 1.43E-01 | 9.46E-02 | 1.14E-01 | 1.31E-01 | 1.49E+01
IQNTGDyYDLYGGEK | R3 | 1.48E-01 | 1.45E-01 | 9.96E-02 | 1.07E-01 | 1.31E-01 | 1.07E-01 | 1.27E-01 | 1.35E-01 | 1.68E+01
LIEDNEyTAR | R1 | 6.60E-01 | 2.57E-01 | 4.89E-02 | 1.95E-02 | 8.96E-03 | 2.45E-03 | 2.79E-03 | 5.69E-04 | 9.97E+01
LIEDNEyTAR | R2 | 6.57E-01 | 2.53E-01 | 5.40E-02 | 2.01E-02 | 9.95E-03 | 3.21E-03 | 1.48E-03 | 1.01E-03 | 9.44E+01
LIEDNEyTAR | R3 | 6.23E-01 | 2.90E-01 | 5.24E-02 | 2.03E-02 | 9.31E-03 | 2.68E-03 | 1.40E-03 | 9.13E-04 | 8.45E+01
114
6.90E+01
6.39E+01
8.14E+01
5.80E+01
4.38E+01
6.32E+01
3.11E+02
3.04E+02
2.76E+02
7.74E+00
5.73E+01
5.40E+01
6.85E+01
5.47E+01
6.88E+01
5.76E+00
6.59E+00
7.40E+00
2.00E+01
2.48E+01
2.31E+01
1.29E+00
2.36E+00
1.88E+00
4.15E+01
4.63E+01
5.85E+01
2.98E+00
4.32E+00
5.66E+00
8.49E+02
8.61E+02
7.62E+02
1.47E+01
1.22E+01
1.64E+01
3.88E+01
3.64E+01
3.93E+01
115
5.13E+01
4.22E+01
4.94E+01
4.41E+01
4.51E+01
4.81E+01
2.03E+02
2.12E+02
1.90E+02
5.25E+01
4.16E+01
4.13E+01
3.91E+01
4.36E+01
4.51E+01
3.35E+00
4.12E+00
3.44E+00
1.72E+01
2.19E+01
2.10E+01
9.98E-01
1.12E+00
9.11E-01
3.55E+01
4.09E+01
4.60E+01
2.60E+00
2.43E+00
2.77E+00
6.02E+02
5.69E+02
5.76E+02
1.15E+01
9.24E+00
1.13E+01
7.39E+00
7.76E+00
7.10E+00
116
6.04E+01
4.71E+01
5.42E+01
7.47E+01
6.80E+01
7.53E+01
2.50E+02
2.61E+02
2.38E+02
5.28E+01
4.26E+01
4.21E+01
3.80E+01
4.29E+01
4.56E+01
4.72E+00
4.27E+00
3.45E+00
6.28E+01
6.38E+01
6.52E+01
1.46E+00
1.79E+00
1.75E+00
1.46E+02
1.62E+02
1.67E+02
5.83E+00
5.37E+00
5.53E+00
7.77E+02
7.42E+02
7.40E+02
1.32E+01
9.17E+00
1.20E+01
2.94E+00
2.88E+00
2.75E+00
117
7.21E+01
6.39E+01
7.03E+01
9.73E+01
1.03E+02
1.10E+02
3.90E+02
4.21E+02
3.66E+02
1.16E+02
8.85E+01
8.74E+01
4.77E+01
6.29E+01
5.90E+01
6.66E+00
4.94E+00
6.39E+00
4.32E+02
4.38E+02
4.11E+02
1.67E+01
1.50E+01
1.67E+01
1.10E+03
1.17E+03
1.18E+03
6.89E+01
6.42E+01
6.35E+01
1.05E+03
1.03E+03
1.01E+03
1.33E+01
1.25E+01
1.48E+01
1.35E+00
1.43E+00
1.26E+00
118
7.33E+01
6.02E+01
7.53E+01
4.45E+01
4.50E+01
5.40E+01
2.78E+02
2.88E+02
2.73E+02
8.46E+01
6.64E+01
6.43E+01
5.15E+01
5.38E+01
6.00E+01
3.97E+00
4.04E+00
4.14E+00
4.16E+02
4.30E+02
4.08E+02
1.74E+01
1.59E+01
1.83E+01
1.11E+03
1.22E+03
1.20E+03
8.06E+01
7.53E+01
7.91E+01
7.20E+02
7.03E+02
6.77E+02
9.77E+00
8.32E+00
1.21E+01
3.70E-01
4.62E-01
3.63E-01
119
7.23E+01
6.30E+01
6.68E+01
3.45E+01
3.16E+01
3.57E+01
2.69E+02
2.61E+02
2.40E+02
5.51E+01
4.19E+01
4.05E+01
4.88E+01
4.65E+01
5.78E+01
3.40E+00
4.22E+00
2.07E+00
4.04E+02
4.09E+02
3.83E+02
1.58E+01
1.48E+01
1.66E+01
1.04E+03
1.14E+03
1.17E+03
7.37E+01
6.84E+01
6.54E+01
6.08E+02
5.90E+02
5.81E+02
1.37E+01
1.01E+01
1.44E+01
4.22E-01
2.13E-01
1.90E-01
121
3.74E+01
3.00E+01
3.57E+01
1.00E+01
1.02E+01
1.23E+01
1.27E+02
1.27E+02
1.20E+02
1.59E+01
1.36E+01
1.25E+01
3.10E+01
2.41E+01
2.99E+01
2.97E+00
4.58E+00
9.47E-01
3.07E+02
3.52E+02
3.15E+02
1.36E+01
1.31E+01
1.48E+01
7.73E+02
9.18E+02
9.05E+02
6.42E+01
6.16E+01
6.44E+01
3.02E+02
3.41E+02
2.77E+02
1.37E+01
1.15E+01
1.53E+01
8.60E-02
1.45E-01
1.24E-01
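The calculation described in the Table 3-2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used for the thesis: the function name `endogenous_amounts`, its parameter names, and the nominal standard amount of 1000 fmol are hypothetical and chosen only for the example. The four ratios, the amino acid analysis multiple, and the iTRAQ intensities are the tabulated values for replicate R1 of the EGFR pY1045 peptide.

```python
from statistics import mean

def endogenous_amounts(ratios, std_amount_fmol, aaa_multiple, itraq):
    """Sketch of the Table 3-2 calculation (names are hypothetical).

    ratios          -- ENDO/STD ratios from the available quantification modes
    std_amount_fmol -- nominal amount of heavy standard (assumed value)
    aaa_multiple    -- amino acid analysis correction factor (column R)
    itraq           -- dict of iTRAQ channel -> endogenous reporter intensity
    """
    avg_ratio = mean(ratios)                          # column Q
    corrected_std = std_amount_fmol * aaa_multiple    # column S
    total_endo = avg_ratio * corrected_std            # column T
    total_intensity = sum(itraq.values())
    # Fraction of the total reporter signal in each channel (columns AC-AJ)
    fractions = {ch: v / total_intensity for ch, v in itraq.items()}
    # Amount of endogenous peptide in each sample (columns AK-AR)
    amounts = {ch: f * total_endo for ch, f in fractions.items()}
    return avg_ratio, total_endo, fractions, amounts

# Tabulated values for ySSDPTGALTEDSIDDTFLPVPEYINQSVPK, run R1
ratios_r1 = [7.24e-2, 7.30e-2, 8.22e-2, 8.08e-2]
itraq_r1 = {113: 5.13e3, 114: 4.59e4, 115: 3.41e4, 116: 4.02e4,
            117: 4.80e4, 118: 4.88e4, 119: 4.81e4, 121: 2.49e4}

# 1000.0 fmol is an assumed nominal standard amount, used for illustration only
avg_ratio, total_endo, fractions, amounts = endogenous_amounts(
    ratios_r1, 1000.0, 1.29, itraq_r1)
print(f"average ratio = {avg_ratio:.4f}")
print(f"113/Total     = {fractions[113]:.4f}")
```

With these inputs the average ratio and the 113/Total fraction agree with the tabulated 7.71E-02 and 1.74E-02 to within rounding. The absolute amounts depend on the assumed standard amount and will not match the table.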
1045
1068
1148
1173
1045/1068
1142/1148
204
202/204
187
185/187
317
62
418
Site
1045
1068
1148
1173
1045, 1068
1142, 1148
204
202, 204
187
185, 187
317
62
418
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK2
ERK2
Shc
Shp2
Src
Site
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK2
ERK2
Shc
Shp2
Src
PRM
MRM
Amount (fmol), then StdDev (fmol), each in the order 0 sec (113), 30 sec (114), 1 min (115), 2 min (116), 3 min (117), 5 min (118), 10 min (119), 30 min (121)
Sequence | Amount (fmol) 113 | 114 | 115 | 116 | 117 | 118 | 119 | 121 | StdDev (fmol) 113 | 114 | 115 | 116 | 117 | 118 | 119 | 121
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | 1.01E+01 | 7.14E+01 | 4.76E+01 | 5.39E+01 | 6.88E+01 | 6.96E+01 | 6.74E+01 | 3.43E+01 | 2.15E+00 | 9.03E+00 | 4.80E+00 | 6.67E+00 | 4.31E+00 | 8.21E+00 | 4.67E+00 | 3.91E+00
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | 1.02E+01 | 5.50E+01 | 4.57E+01 | 7.26E+01 | 1.03E+02 | 4.78E+01 | 3.39E+01 | 1.08E+01 | 7.79E-01 | 1.00E+01 | 2.10E+00 | 4.06E+00 | 6.40E+00 | 5.37E+00 | 2.15E+00 | 1.25E+00
GSHQISLDNPDyQQDFFPK | 6.14E+01 | 2.97E+02 | 2.02E+02 | 2.50E+02 | 3.92E+02 | 2.80E+02 | 2.57E+02 | 1.24E+02 | 1.12E+01 | 1.82E+01 | 1.09E+01 | 1.12E+01 | 2.78E+01 | 7.59E+00 | 1.49E+01 | 4.32E+00
GSTAENAEyLR | 1.21E+01 | 3.97E+01 | 4.51E+01 | 4.59E+01 | 9.71E+01 | 7.18E+01 | 4.59E+01 | 1.40E+01 | 3.94E+00 | 2.77E+01 | 6.35E+00 | 6.04E+00 | 1.59E+01 | 1.11E+01 | 8.05E+00 | 1.71E+00
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | 3.31E+01 | 6.40E+01 | 4.26E+01 | 4.22E+01 | 5.65E+01 | 5.51E+01 | 5.10E+01 | 2.83E+01 | 1.05E+01 | 8.06E+00 | 3.13E+00 | 3.84E+00 | 7.88E+00 | 4.38E+00 | 6.00E+00 | 3.74E+00
GSHQIsLDNPDyQQDFFPK | 1.02E+01 | 6.58E+00 | 3.64E+00 | 4.14E+00 | 6.00E+00 | 4.05E+00 | 3.23E+00 | 2.83E+00 | 2.37E+00 | 8.22E-01 | 4.18E-01 | 6.42E-01 | 9.24E-01 | 8.64E-02 | 1.09E+00 | 1.82E+00
IADPEHDHTGFLTEyVATR | 2.08E+01 | 2.26E+01 | 2.00E+01 | 6.40E+01 | 4.27E+02 | 4.18E+02 | 3.99E+02 | 3.25E+02 | 1.93E+00 | 2.46E+00 | 2.49E+00 | 1.20E+00 | 1.43E+01 | 1.12E+01 | 1.39E+01 | 2.39E+01
IADPEHDHTGFLtEyVATR | 2.77E+00 | 1.84E+00 | 1.01E+00 | 1.67E+00 | 1.61E+01 | 1.72E+01 | 1.57E+01 | 1.38E+01 | 6.36E-01 | 5.37E-01 | 1.04E-01 | 1.82E-01 | 9.90E-01 | 1.19E+00 | 9.25E-01 | 9.16E-01
VADPDHDHTGFLTEyVATR | 4.42E+01 | 4.88E+01 | 4.08E+01 | 1.58E+02 | 1.15E+03 | 1.18E+03 | 1.12E+03 | 8.65E+02 | 5.74E+00 | 8.72E+00 | 5.24E+00 | 1.10E+01 | 4.48E+01 | 5.99E+01 | 6.39E+01 | 8.03E+01
VADPDHDHTGFLtEyVATR | 5.77E+00 | 4.32E+00 | 2.60E+00 | 5.58E+00 | 6.55E+01 | 7.83E+01 | 6.92E+01 | 6.34E+01 | 8.85E-01 | 1.34E+00 | 1.67E-01 | 2.36E-01 | 2.96E+00 | 2.75E+00 | 4.20E+00 | 1.55E+00
ELFDDPSyVNVQNLDK | 2.62E+01 | 8.24E+02 | 5.82E+02 | 7.53E+02 | 1.03E+03 | 7.00E+02 | 5.93E+02 | 3.07E+02 | 4.80E+00 | 5.42E+01 | 1.75E+01 | 2.11E+01 | 2.07E+01 | 2.19E+01 | 1.37E+01 | 3.22E+01
IQNTGDyYDLYGGEK | 1.57E+01 | 1.44E+01 | 1.07E+01 | 1.15E+01 | 1.35E+01 | 1.01E+01 | 1.27E+01 | 1.35E+01 | 9.61E-01 | 2.11E+00 | 1.25E+00 | 2.07E+00 | 1.16E+00 | 1.89E+00 | 2.32E+00 | 1.91E+00
LIEDNEyTAR | 9.29E+01 | 3.81E+01 | 7.42E+00 | 2.86E+00 | 1.35E+00 | 3.98E-01 | 2.75E-01 | 1.18E-01 | 7.70E+00 | 1.56E+00 | 3.31E-01 | 1.01E-01 | 8.36E-02 | 5.51E-02 | 1.28E-01 | 2.98E-02
Amount (fmol), then StdDev (fmol), each in the order 0 sec (113), 30 sec (114), 1 min (115), 2 min (116), 3 min (117), 5 min (118), 10 min (119), 30 min (121)
Sequence | Amount (fmol) 113 | 114 | 115 | 116 | 117 | 118 | 119 | 121 | StdDev (fmol) 113 | 114 | 115 | 116 | 117 | 118 | 119 | 121
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK | 1.55E+01 | 5.37E+01 | 6.48E+01 | 1.34E+02 | 9.53E+01 | 6.20E+01 | 5.46E+01 | 1.52E+01 | 2.85E+00 | 6.73E+00 | 2.99E+00 | 3.06E+01 | 1.75E+00 | 4.03E+00 | 3.95E+00 | 1.24E+00
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK | 3.78E+01 | 1.35E+02 | 1.17E+02 | 1.59E+02 | 1.10E+02 | 1.57E+02 | 1.33E+02 | 6.37E+01 | 1.57E+01 | 3.16E+00 | 1.23E+01 | 1.05E+01 | 1.82E+01 | 4.45E+00 | 2.92E+01 | 8.39E+00
GSHQISLDNPDyQQDFFPK | 5.05E+01 | 3.03E+02 | 3.10E+02 | 4.57E+02 | 4.15E+02 | 4.23E+02 | 3.69E+02 | 1.55E+02 | 1.19E+01 | 2.49E+01 | 4.36E+01 | 4.19E+01 | 1.90E+01 | 1.96E+01 | 1.99E+01 | 1.59E+01
GSTAENAEyLR | 1.33E+01 | 7.92E+01 | 7.25E+01 | 9.85E+01 | 1.03E+02 | 1.06E+02 | 6.74E+01 | 1.86E+01 | 4.39E-01 | 3.00E+00 | 5.34E+00 | 6.11E+00 | 5.01E+00 | 6.33E+00 | 4.06E+00 | 7.66E-01
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK | 3.12E+01 | 1.52E+02 | 1.55E+02 | 2.57E+02 | 1.68E+02 | 1.28E+02 | 8.45E+01 | 2.14E+01 | 6.42E+00 | 6.58E+00 | 3.41E+01 | 3.65E+01 | 1.82E+01 | 2.72E+01 | 1.69E+01 | 4.15E+00
GSHQIsLDNPDyQQDFFPK | 7.56E+01 | 3.83E+01 | 1.98E+01 | 3.68E+01 | 1.83E+01 | 2.65E+01 | 1.14E+01 | 4.35E+00 | 3.61E+01 | 1.49E+01 | 5.78E+00 | 9.42E+00 | 2.73E+00 | 4.82E+00 | 4.65E+00 | 4.67E-01
IADPEHDHTGFLTEyVATR | 5.73E+01 | 4.45E+01 | 8.39E+01 | 2.20E+02 | 3.85E+02 | 5.98E+02 | 5.28E+02 | 4.41E+02 | 9.11E+00 | 6.31E+00 | 9.52E+00 | 1.84E+01 | 4.95E+01 | 4.41E+01 | 2.13E+01 | 6.25E+01
IADPEHDHTGFLtEyVATR | 8.30E+00 | 5.23E+00 | 5.87E+00 | 1.04E+01 | 1.68E+01 | 2.74E+01 | 2.26E+01 | 1.65E+01 | 7.85E-01 | 1.10E+00 | 2.24E+00 | 3.38E+00 | 1.71E+00 | 5.37E+00 | 3.82E-01 | 1.69E+00
VADPDHDHTGFLTEyVATR | 1.03E+02 | 8.64E+01 | 2.04E+02 | 5.33E+02 | 1.05E+03 | 1.60E+03 | 1.38E+03 | 1.02E+03 | 3.70E+00 | 3.92E+00 | 9.24E+00 | 4.15E+01 | 3.78E+01 | 4.85E+01 | 4.20E+01 | 5.54E+01
VADPDHDHTGFLtEyVATR | 1.58E+01 | 8.65E+00 | 1.30E+01 | 1.87E+01 | 7.04E+01 | 1.28E+02 | 1.02E+02 | 9.02E+01 | 9.67E+00 | 8.16E-01 | 4.24E+00 | 1.91E+00 | 1.15E+01 | 1.55E+01 | 3.14E+00 | 6.32E+00
ELFDDPSyVNVQNLDK | 4.64E+01 | 9.71E+02 | 8.45E+02 | 1.46E+03 | 1.05E+03 | 1.20E+03 | 8.88E+02 | 4.26E+02 | 8.62E+00 | 6.40E+01 | 7.26E+01 | 2.56E+02 | 1.64E+02 | 1.11E+02 | 7.63E+01 | 4.11E+01
IQNTGDyYDLYGGEK | 2.46E+01 | 2.35E+01 | 2.00E+01 | 2.54E+01 | 1.68E+01 | 2.17E+01 | 2.89E+01 | 2.19E+01 | 3.12E+00 | 1.15E+00 | 2.31E+00 | 2.59E+00 | 2.24E+00 | 1.70E+00 | 2.81E+00 | 2.93E+00
LIEDNEyTAR | 2.29E+01 | 3.07E+01 | 2.99E+01 | 4.29E+01 | 3.57E+01 | 3.37E+01 | 1.80E+01 | 1.79E+01 | 1.17E+00 | 4.74E+00 | 1.43E+00 | 8.19E+00 | 3.25E+00 | 4.18E+00 | 1.32E+00 | 6.69E-01
Table 3-3: Comparison of MRM and PRM quantification for 13 tyrosine phosphorylated peptides. Absolute quantification values from Appendix Table 3-1 and Appendix Table 3-2 are combined here for easy comparison of the two mass spectrometric platforms. Columns D-K contain the mean amount of the peptide as measured by MRM or PRM; columns L-S contain the standard deviations calculated from the technical replicate data in the respective tables. Finally, columns T-AA contain the corresponding coefficient of variation for each peptide, calculated as the standard deviation divided by the mean. Table continued on next page.
CV
Site | 0 sec (113) | 30 sec (114) | 1 min (115) | 2 min (116) | 3 min (117) | 5 min (118) | 10 min (119) | 30 min (121)
1045 | 2.14E-01 | 1.26E-01 | 1.01E-01 | 1.24E-01 | 6.27E-02 | 1.18E-01 | 6.94E-02 | 1.14E-01
1068 | 7.67E-02 | 1.82E-01 | 4.60E-02 | 5.59E-02 | 6.20E-02 | 1.12E-01 | 6.34E-02 | 1.15E-01
1148 | 1.83E-01 | 6.13E-02 | 5.40E-02 | 4.49E-02 | 7.07E-02 | 2.71E-02 | 5.80E-02 | 3.47E-02
1173 | 3.25E-01 | 6.98E-01 | 1.41E-01 | 1.32E-01 | 1.64E-01 | 1.55E-01 | 1.76E-01 | 1.22E-01
1045/1068 | 3.17E-01 | 1.26E-01 | 7.34E-02 | 9.10E-02 | 1.39E-01 | 7.95E-02 | 1.18E-01 | 1.32E-01
1142/1148 | 2.32E-01 | 1.25E-01 | 1.15E-01 | 1.55E-01 | 1.54E-01 | 2.13E-02 | 3.36E-01 | 6.42E-01
204 | 9.25E-02 | 1.09E-01 | 1.24E-01 | 1.88E-02 | 3.34E-02 | 2.68E-02 | 3.49E-02 | 7.35E-02
202/204 | 2.30E-01 | 2.92E-01 | 1.03E-01 | 1.09E-01 | 6.13E-02 | 6.95E-02 | 5.89E-02 | 6.63E-02
187 | 1.30E-01 | 1.79E-01 | 1.29E-01 | 6.97E-02 | 3.89E-02 | 5.10E-02 | 5.72E-02 | 9.28E-02
185/187 | 1.54E-01 | 3.10E-01 | 6.43E-02 | 4.23E-02 | 4.52E-02 | 3.51E-02 | 6.07E-02 | 2.45E-02
317 | 1.83E-01 | 6.58E-02 | 3.01E-02 | 2.80E-02 | 2.01E-02 | 3.13E-02 | 2.31E-02 | 1.05E-01
62 | 6.10E-02 | 1.46E-01 | 1.17E-01 | 1.80E-01 | 8.54E-02 | 1.88E-01 | 1.83E-01 | 1.41E-01
418 | 8.29E-02 | 4.08E-02 | 4.47E-02 | 3.53E-02 | 6.20E-02 | 1.39E-01 | 4.65E-01 | 2.52E-01
CV
Site | 0 sec (113) | 30 sec (114) | 1 min (115) | 2 min (116) | 3 min (117) | 5 min (118) | 10 min (119) | 30 min (121)
1045 | 1.84E-01 | 1.25E-01 | 4.61E-02 | 2.28E-01 | 1.84E-02 | 6.50E-02 | 7.24E-02 | 8.14E-02
1068 | 4.17E-01 | 2.35E-02 | 1.05E-01 | 6.59E-02 | 1.65E-01 | 2.83E-02 | 2.20E-01 | 1.32E-01
1148 | 2.35E-01 | 8.22E-02 | 1.41E-01 | 9.16E-02 | 4.58E-02 | 4.64E-02 | 5.37E-02 | 1.03E-01
1173 | 3.30E-02 | 3.79E-02 | 7.37E-02 | 6.20E-02 | 4.87E-02 | 5.96E-02 | 6.02E-02 | 4.13E-02
1045/1068 | 2.06E-01 | 4.34E-02 | 2.19E-01 | 1.42E-01 | 1.08E-01 | 2.13E-01 | 2.01E-01 | 1.94E-01
1142/1148 | 4.78E-01 | 3.88E-01 | 2.92E-01 | 2.56E-01 | 1.50E-01 | 1.82E-01 | 4.08E-01 | 1.07E-01
204 | 1.59E-01 | 1.42E-01 | 1.13E-01 | 8.34E-02 | 1.29E-01 | 7.39E-02 | 4.03E-02 | 1.42E-01
202/204 | 9.46E-02 | 2.09E-01 | 3.81E-01 | 3.27E-01 | 1.02E-01 | 1.96E-01 | 1.69E-02 | 1.02E-01
187 | 3.58E-02 | 4.53E-02 | 4.53E-02 | 7.80E-02 | 3.60E-02 | 3.04E-02 | 3.04E-02 | 5.41E-02
185/187 | 6.13E-01 | 9.44E-02 | 3.26E-01 | 1.02E-01 | 1.63E-01 | 1.21E-01 | 3.09E-02 | 7.01E-02
317 | 1.86E-01 | 6.59E-02 | 8.59E-02 | 1.75E-01 | 1.56E-01 | 9.24E-02 | 8.60E-02 | 9.64E-02
62 | 1.27E-01 | 4.88E-02 | 1.16E-01 | 1.02E-01 | 1.33E-01 | 7.84E-02 | 9.73E-02 | 1.34E-01
418 | 5.12E-02 | 1.54E-01 | 4.77E-02 | 1.91E-01 | 9.10E-02 | 1.24E-01 | 7.34E-02 | 3.74E-02
MRM
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK2
ERK2
Shc
Shp2
Src
PRM
Site | Sequence
1045 | ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
1068 | YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
1148 | GSHQISLDNPDyQQDFFPK
1173 | GSTAENAEyLR
1045, 1068 | ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
1142, 1148 | GSHQIsLDNPDyQQDFFPK
204 | IADPEHDHTGFLTEyVATR
202, 204 | IADPEHDHTGFLtEyVATR
187 | VADPDHDHTGFLTEyVATR
185, 187 | VADPDHDHTGFLtEyVATR
317 | ELFDDPSyVNVQNLDK
62 | IQNTGDyYDLYGGEK
418 | LIEDNEyTAR
Sequence
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQISLDNPDyQQDFFPK
GSTAENAEyLR
ySSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQIsLDNPDyQQDFFPK
IADPEHDHTGFLTEyVATR
IADPEHDHTGFLtEyVATR
VADPDHDHTGFLTEyVATR
VADPDHDHTGFLtEyVATR
ELFDDPSyVNVQNLDK
IQNTGDyYDLYGGEK
LIEDNEyTAR
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
ERK1
ERK1
ERK2
ERK2
Shc
Shp2
Src
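The coefficient of variation defined in the Table 3-3 caption (standard deviation divided by the mean) can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is hypothetical. Using the sample (n-1) standard deviation is an assumption, chosen because it approximately reproduces the tabulated standard deviation for this peptide. The three input values are the rounded PRM amounts for EGFR pY1045 in channel 113 from Table 3-2.

```python
from statistics import mean, stdev

def coefficient_of_variation(replicates):
    """Return the sample standard deviation divided by the mean (hypothetical helper)."""
    return stdev(replicates) / mean(replicates)

# Rounded technical replicate amounts (fmol) for runs R1-R3, channel 113, from Table 3-2
amounts_113 = [7.71, 10.6, 11.9]

avg = mean(amounts_113)
sd = stdev(amounts_113)   # sample (n-1) standard deviation; an assumption
cv = coefficient_of_variation(amounts_113)
print(f"mean = {avg:.2f} fmol, sd = {sd:.2f} fmol, CV = {cv:.3f}")
```

Because the inputs are rounded to three significant figures, the result agrees with the tabulated mean (1.01E+01), standard deviation (2.15E+00), and CV (2.14E-01) only to within rounding.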
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
Site
1045
1045
1045
1045
1045
1045
1045
1045
1045
1068
1068
1068
1068
1068
1068
1068
1068
1068
1148
1148
1148
1148
1148
1148
1148
1148
1148
1173
1173
1173
1173
1173
1173
1173
1173
1173
Sequence
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
Run
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1Q4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1Q4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1Q4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1Q4
1.65
1.65
1.65
1.65
4.2185
4.2185
4.2185
1.65
1650
1.65
1650
4218.5
4.2185
4218.5
1.2946
1294.6
1.2946
1294.6
1650
1.65
1650
1.65
1285
1.285
1285
1.285
22464330.4
405339.84
31953061.4
60758.5568
1152088.04
60782802.9
915936.343
3105999.74
31988.7518
2892876.27
7252.3719
283240.689
26490254.3
330465.468
1.2946
1.2946
1.2946
1.2946
1.65
1.65
1.65
1.65
1.285
1.285
1.285
1586230.04
540101.454
2355083.42
75803.2042
306171.327
3781062.7
224775.961
240101.17
70002.2192
257964.225
25370.9232
592955.125
1782047.6
616660.437
2078197.98
16623.823
2109259.21
8780.8869
36013.4135
2461297.79
24373.5242
1.285
361623.11
3201285.31
599237.386
483084.345
2028782.52
711848.114
1545129.73
75063.4294
368897.128
84694.6675
115117.267
3798736.53
504810.12
3718268.33
181517.614
56211.4739
193887.003
19768.6516
75616.4282
191106.393
52151.4385
24935.485 73783.9356 404404.545
71140.2585
31109240.4
144575.651
4169049.66
27822309.9
278003.421
22833581.5
17801.0882
2813968.15
28893.7979
895075.422
34552290.8
117485.42
33839545.9
57705.3387
254331.422
76422.9384
80957.9877
422733.041
66758.1158
334295.231
6589826.3
387698.472
936008.924
859785.58
5411.0798
463156.302
1088050.9
256045.716
88255.557 137703.777 70433.5006 124446.646
386200.326
1438820.99
1544939.78
8439.8324
1120075.07
2012536.23
303642.879
85996.5019
164473.725
111574.901
3349.7328
778650.031
592986.114
518026.033
13585.1209
1943578.19
16171.6744
748871.186
3247435.33
21750.9756
3159991.04
1611514.11 1542548.18 1850207.45 934742.224 42268843.3 2823966.17 430266.16 1799344.88
2013990.43 2113633.95 2602489.5 1310842.89 218722.477 655669.423 4027663.94 60320821.1
30135.7616 9210.3221 11046.2875 14032.5401 5213290.05 1009423.64 21848.139 220876.265
400774.058
1198508.86
1229227.1
15658.4164
971612.58
2023703.02
579138.706
330102.83
1228138.53
1270574.07
76005.778
1125089.38
2327172.92
316532.167
82849.3843
213448.013
197912.572
1187.6487
1353992.99
920842.677
591211.303
20942.1133
13890.3321
22478.9089
153.1698
127742.722
85541.2816
88019.0604
4.2185
4218.5
4.2185
4218.5
1285
1.285
1285
82101.9474
157396.218
135354.13
1032.4934
1377923.03
891269.564
988889.838
57761.4405
135905.874
128739.531
5152.1284
1506375.58
1008633.94
649898.567
20241.9845
41931.7371
40834.3373
-61.9151
187296.029
137645.685
86457.6259
4.2185
4.2185
4.2185
4.2185
1.285
1.285
1.285
28759.5877
61808.5016
55470.8385
1376.4434
170006.982
111717.691
102929.821
19008.5293
72021.1728
59656.0461
2398.6719
165740.555
126192.911
76911.7599
128.5
12.85
128.5
12.85
42.185
421.85
42.185
421.85
128.5
12.85
128.5
16.5
165
16.5
165
421.85
42.185
421.85
12.946
129.46
12.946
129.46
165
16.5
165
16.5
Endogenous iTRAQ Integrated Intensity
Standard iTRAQ Integrated Intensity
Standard Amount
114
115
116
117
114
115
116
117 AAA Multiple
114
115
66580.6501 61353.6538 37742.9585 19082.5742 1629552.24 184678.584 48031.3687 9210.4672
1.2946
1294.6
129.46
70478.9727 79230.9432 38445.7845 27398.4954 20349.0189 65520.9974 176406.851 2087415.73
1.2946
1.2946
12.946
15402.5749 37933.0061 22469.8616 9686.7124 2120167.23 267327.631 50846.3534 13907.3872
1.2946
1294.6
129.46
1.2946
1.2946
12.946
12.85
128.5
12.85
128.5
421.85
42.185
421.85
42.185
12.85
128.5
12.85
165
16.5
165
16.5
42.185
421.85
42.185
129.46
12.946
129.46
12.946
16.5
165
16.5
165
116
12.946
129.46
12.946
129.46
1.285
1285
1.285
1285
4218.5
4.2185
4218.5
4.2185
1.285
1285
1.285
1650
1.65
1650
1.65
4.2185
4218.5
4.2185
1294.6
1.2946
1294.6
1.2946
1.65
1650
1.65
1650
117
1.2946
1294.6
1.2946
1294.6
12.093
7.1347
12.593
0.22236
50.5747
58.1257
39.0445
45.5766
74.9692
59.1986
4.779
79.3376
103.1597
53.954
75.8023
110.3609
115.8088
4.5273
23.9989
27.2594
17.2672
28.9826
34.123
31.4931
6.1323
43.9087
97.2924
105.007
1.6944
137.96
160.1957
61.5765
75.5094
169.6454
208.0956
7.0614
58.0378
50.4209
20.4771
16.4034
67.5422
62.525
4.8273
58.8287
48.3863
13.1695
20.554
64.5413
144.8047
171.14
63.5925
58.2976
58.3036
21.3463
30.6125
61.9476
68.3057
7.3505
153.4864
175.4684
67.6889
56.311
50.7802
4.0249
32.07
78.3588
141.3112
165.5707
13.1011
50.3451
50.7007
39.056
43.5125
71.7432
71.8152
1.473
140.3982
155.051
102.9959
16.6072
31.7476
31.0757
1.9983
67.3075
75.9127
45.6588
11.6887
21.538
22.8761
-0.089907
74.1525
93.5309
38.3518
10.9765
36.9932
33.4203
3.4824
65.6184
85.7486
34.1174
117
12.1669
16.3904
4.5719
Calculated Endogenous Amount
114
115
116
42.4513
39.1186
24.0646
42.1623
47.3979
22.9992
7.2696
17.9033
10.6052
Table 3-4: Ligand-Specific EGFR Absolute Quantification. Integrated intensity for each iTRAQ reporter generated from endogenous transitions for key EGFR C-terminal phosphorylation sites (columns E-H). Integrated iTRAQ reporter ion intensity from standard peptide transitions (columns I-L). Standard peptide concentration was determined by amino acid analysis and a correction factor was applied to the nominal doses included (column M). Actual standard dose included in each channel following AAA correction (columns N-Q). The standard curve built from the standard amount (columns N-Q) and the standard intensities (columns I-L) was used to calculate the endogenous amount of the corresponding peptide in each sample (columns R-U) based on the endogenous intensities (columns E-H). Prior to MS analysis, the number of cells contained in the sample was calculated with a BCA assay; standard cell number stock lysate solutions were used to calibrate the unknown samples (columns V-Y). Total amount of endogenous peptide (columns R-U) was converted to moles and divided by the corresponding cell number (columns V-Y) to produce molecules/cell (columns Z-AC). The mean and range of two biological replicates are reported (columns AG-AK) for each ligand. To mitigate contamination from standard peptide iTRAQ signals, several precautions were taken. First, standard doses and iTRAQ channels were varied between replicates. Second, four time-zero replicates were analyzed separately from other samples, because this condition was anticipated to have low signal. In addition, the biological replicate in the channel containing the 1 pmol standard dose was excluded, because significant residual signal from the standard peptide was visible. Table continued on next page.
Protein
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
EGFR
Site
1045
1045
1045
1045
1045
1045
1045
1045
1045
1068
1068
1068
1068
1068
1068
1068
1068
1068
1148
1148
1148
1148
1148
1148
1148
1148
1148
1173
1173
1173
1173
1173
1173
1173
1173
1173
Sequence
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
ySSDPTGALTEDSIDDTFLPVPEYINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
YSSDPTGALTEDSIDDTFLPVPEyINQSVPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
GSTAENAEyLR
Run
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1-4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1-4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1-4
TGFa20_2F
TGFa20_2R
AREG20_2F
AREG20_2R
AREG20_2R_reip
AREG20_3R
EGF20_2F
EGF20_2R
T0_R1-4
Cell #
114
2430607.13
1941190.6
2768161.54
1913155.23
1913155.23
2047901.77
2242008.58
2377156.53
1751948.56
2430607.13
1941190.6
2768161.54
1913155.23
1913155.23
2047901.77
2242008.58
2377156.53
1751948.56
2430607.13
1941190.6
2768161.54
1913155.23
1913155.23
2047901.77
2242008.58
2377156.53
1751948.56
2430607.13
1941190.6
2768161.54
1913155.23
1913155.23
2047901.77
2242008.58
2377156.53
1751948.56
115
2558364.79
2351055.99
2817372.54
2275446.07
2275446.07
2210061.58
2539886.44
2695250.91
1815076.03
2558364.79
2351055.99
2817372.54
2275446.07
2275446.07
2210061.58
2539886.44
2695250.91
1815076.03
2558364.79
2351055.99
2817372.54
2275446.07
2275446.07
2210061.58
2539886.44
2695250.91
1815076.03
2558364.79
2351055.99
2817372.54
2275446.07
2275446.07
2210061.58
2539886.44
2695250.91
1815076.03
116
2832764.7
2698966.46
2487789.83
2658963.53
2658963.53
2435380.15
2602290.09
2644552.07
1706013.03
2832764.7
2698966.46
2487789.83
2658963.53
2658963.53
2435380.15
2602290.09
2644552.07
1706013.03
2832764.7
2698966.46
2487789.83
2658963.53
2658963.53
2435380.15
2602290.09
2644552.07
1706013.03
2832764.7
2698966.46
2487789.83
2658963.53
2658963.53
2435380.15
2602290.09
2644552.07
1706013.03
117
2768050.32
2604745.79
2380362.89
2348436.39
2348436.39
2059384.46
2623107.06
2578791.28
1751444.42
2768050.32
2604745.79
2380362.89
2348436.39
2348436.39
2059384.46
2623107.06
2578791.28
1751444.42
2768050.32
2604745.79
2380362.89
2348436.39
2348436.39
2059384.46
2623107.06
2578791.28
1751444.42
2768050.32
2604745.79
2380362.89
2348436.39
2348436.39
2059384.46
2623107.06
2578791.28
1751444.42
21344.2006
33493.3645
36981.1808
4345.1966
11846.532
12982.1755
8345.26201
11852.396
17004.4635
16040.3435
488.544824
33036.6165
39701.6074
22007.573
4523.64518
7524.7676
6940.93871
662.769261
15837.8958
19437.8379
9756.11043
3535.03008
1637.4053
2939.74393
76.4287569
10999.0664
13433.8144
9874.45615
13322.968
17205.3437
13819.481
1642.62021
17254.4678
23841.9195
13645.1077
22158.5554
25327.6974
27034.719
1556.10681
5219.31906
6300.09994
4366.92003
3713.7955 7429.42211
18665.1184
39244.8679
47370.4234
2491.75284
12333.8008
11246.2983
4955.08665
10853.7624
22507.1083
23903.5619
597.902117
29318.3264
35731.3856
14900.3957
2889.32199
4982.48679
5207.46494
-31.72544
15758.3879
20861.9124
9280.43972
Molecules/cell
114
115
116
117
10514.1149 9204.86294 5114.04608 2646.07682
13075.3284 12136.4765 5129.93348 3788.09358
1580.94069 3825.47443 2566.26597 1156.24547
3226.64548
9933.01567
8463.48161
1196.61321
16252.0205
26592.2661
7419.60848
8998.83249
16633.5024
17297.9906
2525.75965
38014.7049
54416.0767
14720.4985
18972.5226
38881.3986
43340.1329
21851.4891
14438.8432
18081.0515
4642.24087
6467.59227 8484.55176
15796.049 13346.7471 15624.8546 7831.18857
12253.5274 11342.0536 14233.0531 7351.83431
4525.26927 1334.92468 1703.40704 2107.77148
Mean
Range
Mean
Range
Mean
Range
2797.68542
1394.54489
2797.68542
1394.54489
2797.68542
1394.54489
909.689049
577.037692
909.689049
577.037692
909.689049
577.037692
Final Quantification (molecules/cell)
0
TGFα
235.824193
347.24735
AREG
235.824193
347.24735
235.824193
347.24735
16259.9473
1821.10413
4642.24087
912.675703
14024.7882
1771.26081
46215.3908
8200.68588
16846.5106
2126.01203
41110.7657
2229.36714
21422.1433
5170.12284
8209.22049
789.612005
16965.7465
332.244099
1
11794.7216
1280.60675
2403.79308
822.852398
9198.24864
734.767031
12414.3537
567.821746
8414.90689
69.6448773
12344.4004
1002.34674
36369.112
3332.49543
21675.8868
331.686214
35237.2727
1743.90814
17637.8669
1799.97105
10804.2532
1048.14278
16522.4035
482.060041
5
10670.6697
1465.80678
4174.5598
349.085375
7232.85315
291.914447
11790.0495
543.751286
4334.44108
620.645574
14928.9539
695.900782
32524.856
3206.52963
16782.7571
1882.36135
43307.6456
4062.77774
18310.1502
2551.76223
10067.101
786.661327
23205.3351
698.226793
10
5121.98978
7.94370005
2727.79398
161.528007
5094.97587
112.489076
5759.7095
540.390439
5898.17107
1531.25104
7591.51144
239.67713
20548.1936
3293.72582
17901.8316
4256.72386
26181.2082
853.5108
12216.4404
1217.37395
11598.7121
1724.25592
15512.4124
1692.93131
30
3217.0852
571.008376
2345.63777
1189.39231
2288.57461
651.169313
TGFα
AREG
EGF
TGFα
AREG
EGF
TGFα
AREG
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
1715.36773
386.423397
1715.36773
386.423397
1715.36773
386.423397
EGF
Mean
Range
Mean
Range
Mean
Range
EGF
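The conversion described in the Table 3-4 caption (standard curve, then endogenous amount, then molecules/cell) can be sketched as below. This is a minimal illustration, not the analysis code used for the table: the through-origin least-squares fit, the function names, and the standard-curve numbers are assumptions. The final call uses one amount/cell-number pair read from the flattened table (42.4513 and 2430607.13), with the amount assumed to be in fmol; the result lands within about 0.1% of the tabulated 10514.1149 molecules/cell.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def fit_slope_through_origin(amounts_fmol, intensities):
    """Least-squares slope (intensity per fmol) of a line forced through zero.

    Assumption: the thesis does not state the form of its standard curve.
    """
    numerator = sum(a * i for a, i in zip(amounts_fmol, intensities))
    denominator = sum(a * a for a in amounts_fmol)
    return numerator / denominator


def molecules_per_cell(amount_fmol, cell_count):
    """Convert an amount in fmol to molecules per cell."""
    moles = amount_fmol * 1e-15
    return moles * AVOGADRO / cell_count


# Hypothetical standard curve: four doses and their reporter intensities.
std_amounts = [1.0, 10.0, 100.0, 1000.0]
std_intensities = [1.5e3, 1.5e4, 1.5e5, 1.5e6]
slope = fit_slope_through_origin(std_amounts, std_intensities)

# Hypothetical endogenous reporter intensity read against that curve.
endogenous_fmol = 6.0e4 / slope

# One value pair taken from the table (amount assumed to be fmol).
table_value = molecules_per_cell(42.4513, 2430607.13)
print(round(endogenous_fmol, 3), round(table_value))
```

The small gap from the tabulated value is consistent with a rounded Avogadro constant having been used in the original spreadsheet, although the source does not say so.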
3.7 Bibliography
1. Mann, M., Kulak, N. A., Nagaraj, N. & Cox, J. The Coming Age of Complete, Accurate, and Ubiquitous Proteomes. Molecular Cell 49, 583–590 (2013).
2. Liang, S. et al. Quantitative proteomics for cancer biomarker discovery. Comb. Chem. High Throughput Screen. 15, 221–231 (2012).
3. Wasinger, V. C., Zeng, M. & Yau, Y. Current Status and Advances in Quantitative Proteomic Mass Spectrometry. International Journal of Proteomics 2013, 1–12 (2013).
4. Zhang, X., Gureasko, J., Shen, K., Cole, P. A. & Kuriyan, J. An Allosteric Mechanism for Activation of the Kinase Domain of Epidermal Growth Factor Receptor. Cell 125, 1137–1149 (2006).
5. Naegle, K. M., Welsch, R. E., Yaffe, M. B., White, F. M. & Lauffenburger, D. A. MCAM: Multiple Clustering Analysis Methodology for Deriving Hypotheses and Insights from High-Throughput Proteomic Datasets. PLoS Comput Biol 7, e1002119 (2011).
6. Wolf-Yadlin, A. et al. Effects of HER2 overexpression on cell signaling networks governing proliferation and migration. Mol Syst Biol 2, (2006).
7. Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U.S.A. 100, 6940–6945 (2003).
8. Zhang, Y. & White, F. M. Quantitative Proteomic Analysis of Phosphotyrosine-Mediated Cellular Signaling Networks. Methods in Molecular Biology 1–10 (2007).
9. Voldborg, B. R., Damstrup, L., Spang-Thomsen, M. & Poulsen, H. S. Epidermal growth factor receptor (EGFR) and EGFR mutations, function and possible role in clinical trials. 1–10 (1997).
10. Lee, H. C. et al. Activation of Epidermal Growth Factor Receptor and Its Downstream Signaling Pathway by Nitric Oxide in Response to Ionizing Radiation. Mol. Cancer Res. 6, 996–1002 (2008).
11. Nitta, M. et al. Targeting EGFR Induced Oxidative Stress by PARP1 Inhibition in Glioblastoma Therapy. PLoS ONE 5, e10767 (2010).
12. Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation. Pharmacol. Res. 66, 105–143 (2012).
13. Oksvold, M. P. et al. Serine mutations that abrogate ligand-induced ubiquitination and internalization of the EGF receptor do not affect c-Cbl association with the receptor. Oncogene 22, 8509–8518 (2003).
14. Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting of the Receptor. Traffic 10, 1115–1127 (2009).
15. Chung, E., Graves-Deal, R., Franklin, J. L. & Coffey, R. J. Differential effects of amphiregulin and TGF-α on the morphology of MDCK cells. Experimental Cell Research 309, 149–160 (2005).
16. Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
17. Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579 (2010).
18. Guo, L. et al. Studies of ligand-induced site-specific phosphorylation of epidermal growth factor receptor. J. Am. Soc. Mass Spectrom. 14, 1022–1031 (2003).
19. Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 439, 168–174 (2005).
20. Tinti, M. et al. The SH2 domain interaction landscape. Cell Rep 3, 1293–1305 (2013).
21. Arneja, A., Johnson, H., Gabrovsek, L., Lauffenburger, D. A. & White, F. M. Qualitatively Different T Cell Phenotypic Responses to IL-2 versus IL-15 Are Unified by Identical Dependences on Receptor Signal Strength and Duration. The Journal of Immunology 192, 123–135 (2013).
4 EGFR Signaling Pathway Downstream of Ligands
4.1 Introduction
EGFR is of particular interest to cancer researchers because it sits atop a
regulatory network that ultimately controls numerous phenotypes often co-opted in
cancer. With such influence over key cellular phenotypes, it is not surprising that many
cancers (notably breast and lung) harbor overexpression or mutations of EGFR or one of
its cognate family members (typically HER2) [1,2]. EGFR is the canonical member of the
receptor tyrosine kinase (RTK) class, but the EGFR family has since grown to include
three other family members (HER2-4), which collectively recognize a variety of small
protein ligands [3]. Each of these ligands binds its cognate receptor in a similar fashion,
yet induces different phenotypic responses from cells [4,5].
In the case of HER-family members, ligand binding stabilizes an “open” configuration of
the extracellular domain, facilitating dimerization mediated by interactions between the
participating receptors [6,7]. Close proximity of two kinase domains permits
trans-phosphorylation of key tyrosine residues [6,7], which complete binding motifs for
proteins containing SH2- or PTB-domains [5,8]. The first class of proteins known to bind
phosphorylated EGFR-family members has no intrinsic enzymatic activity. Instead, they
consist of various binding domains (SH2, PTB, WW, PH, etc.) as well as their own sites
of phosphorylation, allowing them to serve as adaptors that provide indirect receptor
binding [9-13]. The second class of proteins is intracellular kinases, which are also recruited
through binding domains. Once recruited into the proximity of their target proteins, they
catalyze their own set of phosphorylation events. The third class is intracellular
phosphatases, which are responsible for reversing phosphorylation [6].
Together, the balance and abundance of these recruited proteins help to decode and
propagate signals that will ultimately lead to phenotypic changes in the cell. The exact
mechanism for this decoding is not yet known. One hypothesis is that ligands must elicit
structural changes that affect the phosphorylation efficiency of particular sites, leading to
a bias in recruited proteins. Some evidence exists for ligand-specific structural
differences in receptor dimers; Sheck et al. report alterations in the juxtamembrane
portion of EGFR dimers between TGFα- and EGF-treated conditions [14]. In addition,
Alvarado et al. show that ligand affinity biases the Drosophila EGFR cognate towards
singly- or doubly-bound receptor dimers, altering its structure [15], with the implicit assumption
that these differences will contribute to altered downstream signaling. In this work, the
authors also mention that human EGFR dimer asymmetries have not been observed [15].
Site-specific phosphorylation is a natural modulation mechanism to explain the
differences in information flow across the cellular membrane. Previous studies have
shown ligand-specific changes in phosphorylation following stimulation, which were
used to explain differential recruitment of SH2- and PTB-domains [4]. Roepstorff et al.
argue that increased Y1045 phosphorylation drives increased c-Cbl recruitment following
EGF stimulation, leading to elevated rates of receptor degradation. At a molecular level,
the parameters that are available for modulation are: rate of phosphorylation, affinity for
the completed motif, and protein abundance. At low motif abundance, high affinity/high
abundance adaptor proteins are more capable of binding. It is not until the motif
abundance increases that the low affinity/low abundance binding proteins will be able
to dock in appreciable quantities. This dynamic will play out at each docking site on the
receptor as well as secondary sites on recruited proteins.
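The competition described above can be illustrated with a simple equilibrium sketch. It assumes 1:1 mass-action binding of two adaptors to a single shared motif, with free motif found by bisection; the adaptor totals, dissociation constants, and motif abundances are hypothetical values in arbitrary concentration units and are not taken from this work.

```python
def free_motif(motif_total, adaptors):
    """Solve for the free motif concentration by bisection.

    adaptors: list of (total_concentration, kd) pairs, one per adaptor.
    Conservation: motif_total = free + sum(total * free / (kd + free)).
    """
    def accounted(free):
        return free + sum(t * free / (kd + free) for t, kd in adaptors)

    lo, hi = 0.0, motif_total
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if accounted(mid) > motif_total:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


def bound_amounts(motif_total, adaptors):
    """Amount of each adaptor bound to motif at equilibrium."""
    free = free_motif(motif_total, adaptors)
    return [t * free / (kd + free) for t, kd in adaptors]


# Hypothetical adaptors: (total, Kd).
# First: high affinity / high abundance. Second: low affinity / low abundance.
adaptors = [(100.0, 1.0), (10.0, 100.0)]

low_a, low_b = bound_amounts(10.0, adaptors)      # scarce phosphorylated motif
high_a, high_b = bound_amounts(1000.0, adaptors)  # abundant phosphorylated motif

print(f"low motif:  A bound {low_a:.2f}, B bound {low_b:.3f}")
print(f"high motif: A bound {high_a:.2f}, B bound {high_b:.2f}")
```

Under these assumed parameters the low affinity adaptor is essentially unbound when the motif is scarce and mostly bound once the motif is in excess, which is the qualitative behavior the paragraph describes.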
To better understand how these signals are conducted in the context of EGFR signaling,
we have chosen to focus on ligands known to bind specifically to EGFR and not to other
HER-family members. In this case, receptor phosphorylation will be limited to EGFR
allowing levels to be modulated solely by ligand dose and identity. This leads to two
hypotheses about patterns of receptor phosphorylation. The first is a pattern modulation
scheme, where the ligand identity and dose information is encoded in a specific pattern of
phosphorylation levels at sites on the receptor. This “barcode” is then read by the
appropriate subset of binding proteins initiating the proper downstream programs. The
second is an amplitude modulation scheme where the relative level of specific
phosphorylation sites is roughly constant, but the overall magnitude is scaled to convey
information. Under this scheme a level of control is lost as all sites move together. In
order to decide between these mechanisms and make quantitative predictions about
downstream phenotypes, we have employed site-specific absolute quantification [cite:
MARQUIS] to investigate phosphorylation levels at specific sites on EGFR, a collection
of adaptor proteins, as well as a selection of downstream activation signals including the
ERKs.
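One way to make the two hypotheses concrete is to split a vector of site-specific phosphorylation levels into an overall magnitude and a fractional pattern. The sketch below is only an illustration of that idea: the decomposition by total signal, the function names, and all site values are assumptions, not measurements or methods from this work.

```python
def split_pattern_and_amplitude(levels):
    """Return (fraction of total per site, total signal)."""
    total = sum(levels.values())
    pattern = {site: value / total for site, value in levels.items()}
    return pattern, total


def max_pattern_difference(levels_a, levels_b):
    """Largest per-site difference between two fractional patterns."""
    pattern_a, _ = split_pattern_and_amplitude(levels_a)
    pattern_b, _ = split_pattern_and_amplitude(levels_b)
    return max(abs(pattern_a[s] - pattern_b[s]) for s in pattern_a)


# Hypothetical molecules/cell (x10^3) at four EGFR sites.
reference = {"Y1045": 10.0, "Y1068": 20.0, "Y1148": 40.0, "Y1173": 30.0}
# Amplitude modulation: every site scaled by the same factor.
scaled = {"Y1045": 5.0, "Y1068": 10.0, "Y1148": 20.0, "Y1173": 15.0}
# Pattern modulation: same total, different distribution across sites.
reshuffled = {"Y1045": 10.0, "Y1068": 40.0, "Y1148": 20.0, "Y1173": 30.0}

amplitude_ratio = (split_pattern_and_amplitude(scaled)[1]
                   / split_pattern_and_amplitude(reference)[1])
diff_scaled = max_pattern_difference(reference, scaled)
diff_reshuffled = max_pattern_difference(reference, reshuffled)
print(amplitude_ratio, diff_scaled, diff_reshuffled)
```

A pure amplitude change leaves the fractional pattern unchanged and shows up only in the total, whereas a pattern change alters the fractions even when the total is constant.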
4.2 Methods
4.2.1 Migration Assay:
Cells were serum starved for 24 hours prior to trypsinization and rinsing with SFM. 50µl
of SFM containing the desired inhibitor concentration was added to the upper reservoir of
the migration chamber. 1e5 cells in 50µl of SFM were added to the upper reservoir of the
migration chambers. 100µl of SFM containing the desired dose of inhibitor was added to
the bottom reservoir. A further 100µl of Assay Media (SFM + 2x Insulin to 0.4x serum) [16]
supplemented with the desired dose of ligand was added to the lower reservoir of each
migration chamber prior to 24 hours of incubation. Lower reservoir media was removed and
replaced with 95% ethanol for 10 minutes. Upper reservoir media was aspirated halfway
through the ethanol fixation. Membranes were washed with PBS, stained with crystal
violet for 15 minutes, and washed with MilliQ water. Residual cell material in the upper
chamber was gently removed with lint-free paper and membranes were allowed to dry.
The intensity of each membrane was quantified with ImageJ.
4.2.2 Proliferation Assay:
1e4 cells in 100µl were plated into each well of a 96-well plate. After 24 hours, they were
serum-starved with 100µl of SFM for an additional 24 hours. Media was exchanged for
SFM containing BRDU and the desired concentration of ligand, and cells were allowed to
incubate for 24 hours. BRDU incorporation was quantified following manufacturer instructions
(Roche, 11647229001).
4.2.3 Mass Spectrometry
Mass Spectrometry Sample Preparation: MCF10A cells were serum-starved for 24 hours
prior to stimulation with the indicated dose of EGFR ligand (purchased from Peprotech) for
1, 5, 10, 30, 120 or 480 minutes. Media was aspirated and cells were lysed with 1ml of
cold 8M urea with 1mM activated sodium orthovanadate. Lysate was frozen at -80 °C until
further processing. Protein yield was quantified by BCA assay (Pierce) in order to ensure
equal loading. Two technical replicates (250µl) from each of two biological replicates
were prepared. Standard peptides were added to samples as in Supplementary Table XX.
Samples were reduced with 10µl 10mM DTT in 100 mM ammonium acetate pH 8.9 for
one hour at 56 °C. Samples were alkylated with 75µl of 55mM iodoacetamide in 100 mM
ammonium acetate pH 8.9 for 1 hour at room temperature. 1ml of 100 mM ammonium
acetate and 10µg of sequencing grade trypsin were added prior to 16 hours of incubation at
room temperature. Lysates were acidified with 125µl of trifluoroacetic acid (TFA) and
peptides were purified with C18 spin columns (ProteaBio, SP-150). Samples were
lyophilized and subsequently labeled with iTRAQ 8plex (AbSciex) per manufacturer’s
directions.
iTRAQ Channel    Sample                              Standards (nominal)
113              Condition 1, BioRep 1, TechRep 1    3 pmol
114              Condition 2, BioRep 1, TechRep 1    1 pmol
115              Condition 1, BioRep 2, TechRep 1    300 fmol
116              Condition 2, BioRep 2, TechRep 1    100 fmol
117              Condition 1, BioRep 1, TechRep 2    30 fmol
118              Condition 2, BioRep 1, TechRep 2    10 fmol
119              Condition 1, BioRep 2, TechRep 2    3 fmol
Table 4-1: iTRAQ labeling scheme. Two biological and two technical replicates of each sample were included in each analysis along with the indicated nominal doses of heavy-labeled standard peptide.
Standard Peptide Preparation and Quality Assurance: Standard peptides were
synthesized at the Koch Institute Biopolymers facility and amino acid analysis (AAA)
was performed to measure concentration by AAA Service Laboratories (Damascus, OR).
Synthetic peptides were analyzed by LC MS/MS to confirm sequence and purity.
Immunoprecipitation: 70µl protein G agarose beads (Calbiochem IP08) were rinsed in
400µl IP Buffer (100mM Tris, 0.3% NP-40, pH 7.4) prior to an 8 hour incubation with a
cocktail of three phosphotyrosine-specific antibodies (12µg 4G10 (Millipore), 12µg PT66
(Sigma), and 12µg PY100 (CST)) in 200ml IP Buffer. Antibody cocktail was removed
and beads were rinsed with 400µl of IP Buffer. Labeled samples were resuspended in
150µl iTRAQ IP Buffer (100mM Tris, 1% NP-40, pH 7.4) + 300µl MilliQ water and pH
was adjusted to 7.4 (with 0.5M Tris HCl pH 8.5) prior to addition to prepared beads and
a 16 hour incubation. Supernatant was removed and beads were rinsed 3 times with 400µl
Rinse Buffer (100mM Tris HCl, pH 7.4). Peptides were eluted in 70µl of Elution Buffer
(100mM glycine, pH 2) for 30 minutes at room temperature.
Immobilized Metal Affinity Chromatography (IMAC) Purification: A fused silica
capillary (FSC) column (200µm ID x 10cm length) was packed with POROS 20MC
beads (Applied Biosystems cat.no. 1-5429-06). Column was rinsed with 100 mM EDTA
pH 8.9 in ultrapure water to remove residual iron and then rinsed with ultrapure water to
remove EDTA before fresh iron was deposited by rinsing with 100 mM FeCl3 in
ultrapure water. Excess iron was removed with 0.1% acetic acid prior to loading the IP
elution. The column was washed with 25% MeCN, 1% HOAc, 100 mM NaCl in
ultrapure water to remove non-specifically bound peptides and then rinsed with 0.1%
HOAc. Peptides were eluted with 50µl 250mM NaH2PO4 in ultrapure water and collected
in an autosampler vial. Eluent was acidified with 2µl of 10% TFA prior to loading.
Liquid Chromatography Mass Spectrometry: Acidified eluent was loaded onto an
Acclaim PepMap 100 precolumn (Thermo Scientific) using an EASY-nLC 1000 (Thermo
Scientific). Peptides were analyzed on a one hour gradient from 100% A (0.1% Formic
Acid) to 100% B (0.1% Formic Acid, 80% Acetonitrile) spraying through a 50cm
analytical column (Manufacturer, PN) packed with 3µm beads (Manufacturer, PN).
Absolute Quantification (Full Scan): Analyses were performed on a Thermo Q Exactive.
Endogenous and heavy-labeled versions of each peptide were targeted for MS2
analysis during their expected elution times. In the MS1 scan, the total amount of
endogenous peptide was calculated from the ratio of its MS1 integrated intensity
to that of the standard peptide. This amount was then apportioned across the
contributing samples based on relative iTRAQ intensity. The iTRAQ intensity ratios
were taken from a single MS2 scan, chosen at maximal MS1 intensity to ensure
a maximal signal-to-noise ratio.
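The two-step calculation in the preceding paragraph can be sketched as follows. This is an illustrative reading of the description rather than the analysis code used here: the function name, the channel labels, and every number are hypothetical, and the total is assumed to scale linearly with the endogenous-to-standard MS1 intensity ratio.

```python
def apportion_endogenous(standard_fmol, ms1_endogenous, ms1_standard,
                         reporter_intensities):
    """Estimate per-sample endogenous amounts (fmol).

    Step 1: total endogenous amount from the MS1 ratio to the standard.
    Step 2: split that total by each channel's share of reporter signal.
    """
    total_fmol = standard_fmol * ms1_endogenous / ms1_standard
    reporter_sum = sum(reporter_intensities.values())
    return {channel: total_fmol * intensity / reporter_sum
            for channel, intensity in reporter_intensities.items()}


# Hypothetical inputs: 100 fmol of standard summed over all channels,
# MS1 integrated intensities, and reporter intensities from one MS2 scan.
per_sample = apportion_endogenous(
    standard_fmol=100.0,
    ms1_endogenous=2.0e6,
    ms1_standard=1.0e6,
    reporter_intensities={"113": 1.0e4, "114": 3.0e4,
                          "115": 2.0e4, "116": 4.0e4},
)
print(per_sample)
```

Here the endogenous peptide is twice as intense as the standard in MS1, so 200 fmol is distributed across the channels in proportion to their reporter signal.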
4.2.3.1 Partial Least Squares Regression (PLSR)
PLSR was performed using the plsregress function provided in the MATLAB software
package.
4.3 Results
4.3.1 MCF10A 24-hour Phenotypes
To understand the different proliferation-inducing capacities of three EGFR ligands
(TGFα, AREG, and EGF), MCF10A cells were subjected to a 24-hour BRDU
incorporation assay (Figure 4-1a). A wide range of ligand doses was used to explore the full
range of responses possible for the system, but saturation was observed for all ligands at
5nM. At doses below this threshold, EGF shows a slight increase over TGFα, but AREG
lags significantly behind. Moving forward, we selected 1nM as a sub-saturating ligand
dose and 20nM as a super-saturating ligand dose.
MCF10A cells were subjected to a transwell migration assay in the presence of the
sub-saturating and super-saturating concentrations of each EGFR ligand (Figure 4-1b).
Surprisingly, the responses were more varied than the proliferation data would suggest.
First, all three ligands were able to meet or exceed the measured migration of the positive
control (growth media), which was not the case for proliferation. TGFα showed identical
migratory capacity at both the low and high dose, and AREG only induced significant
migration at the high dose. Interestingly, EGF induced a greater migratory signal at the
low dose than at the high dose, suggesting a saturation-dependent downregulation of
signaling events not present in the responses to other ligands.
[Figure 4-1 plots. Panel a: BRDU signal versus ligand dose (SFM and CM controls plus a range of nM doses) for TGFα, AREG, and EGF. Panel b: transwell migration at AM, 1nM, 20nM, and CM for TGFα, AREG, and EGF.]
Figure 4-1: Ligand-Specific Phenotypic Response. a) 24-hour BRDU incorporation. b) 24-hour transwell migration.
4.3.2 LC-MS Absolute Quantification
EGFR-ligand treatment induces tyrosine phosphorylation of the receptor and many
downstream proteins. Absolute site-specific phosphorylation was measured at sites on
EGFR, key adaptor proteins such as Gab1 and Shc, as well as downstream network nodes
such as the ERKs. These data were gathered using the MARQUIS method (introduced in
Chapter 3) with heavy-labeled synthetic peptide standards produced for each
phosphorylation site of interest (Figure 4-2).
[Figure 4-2 workflow diagram. Samples 1-8; lyse cells and add heavy-labeled phosphopeptide library; reduce, alkylate, digest proteins and purify peptides; iTRAQ label, combine; phosphotyrosine immunoprecipitation and IMAC enrichment; targeted LC/MSMS analysis (MS and MS/MS).]
Figure 4-2: Full-Scan MARQUIS method for absolute quantification. Eight samples representing unique cellular stimulation conditions are lysed and a heavy-labeled peptide library is added prior to any sample processing. Samples are reduced, alkylated, and trypsin digested prior to C18 peptide purification. Samples are iTRAQ labeled and subjected to phosphotyrosine immunoprecipitation, IMAC enrichment, and targeted LC/MSMS.
4.3.3 EGFR Pattern of Phosphorylation
Previous observations indicated an interesting phenomenon regarding the C-terminal
phosphorylation sites of EGFR: the relative phosphorylation levels between the sites did
not appreciably change between ligands upon treatment with super-saturating
concentrations. These analyses were repeated with the full-scan MARQUIS method for
EGFR phosphorylation sites Tyr-1045, Tyr-1068, Tyr-1148, Tyr-1173, as well as the
additional Tyr-1045/Tyr-1068 dual phosphorylated peptide (Figure 4-3a). Confirming
previous observations, the pattern present at five minutes post ligand treatment
(Figure 4-3b) was maintained for all time points up to 30 minutes. In this study, time points at two
hours and eight hours post-treatment were included to capture longer dynamics of the
system. The ranked order of phosphorylation intensity remained constant throughout all
time points measured (Appendix Figure 4-7). These data suggest that at saturating doses
the identity of the ligand plays a smaller role in the pattern of phosphorylation. Instead,
the affinity of the ligand for the receptor serves to scale the magnitude of the signal
response.
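The comparison of patterns across ligands can be sketched as a median normalization followed by a rank-order check. This is a minimal illustration only: dividing each ligand's site levels by their median is an assumed reading of "median normalized", and the site values below are hypothetical rather than taken from the figures.

```python
from statistics import median


def median_normalize(levels):
    """Divide each site's level by the median across sites."""
    mid = median(levels.values())
    return {site: value / mid for site, value in levels.items()}


def rank_order(levels):
    """Sites sorted from most to least phosphorylated."""
    return sorted(levels, key=levels.get, reverse=True)


# Hypothetical molecules/cell (x10^3) for two ligands at 20nM.
ligand_a = {"pY1045": 8.0, "pY1068": 30.0, "pY1148": 60.0,
            "pY1173": 25.0, "pY1045pY1068": 2.0}
ligand_b = {"pY1045": 3.0, "pY1068": 11.0, "pY1148": 20.0,
            "pY1173": 9.0, "pY1045pY1068": 1.0}

norm_a = median_normalize(ligand_a)
norm_b = median_normalize(ligand_b)
same_ranking = rank_order(ligand_a) == rank_order(ligand_b)
print(norm_a, norm_b, same_ranking)
```

In this toy case the two ligands differ roughly threefold in magnitude but share the same ranking and similar normalized values, which is the kind of outcome an amplitude-scaled pattern would produce.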
[Figure 4-3 plots. Panel a: molecules/cell (x10^3) for EGFR pY1045, pY1068, pY1148, pY1173, and pY1045pY1068 after TGFα, AREG, or EGF at 20nM. Panel b: median normalized molecules/cell for the same sites and ligands at 20nM. Panel c: molecules/cell (x10^3) for the same sites at 0, 5, and 30 minutes of TGFα, AREG, or EGF at 1nM.]
Figure 4-3: Ligand-Specific EGFR Phosphorylation. a) 20nM ligand treatment for five minutes. b) Median normalized 20nM ligand treatment for five minutes. c) 1nM ligand treatment for five or 30 minutes.
In contrast, a sub-saturating (1nM) ligand treatment induced significant
pattern variation (Figure 4-3c). For a five-minute treatment, the phosphorylation pattern
closely mirrored that of the super-saturating ligand dose. However, at 30 minutes
post-stimulation a very different pattern emerged for all ligands. Tyr-1148 was no longer the
most abundant site of phosphorylation, as Tyr-1068 met or exceeded it for all three
ligands. This alteration in phosphorylation pattern indicates differential kinase and/or
phosphatase activity at the sub-saturating dose and implies differential recruitment of
adaptor proteins.
4.3.4 Adaptor Protein Patterns of Phosphorylation
At a super-saturating dose of ligand, differential phenotypes were still observed
(Figure 4-1), but those differences were not readily apparent in the phosphorylation pattern on the
receptor itself. Even if the receptor is presenting nearly identical patterns, the magnitude
of those patterns appears to have a significant effect on the downstream signaling network.
To test this hypothesis, site-specific phosphorylation on adaptor proteins implicated in
EGFR binding and signal transduction cascades was measured. The patterns of two such
adaptor proteins, Shc (Figure 4-4a) and Gab1 (Figure 4-4c), indicate that similar patterns
of EGFR phosphorylation following high dose stimulation can produce different patterns
of adaptor protein phosphorylation. In these cases, AREG presents a lower abundance
version of the EGFR pattern and elicits a very different stoichiometry of adaptor protein
phosphorylation.
At a sub-saturating dose, the EGFR phosphorylation pattern is not constant. Not
surprisingly, these differences also correspond to altered phosphorylation patterns on Shc
(Figure 4-4b) and Gab1 (Figure 4-4d). What is more interesting is the comparison
between the patterns elicited by the super-saturating and sub-saturating doses. In the case of Shc,
the super-saturating dose and the sub-saturating dose patterns are quite similar with all
signals being cut roughly in half. Gab1, on the other hand, has almost all of its
phosphorylation sites decreased by 5-10 fold. Perhaps this is due to a higher affinity of
the Shc SH2-domains for EGFR phosphorylation sites. Under this scenario, Shc would
bind more efficiently even when the concentration of EGFR phosphorylated motifs is
lower. Only when EGFR phosphorylation levels are high enough do motifs become
available for a lower affinity binder.
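
The affinity argument above can be made concrete with a simple equilibrium occupancy calculation. This is only a sketch under strong assumptions: independent 1:1 binding with occupancy c / (Kd + c), no competition between binders, and hypothetical unitless values for the dissociation constants and motif levels. None of these numbers come from the measurements in this chapter.

```python
def fraction_bound(motif_conc, kd):
    """Equilibrium occupancy for simple 1:1 binding to a single class of sites."""
    return motif_conc / (kd + motif_conc)

# Hypothetical, unitless values chosen only to illustrate the trend.
KD_HIGH_AFFINITY = 1.0    # stand-in for a tight binder
KD_LOW_AFFINITY = 50.0    # stand-in for a weaker binder
HIGH_DOSE_MOTIFS = 50.0   # phosphorylated motif level at a saturating ligand dose
LOW_DOSE_MOTIFS = 5.0     # ten-fold fewer motifs at a sub-saturating dose

def fold_drop(kd):
    """Occupancy at the high motif level divided by occupancy at the low level."""
    return fraction_bound(HIGH_DOSE_MOTIFS, kd) / fraction_bound(LOW_DOSE_MOTIFS, kd)

tight_drop = fold_drop(KD_HIGH_AFFINITY)
weak_drop = fold_drop(KD_LOW_AFFINITY)
print(f"tight binder fold drop: {tight_drop:.2f}")
print(f"weak binder fold drop:  {weak_drop:.2f}")
```

In this toy setting the tight binder is already near saturation at both motif levels and loses little occupancy, while the weak binder tracks the motif concentration almost proportionally, which is the qualitative behavior proposed for Shc versus Gab1.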
[Figure 4-4 (plots not reproduced). All panels: y-axis "Molecules/cell (x10^3)"; x-axis groups TGFα, AREG, and EGF at 20nM (a, c) or 1nM (b, d). Series in a and b: Shc pY240, Shc pY239pY240, Shc pY317. Series in c and d: Gab1 pY259, Gab1 pY373, Gab1 pY637, Gab1 pY659.]
Figure 4-4: Ligand-Specific Adaptor Phosphorylation. Site-specific Shc phosphorylation following 20nM (a) or 1nM (b) ligand treatment for ten minutes. Site-specific Gab1 phosphorylation following 20nM (c) or 1nM (d) ligand treatment for ten minutes.

4.3.5 PLSR Predicts TGFα Phenotypic Responses from AREG and EGF Data

Using the phosphorylation data as inputs and the phenotypic response data as outputs, we
constructed a PLSR model from AREG and EGF treatment data that was then used to predict the
phenotypic responses of cells treated with TGFα. A one-component PLSR model was
able to explain 66% of the output variance in the AREG and EGF data (Figure 4-5a). This
model was then fed the phosphorylation data of TGFα and asked to predict the
phenotypic output (Figure 4-5b). Despite the distinct phenotypic behavior profile of TGFα, this simple
model was able to predict outputs with comparable accuracy to the training set data.
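
The train-on-two-ligands, predict-the-third workflow can be sketched with a one-component partial least squares fit. The code below is a simplified illustration, not the model used in this work: it fits a single response at a time (PLS1), whereas the models here predict migration and proliferation together; it assumes only mean-centering as preprocessing; and the toy data and function names are hypothetical.

```python
import math

def fit_one_component_pls(X, y):
    """Fit a one-component PLS1 model (single response) to rows X and targets y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in X with maximal covariance with y.
    # (Assumes X and y are not perfectly uncorrelated, so the norm is non-zero.)
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto w; q regresses y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, row):
    """Predict the response for one new sample."""
    score = sum((row[j] - model["x_mean"][j]) * model["w"][j] for j in range(len(row)))
    return model["y_mean"] + model["q"] * score

def percent_variance_explained(model, X, y):
    """Percent of the variance in y captured by the fitted model."""
    y_mean = sum(y) / len(y)
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    ss_res = sum((v - predict(model, row)) ** 2 for row, v in zip(X, y))
    return 100.0 * (1.0 - ss_res / ss_tot)

# Toy training set (stand-in for two ligands) and one held-out sample
# (stand-in for the third ligand); the values are hypothetical.
train_X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
train_y = [1.0, 2.0, 3.0]
model = fit_one_component_pls(train_X, train_y)
held_out_prediction = predict(model, [4.0, 8.0])
explained = percent_variance_explained(model, train_X, train_y)
print(held_out_prediction, explained)
```

A single latent component suffices in this toy case because the inputs vary along one direction only; with real phosphorylation data the number of components would be chosen by comparing explained variance and prediction error, as in Figure 4-5a.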
[Figure 4-5 (plots not reproduced). Panel a: x-axis "Number of PLSR Components" (0 to 3); y-axes "Percent Variance Explained in Y" and "Total Squared Prediction Error". Panel b: "Predicted" vs. "Actual" values; series Migration, Proliferation, Fit Migration, Fit Proliferation.]
Figure 4-5: Partial Least Squares Regression Predicts TGFα-induced Migration and Proliferation. a) A one-component PLSR model is able to explain 66% of the observed variance (blue) and minimize the observed prediction error (red). b) Predicted vs. actual values for training set (gray) and test set (blue, red).

Analogous models to predict EGF and AREG phenotypic responses were constructed with
varying degrees of success (Appendix Figure 4-8). A model for EGF phenotypic response
was comparably effective at producing predictions; a one-component model explained
68% of the output variation and produced errors in line with those of the TGFα
predictions. The difference came when the AREG prediction model was examined. The
one-component model no longer provided the optimal prediction error, and the magnitude
of that error was 3-4 fold higher than for the other ligands. Upon closer inspection, the
low-dose AREG migration prediction deviates strongly from the measured value and drives
the larger observed error (Appendix Figure 4-8b,d). Including this point in the model building stage helps to
explore a different corner of the output space and increase the model fidelity in that
region. When it is excluded, the resulting model has trouble extrapolating that far beyond
the cases in the training set.
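
The leave-one-ligand-out evaluation described above can be sketched as a loop that holds out each ligand in turn and sums the squared prediction errors on it. To keep the example short, a one-feature least-squares line stands in for the PLSR model; that substitution, the ligand labels "A", "B", "C", and all data points are hypothetical assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def leave_one_ligand_out(data):
    """Return {ligand: total squared prediction error when that ligand is held out}."""
    errors = {}
    for held_out in data:
        train = [pt for ligand, pts in data.items() if ligand != held_out for pt in pts]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors[held_out] = sum((y - (slope * x + intercept)) ** 2 for x, y in data[held_out])
    return errors

# Hypothetical (signal, phenotype) pairs. Ligand "C" contains one point that
# lies far from the trend followed by every other condition.
data = {
    "A": [(1.0, 1.0), (2.0, 2.0)],
    "B": [(1.5, 1.5), (3.0, 3.0)],
    "C": [(2.5, 2.5), (0.2, 1.4)],
}
errors = leave_one_ligand_out(data)
for ligand, err in errors.items():
    print(ligand, round(err, 3))
```

In this toy data the largest error arises when the ligand containing the atypical point is held out, because no training condition samples that region of the output space.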
4.3.6 PLSR to Model Phenotypes Predicts Stronger ERK2 pY187 Contribution to Proliferation

A PLSR model built from the complete data set (Figure 4-6a) shows that ERK2 pTyr-187
was ranked more strongly in proliferation than migration (Appendix Figure 4-9a). This
site was also one of the most abundant phosphorylation sites observed in the absolute
quantification data set (Appendix Figure 4-9b). It is worth noting that other sites, such as
IRS2 Tyr-675 and Shc Tyr-239/240, were also strongly ranked in both of these
categories. The ERK2 Tyr-187 site was chosen because a very clean tool compound
(U0126) exists that specifically inhibits the upstream kinase (MEK) responsible for
phosphorylating this tyrosine residue. MEK is known to specifically phosphorylate
ERK1/2 [17,18], making treatment with U0126 a very directed modification to the cellular
state. Figure 4-6b shows the decrease in ERK phosphorylation when EGF stimulation
follows pretreatment with U0126. Doses as low as 1 µM show significant decreases in the
levels of ERK phosphorylation. The model ranking suggests that U0126 would have a more
pronounced effect on proliferation than on migration. Figure 4-6c shows the proliferation
in response to EGF treatment in the presence of increasing doses of U0126. The
proliferative activity of the cells drops rapidly and is nearly gone by 5 µM. In contrast,
Figure 4-6d shows the transwell migration under the same conditions. Migration levels
remain strong well beyond 5 µM U0126 and only disappear closer to 10 µM.
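
A dose at which a response is 90% inhibited (IC90) can be estimated from a dose series by interpolating between the two doses that bracket the target response. The sketch below assumes linear interpolation and a response normalized to the untreated condition; the function name and the response values are hypothetical and are not the measured BRDU or transwell data.

```python
def inhibitory_dose(doses, responses, fraction_inhibited=0.9):
    """Linearly interpolate the dose at which the response first falls to
    (1 - fraction_inhibited) of the untreated response (doses[0] is untreated)."""
    target = responses[0] * (1.0 - fraction_inhibited)
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 > target >= r1:
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None  # target inhibition not reached within the tested doses

# Hypothetical normalized responses at 0, 2, 4, 6, 8, and 10 uM inhibitor.
doses_um = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
proliferation = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0]
migration = [1.0, 0.9, 0.7, 0.5, 0.3, 0.0]

ic90_proliferation = inhibitory_dose(doses_um, proliferation)
ic90_migration = inhibitory_dose(doses_um, migration)
print(ic90_proliferation, ic90_migration)
```

A lower interpolated IC90 for one phenotype than for the other is the kind of sensitivity difference that the comparison of Figure 4-6c and 4-6d relies on; a fitted dose-response curve would be a more robust alternative when more doses are available.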
[Figure 4-6 (plots and blot not reproduced). Panel a: "Predicted" vs. "Actual" values for Migration and Proliferation. Panel b: blot of MAPK pTpY and Actin with 20nM EGF and U0126 doses from 100nM to 10μM. Panel c: "BRDU Signal" (Proliferation) and panel d: "Transwell Migration" for TGFα, AREG, and EGF at 20nM across U0126 doses (untreated, 2, 4, 6, 8, 10 μM).]
Figure 4-6: Proliferation is more sensitive to U0126 treatment than migration. a) Two-component PLSR model fit to complete data set. b) MAPK phosphorylation following 15 minute pre-treatment with indicated dose of U0126 and five minute treatment with 20nM EGF. c) 24-hour BRDU incorporation: IC90 ~5µM. d) 24-hour transwell migration: IC90 ~9µM.

4.4 Conclusion

The method utilized by EGFR to transmit signals about extracellular ligand concentration
to intracellular signaling components has been only partially explored. In particular,
presentation of ligand induces phosphorylation on key intracellular residues that form
docking sites for a host of adaptor proteins that perpetuate the signal and help to decide
which signaling pathways are activated. This leads to two possible mechanisms by which EGFR
uses its distinct phosphorylation sites to encode information that ultimately drives
measurable phenotypic outputs: pattern modulation and amplitude modulation. Pattern modulation of the intracellular phosphorylation
sites was a natural hypothesis, because it explains how stoichiometry between completed
binding motifs would instantiate differential downstream signaling. At low doses of
ligand stimulation, different patterns of phosphorylation are observed. These patterns
change throughout time and are ligand-dependent. EGF and TGFα are more similar to
each other than they are to AREG. Affinity differences alone are not sufficient to explain
these differences; otherwise one might expect a scaling phenomenon that could be
overcome by increasing AREG dose. Instead, a structural mechanism is required to
explain how the receptor is able to distinguish the identity of the ligand. Scheck et al.
have demonstrated such a mechanism between EGF and TGFα where juxtamembrane
helices position themselves differently based on the identity of the ligand [14].
The idea of ligand-specific phosphorylation patterns plays out nicely in the low-ligand
regime. However, when the ligand dose is increased a similar pattern emerges for all
three ligands. What differs is the amplitude of this pattern, and the identity of the ligand
appears less important except to determine affinity. Hence, the pattern is more strongly
presented with EGF and TGFα as these ligands have a higher affinity [5,9]. The high-ligand
regime aligns nicely with a structural hypothesis based on negative cooperativity between
ligand-binding sites in an EGFR homodimer. With a lower affinity ligand such as AREG,
fewer of the receptor dimers would be doubly bound with ligand [15], producing less
efficient signaling.
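
The negative-cooperativity argument can be illustrated with a sequential two-site binding calculation. This is a sketch only: it assumes a pre-formed dimer described by two macroscopic dissociation constants, with the second larger than the first to encode negative cooperativity, and all constants are hypothetical values in nM rather than measured affinities for EGF, TGFα, or AREG.

```python
def doubly_bound_fraction(ligand, kd1, kd2):
    """Fraction of dimers with both sites occupied under sequential binding,
    using macroscopic dissociation constants kd1 (first ligand) and kd2 (second)."""
    singly = ligand / kd1                      # [RL] relative to free dimer [R]
    doubly = ligand * ligand / (kd1 * kd2)     # [RL2] relative to free dimer [R]
    return doubly / (1.0 + singly + doubly)

# Hypothetical constants in nM; kd2 > kd1 encodes negative cooperativity.
LIGAND_NM = 20.0
high_affinity = doubly_bound_fraction(LIGAND_NM, kd1=1.0, kd2=10.0)
low_affinity = doubly_bound_fraction(LIGAND_NM, kd1=10.0, kd2=100.0)
print(f"high-affinity ligand: {high_affinity:.3f}")
print(f"low-affinity ligand:  {low_affinity:.3f}")
```

At the same ligand concentration, the lower-affinity ligand in this toy calculation leaves a much smaller share of dimers doubly bound, which matches the direction of the effect proposed for AREG.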
Pattern modulation at the receptor level with low ligand doses leads very naturally to
differential adaptor protein binding and subsequent phosphorylation. Yet, amplitude
modulation at the receptor level induced by high ligand doses is still able to produce
distinct patterns of phosphorylation at the adaptor protein level. Coupled with the reported
differential binding affinity of the SH2 domains of adaptor proteins for phosphorylated
EGFR [19], the hypothesis emerges that amplitude modulation of an identical pattern
may still be able to drive the observed pattern modulations of adaptor proteins. An SH2
domain with stronger affinity will have a stronger binding representation at a lower dose,
making it more capable of localization to the receptor. This competitive binding
explanation provides another axis of control in the system.
Decoding the information stored in site-specific phosphorylation is the next big
challenge. In this work, we have shown that testable predictions can be made from these
data sets. Specifically, a key site (ERK2 pY187) was predicted to contribute more strongly
to proliferation than migration, which was confirmed using a MEK-specific inhibitor to
bring down phosphorylation. Such conclusions are just the beginning, as absolute
site-specific phosphorylation data open the doors for new data-driven and
mechanistic modeling approaches.
4.5 Appendix Figures

[Figure 4-7 (plots not reproduced). Each panel: y-axis "Molecules/cell (x10^3)"; x-axis time points 0, 1, 5, 10, 30, 120, and 480 minutes. Series: EGFR pY1045, pY1068, pY1148, pY1173, pY1045pY1068.]

Figure 4-7: Similar EGFR Phosphorylation Pattern at High Ligand Dose. Absolute amount of EGFR phosphorylation at indicated time points (minutes) following stimulation. a) TGFα 20nM. b) AREG 20nM. c) EGF 20nM.
[Figure 4-8 (plots not reproduced). Panels a and b: "Predicted" vs. "Actual" values; series Migration, Proliferation, Fit Migration, Fit Proliferation. Panels c and d: x-axis "Number of PLSR Components" (0 to 3); y-axes "Percent Variance Explained in Y" and "Total Squared Prediction Error".]

Figure 4-8: Phenotypic prediction PLSR models. One-component PLSR models were built to simultaneously predict observed migration and proliferation from phosphotyrosine data. a) Model built from AREG and TGFα used to predict EGF. b) Model built from EGF and TGFα used to predict AREG. c) Explained variance and prediction error for EGF model. d) Explained variance and prediction error for AREG model.
[Figure 4-9 (plots not reproduced). Panel a: y-axis "Proliferation Rank Bias" (-1000 to 1000). Panel b: y-axis "Total Phosphorylation (nmol)". x-axis in both panels: the quantified phosphorylation sites on EGFR, ERK1, ERK2, Gab1, HER2, IRS2, Shp2, Src, and Shc.]

Figure 4-9: ERK2 pY187 Contributes More Strongly to Predicting Proliferation than Migration. a) Proliferation rank bias for each phosphorylation site in the complete PLSR model. b) Total amount of phosphorylation observed across all measured conditions.
4.6 Appendix Tables

Ligand
Unt.
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
EGFR pY845
EGFR pY998m
EGFR pY1045
EGFR pY1045pY1068
EGFR pY1068
EGFR pY1086
EGFR pY1148
EGFR pS1142pY1148
EGFR pY1173
ERK1 pY204
Dose (nM) Time (min.) Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
0
0
1.02E+06
4.13E+04
2.13E+02
4.72E+01
3.70E+02
2.28E+02
4.37E+02
1.60E+02
4.33E+02
5.45E+01
1.33E+03
3.74E+02
1.06E+03
3.69E+02
1.37E+03
5.38E+02
9.79E+02
1.28E+02
4.45E+03
1.23E+03
1
1
6.45E+05
3.68E+04
7.96E+02
9.18E+01
1.93E+03
2.75E+02
8.69E+03
2.18E+01
8.47E+03
3.02E+03
8.92E+03
5.90E+02
4.53E+03
2.04E+03
4.63E+03
1.17E+02
3.03E+02
4.00E+01
1
5
1.31E+06
1.53E+05
8.14E+02
3.67E+02
5.76E+02
1.37E+01
2.40E+03
3.04E+02
3.54E+03
3.56E+02
6.80E+03
8.78E+02
1.25E+04
1.05E+03
3.55E+03
1.75E+03
4.56E+03
2.81E+02
3.40E+04
6.49E+03
1
10
2.32E+05
1.40E+03
8.92E+02
6.50E+01
1.61E+02
9.93E+00
9.30E+02
4.36E+01
6.99E+03
6.43E+02
7.75E+02
1.52E+01
8.77E+03
6.54E+01
1.36E+03
8.29E+01
2.04E+03
3.11E+02
3.12E+04
1.85E+03
1
30
2.29E+06
2.91E+05
3.67E+02
1.49E+02
1.69E+03
4.68E+02
5.78E+02
6.12E+01
1.54E+03
3.09E+02
5.69E+03
4.58E+02
2.22E+03
5.77E+02
1.83E+03
4.03E+02
1.22E+04
7.81E+02
1
120
2.52E+06
3.09E+05
1.22E+02
3.23E+01
8.27E+02
1.93E+02
9.44E+02
2.44E+02
1.89E+03
1.93E+02
2.43E+02
9.54E+00
3.47E+03
7.05E+00
9.86E+02
2.67E+02
1.93E+03
8.86E+01
4.94E+03
2.68E+03
1
480
1.27E+06
3.76E+04
6.64E+02
8.81E+01
4.75E+02
3.87E+01
4.55E+02
4.61E+01
3.47E+03
2.61E+01
8.20E+03
4.54E+02
3.04E+02
1.90E+02
1.36E+03
1.79E+01
1.17E+04
1.34E+02
20
1
1.07E+06
9.70E+04
7.44E+02
1.01E+02
2.32E+03
1.35E+02
4.93E+03
7.31E+01
6.33E+03
1.33E+03
2.09E+04
5.47E+03
3.29E+03
5.35E+02
5.61E+03
3.21E+02
2.60E+02
3.84E+00
20
5
1.56E+06
2.18E+05
4.49E+02
1.14E+02
1.40E+03
8.17E+01
5.58E+03
2.10E+03
9.66E+03
1.42E+03
9.62E+03
1.11E+03
3.37E+04
7.06E+03
4.16E+03
1.92E+03
4.25E+03
3.05E+02
2.18E+04
2.16E+03
20
10
3.91E+05
7.31E+02
1.51E+03
1.22E+02
2.52E+02
1.27E+01
2.22E+03
8.23E+00
1.14E+04
1.12E+03
1.15E+03
2.26E+01
1.62E+04
1.13E+02
2.04E+03
1.84E+01
3.47E+03
5.54E+02
4.60E+04
6.86E+02
20
30
2.99E+06
5.19E+05
3.17E+02
2.25E+01
1.45E+03
4.65E+02
1.12E+03
3.60E+02
2.62E+03
7.84E+02
6.91E+03
3.93E+02
1.78E+03
2.99E+02
3.18E+03
7.08E+02
1.25E+04
9.71E+02
20
120
2.93E+06
5.86E+05
1.27E+02
2.22E+01
8.69E+02
2.17E+02
1.87E+03
5.40E+02
3.03E+03
4.70E+02
2.94E+02
1.03E+01
7.12E+03
1.38E+03
1.13E+03
4.45E+01
2.15E+03
1.17E+02
4.42E+03
2.31E+03
20
480
1.54E+06
3.84E+04
5.05E+02
8.79E+01
5.30E+02
5.45E+01
6.04E+02
7.68E+00
4.56E+03
9.70E+00
7.50E+03
4.43E+02
2.03E+02
6.71E+01
1.02E+03
8.18E+00
1.14E+04
4.08E+01
1
1
3.75E+05
1.18E+05
1.76E+02
4.73E+01
9.90E+02
2.13E+02
7.78E+03
9.87E+02
5.97E+03
2.48E+02
9.91E+02
1.66E+02
1.92E+03
4.73E+02
6.62E+03
1.58E+03
1
5
6.67E+05
3.02E+04
7.12E+02
6.36E+01
6.83E+02
8.05E+01
1.73E+03
5.17E+02
2.41E+03
6.04E+02
1.32E+03
1.09E+02
8.70E+03
1.07E+03
1.50E+03
2.79E+02
2.94E+03
1.29E+02
3.06E+04
4.58E+03
1
10
1.84E+06
3.85E+05
1.19E+02
1.51E+01
1.21E+03
5.68E+02
2.09E+03
7.74E+01
3.26E+03
1.29E+01
2.11E+03
1.75E+02
1.37E+04
1.07E+03
2.61E+03
8.76E+01
2.27E+03
6.42E+00
3.01E+04
1.13E+03
1
30
8.94E+05
2.42E+05
4.88E+02
7.15E+00
4.58E+03
1.51E+03
1.87E+03
1.76E+02
7.21E+03
1.69E+03
4.26E+03
1.79E+01
2.31E+04
1.25E+03
3.43E+02
7.99E+01
3.77E+03
2.14E+02
2.29E+04
3.13E+03
1
120
2.66E+05
5.34E+03
1.17E+03
4.24E+01
3.33E+02
6.84E+01
1.66E+03
7.28E+01
4.57E+03
5.89E+01
2.36E+03
7.06E+01
6.93E+03
6.87E+02
2.80E+02
7.43E+00
2.13E+03
2.53E+01
8.26E+03
3.39E+02
1
480
1.16E+06
1.56E+04
7.95E+01
7.41E+00
6.51E+02
6.96E+01
3.42E+03
1.93E+02
4.20E+03
9.90E+02
2.67E+03
3.17E+02
1.10E+04
5.36E+02
5.98E+02
2.40E+01
1.37E+03
1.21E+02
1.38E+04
9.88E+02
20
1
3.35E+05
1.25E+05
2.83E+03
9.63E+01
3.32E+03
3.68E+02
7.24E+03
1.19E+03
1.14E+04
1.10E+03
3.85E+04
1.81E+03
3.19E+04
4.01E+03
3.17E+03
1.38E+02
6.33E+03
1.21E+03
6.35E+03
1.86E+02
20
5
4.80E+05
2.23E+05
1.64E+03
9.38E+01
4.56E+03
7.02E+02
1.22E+04
8.50E+02
7.43E+03
1.70E+02
6.65E+03
9.50E+01
5.07E+04
3.25E+03
1.52E+03
8.38E+01
1.04E+04
4.56E+02
3.69E+04
1.79E+03
20
10
7.51E+05
2.66E+05
1.67E+03
4.94E+02
4.23E+02
3.67E+01
4.74E+03
1.02E+03
7.89E+03
2.54E+03
1.25E+03
3.39E+02
3.62E+04
1.77E+03
6.65E+03
1.11E+03
7.20E+03
2.74E+02
2.92E+04
6.41E+02
20
30
2.92E+05
1.31E+05
3.12E+02
1.60E+02
5.88E+02
1.32E+02
2.81E+03
5.72E+02
5.04E+03
2.43E+03
3.67E+03
1.20E+03
2.47E+04
1.15E+03
2.08E+03
1.20E+03
7.53E+03
1.56E+02
1.03E+04
1.54E+03
20
120
8.08E+04
1.23E+04
5.26E+02
1.69E+02
2.99E+02
8.42E+01
1.23E+03
2.20E+01
1.65E+03
3.32E+02
1.17E+04
2.18E+03
1.36E+03
1.54E+02
2.03E+03
2.96E+02
4.78E+03
8.96E+02
20
480
2.01E+05
4.03E+03
2.92E+02
2.46E+01
4.42E+01
3.78E+00
3.86E+03
1.08E+02
3.24E+03
5.10E+01
7.04E+03
2.15E+02
1.02E+04
3.09E+02
1.95E+03
8.49E+00
1.05E+04
5.44E+02
1
1
1.01E+06
3.61E+04
5.52E+02
1.95E+02
1.42E+03
1.27E+02
1.55E+04
6.33E+03
9.31E+03
5.56E+01
1.83E+03
2.36E+02
2.66E+03
8.07E+02
1.26E+04
2.61E+03
1
5
1.29E+06
6.72E+04
1.20E+03
7.37E+01
2.88E+03
5.56E+02
1.77E+03
5.64E+02
3.80E+03
6.54E+02
1.69E+03
1.56E+02
1.46E+04
1.43E+03
2.56E+03
4.36E+02
3.12E+03
5.95E+01
4.84E+04
6.39E+03
1
10
3.42E+06
5.90E+05
2.34E+02
1.21E+02
2.33E+03
1.07E+03
1.39E+03
9.88E+01
2.46E+03
4.02E+01
2.35E+03
1.69E+02
1.62E+04
5.34E+03
3.80E+03
1.24E+03
2.54E+03
2.22E+01
3.00E+04
5.15E+03
1
30
7.83E+05
2.68E+05
5.81E+02
2.83E+00
2.39E+03
4.87E+02
1.20E+03
4.42E+01
8.72E+03
1.78E+03
4.85E+03
4.00E+01
1.52E+04
5.18E+02
4.55E+02
5.75E+01
3.35E+03
2.87E+02
3.27E+04
4.85E+03
1
120
3.30E+05
3.15E+03
8.95E+02
3.14E+01
5.28E+02
6.60E+01
1.38E+03
3.76E+01
4.68E+03
2.92E+01
2.33E+03
1.29E+01
6.72E+03
1.91E+02
3.80E+02
9.18E+01
1.64E+03
9.41E-01
7.85E+03
5.79E+01
1
480
8.81E+05
9.01E+02
1.38E+02
1.12E+01
7.86E+02
9.71E+01
3.82E+03
4.85E+02
1.31E+04
2.52E+03
2.45E+03
2.19E+02
1.27E+04
6.25E+02
9.18E+02
1.33E+02
1.34E+03
1.29E+02
1.73E+04
1.04E+03
20
1
6.20E+05
1.35E+05
2.95E+03
1.01E+02
3.67E+03
4.82E+02
7.51E+03
1.22E+03
1.05E+04
6.21E+02
3.74E+04
1.62E+03
3.33E+04
4.53E+03
3.80E+03
9.48E+01
5.36E+03
1.20E+03
6.48E+03
6.29E+01
20
5
8.44E+05
3.63E+05
1.61E+03
6.66E+01
3.94E+03
3.70E+02
1.27E+04
3.14E+02
7.53E+03
2.72E+02
8.00E+03
6.25E+02
5.18E+04
2.15E+02
1.94E+03
4.48E+02
9.11E+03
8.53E+01
3.90E+04
2.44E+02
20
10
9.73E+05
3.72E+05
2.38E+03
6.28E+02
4.95E+02
3.41E+01
7.18E+03
3.50E+02
8.58E+03
3.33E+03
1.10E+03
2.65E+02
2.72E+04
5.03E+03
1.11E+04
3.61E+03
4.89E+03
2.41E+02
3.65E+04
1.20E+03
20
30
2.31E+06
1.25E+06
1.88E+03
1.06E+03
1.91E+03
1.18E+03
4.59E+03
1.25E+03
9.58E+03
4.25E+03
1.23E+04
4.65E+03
2.68E+04
3.22E+03
1.09E+04
7.66E+03
7.68E+03
2.31E+02
2.16E+04
4.99E+03
20
120
6.64E+04
2.03E+03
7.64E+02
1.97E+02
4.59E+02
9.44E+01
1.44E+03
1.45E+02
2.78E+03
1.10E+02
1.08E+04
8.47E+02
1.73E+03
2.49E+02
2.63E+03
5.89E+01
7.77E+03
7.98E+02
20
480
2.06E+05
1.66E+02
2.67E+02
1.71E+01
4.34E+01
4.38E+00
4.44E+03
1.44E+02
2.98E+03
1.72E+01
7.76E+03
2.34E+02
1.13E+04
1.19E+02
1.45E+03
8.19E+00
9.78E+03
1.42E+02
Table 4-2: Site Specific Absolute Quantification Data. Table continued on next two pages.
Ligand
Unt.
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
0
1
1
1
1
1
1
20
20
20
20
20
20
1
1
1
1
1
1
20
20
20
20
20
20
1
1
1
1
1
1
20
20
20
20
20
20
Dose (nM)
ERK1 pT202pY204
ERK2 pY187
ERK2 pT185pY187
Gab1 pY259
Gab1 pY373m
Gab1 pY637
Gab1 pY659
HER2 pY877
HER2 pY1023
HER2 pY1248
Time (min.) Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
0
2.19E+03
1.81E+02
1.07E+03
2.03E+02
1.36E+03
2.68E+02
7.12E+02
8.05E+01
2.48E+03
2.71E+02
7.02E+02
1.79E+02
1.62E+04
3.41E+03
2.21E+02
1.29E+02
1.60E+04
6.61E+03
2.75E+03
5.37E+02
1
5.62E+02
1.82E+02
1.09E+03
1.96E+01
5.72E+02
8.19E+01
2.90E+03
9.74E+02
8.67E+02
2.76E+01
1.15E+03
2.23E+02
1.31E+04
1.78E+03
7.30E+02
2.38E+02
2.48E+04
4.30E+03
6.62E+03
6.78E+02
5
8.36E+04
3.02E+03
2.25E+03
4.65E+02
1.73E+03
2.19E+02
1.25E+03
9.76E+01
3.49E+03
3.40E+02
2.01E+04
5.46E+03
3.94E+02
8.21E+01
9.56E+03
2.20E+02
6.31E+03
1.78E+03
10
6.24E+02
5.41E+01
1.68E+05
3.94E+03
2.21E+03
1.48E+01
2.29E+03
1.46E+02
4.79E+02
5.10E+01
2.18E+03
1.58E+02
6.36E+04
4.57E+03
2.16E+02
1.10E+01
8.98E+03
4.17E+02
1.91E+03
7.66E+01
30
1.30E+03
1.12E+02
2.27E+04
2.10E+03
1.86E+03
9.05E+01
4.94E+02
7.91E+01
1.45E+03
2.21E+02
3.08E+03
1.95E+02
1.13E+04
1.00E+02
7.41E+02
1.32E+02
2.04E+04
5.60E+03
3.79E+03
1.16E+03
120
1.57E+03
2.64E+02
5.40E+03
4.61E+01
3.09E+03
2.52E+02
8.51E+02
2.32E+02
1.26E+03
7.49E+01
5.76E+02
1.33E+02
1.43E+04
1.21E+03
6.72E+02
8.27E+01
1.48E+04
6.36E+03
3.66E+03
6.17E+02
480
1.05E+03
1.88E+01
4.06E+04
5.25E+02
1.96E+03
2.13E+01
4.36E+03
5.77E+02
3.44E+03
2.25E+00
1.98E+03
1.41E+02
9.22E+03
2.54E+02
5.93E+02
5.61E+01
1.41E+04
2.53E+03
3.92E+03
5.31E+02
1
4.67E+02
6.54E+01
1.14E+03
6.08E+00
5.17E+02
5.27E+01
2.86E+03
3.09E+02
9.61E+02
1.38E+01
2.49E+03
1.92E+02
1.40E+04
1.19E+03
5.89E+02
1.45E+02
1.45E+04
2.97E+03
5.94E+03
7.09E+02
5
8.50E+04
5.53E+03
1.70E+03
2.17E+02
1.74E+03
3.28E+02
1.16E+03
4.38E+01
3.26E+03
1.23E+03
1.63E+04
3.87E+03
2.92E+02
7.78E+01
8.95E+03
1.96E+02
4.22E+03
1.32E+02
10
9.78E+02
6.85E+01
1.88E+05
8.75E+03
2.66E+03
4.59E+01
3.62E+03
1.96E+02
7.53E+02
8.52E+01
4.42E+03
1.79E+02
1.21E+05
9.52E+03
3.06E+02
3.27E+01
3.48E+04
7.63E+02
2.72E+03
3.83E+01
30
1.06E+03
2.46E+02
2.22E+04
2.43E+03
1.53E+03
5.86E+01
4.36E+02
9.79E+01
1.39E+03
1.53E+02
2.32E+03
4.81E+02
1.19E+04
2.30E+03
3.36E+02
6.22E+01
2.16E+04
7.01E+03
4.00E+03
2.27E+02
2.51E+03
4.90E+02
7.21E+03
3.48E+02
3.49E+03
3.47E+02
8.45E+02
1.86E+02
1.26E+03
5.45E+01
1.60E+03
3.21E+02
1.30E+04
4.88E+02
7.80E+02
5.56E+01
2.02E+04
9.82E+03
4.78E+03
5.99E+02
120
480
1.19E+03
2.84E+01
4.10E+04
7.30E+02
1.50E+03
8.69E+01
3.09E+03
1.77E+02
3.17E+03
4.95E+01
1.44E+03
1.40E+01
7.56E+03
9.26E+01
3.77E+02
1.97E+01
1.34E+04
1.81E+03
4.65E+03
1.48E+02
1
5.62E+03
9.02E+02
2.28E+03
2.08E+02
2.41E+03
5.58E+02
8.39E+04
1.13E+04
9.67E+02
2.76E+00
1.39E+03
2.14E+02
1.17E+05
2.44E+04
2.92E+03
7.18E+02
5
1.80E+03
2.91E+02
7.59E+04
8.78E+03
2.99E+03
6.50E+02
6.43E+04
1.29E+04
2.59E+03
2.07E+02
3.08E+03
5.54E+02
7.52E+04
2.78E+04
1.72E+03
4.29E+02
4.14E+03
4.14E+03
3.95E+03
8.08E+02
10
1.59E+03
2.03E+02
6.66E+04
1.20E+03
2.46E+03
4.05E+01
9.95E+03
1.46E+03
1.25E+03
2.71E+01
2.90E+03
1.75E+02
7.50E+03
1.28E+02
8.61E+02
5.73E+01
1.16E+04
8.31E+02
3.08E+03
2.91E+02
30
4.47E+02
6.43E+01
1.68E+05
2.80E+03
3.29E+03
1.99E+02
5.26E+03
2.44E+03
4.68E+03
1.60E+02
1.78E+03
3.66E+02
2.61E+04
1.80E+03
9.22E+02
2.17E+02
2.44E+03
1.07E+02
120
2.59E+02
2.35E-01
2.22E+04
1.88E+03
1.93E+03
4.83E+01
2.31E+04
8.31E+02
4.72E+02
1.55E+01
1.01E+03
2.55E+01
2.45E+04
8.25E+02
1.87E+02
1.47E+00
2.44E+03
1.01E+02
480
7.58E+02
1.02E+01
5.67E+04
3.05E+03
1.78E+03
1.41E+02
1.93E+03
9.04E+01
2.79E+03
7.91E+01
2.67E+04
4.23E+02
3.09E+02
1.12E+01
1.66E+04
6.61E+02
2.91E+03
5.22E+02
1
1.41E+03
1.03E+02
4.76E+04
2.05E+03
2.61E+03
7.33E+01
1.14E+04
1.45E+02
3.19E+03
3.81E+02
4.68E+03
6.33E+01
3.19E+04
2.06E+02
1.38E+02
1.15E+01
3.75E+03
6.02E+02
3.90E+03
3.22E+02
5
1.76E+04
1.22E+03
7.33E+04
8.42E+02
5.65E+03
2.62E+02
1.07E+04
3.95E+02
1.06E+03
1.95E+01
3.39E+03
2.55E+02
2.51E+04
2.17E+03
3.84E+02
2.22E+00
2.49E+03
1.38E+02
8.59E+03
5.62E+02
10
1.72E+04
1.60E+03
1.30E+05
1.78E+03
4.68E+03
1.45E+02
6.62E+04
3.94E+02
1.46E+03
1.07E+02
2.17E+03
2.70E+02
5.04E+04
6.49E+03
6.85E+01
6.79E+00
1.01E+04
1.32E+03
3.85E+03
2.12E+01
30
2.23E+03
4.50E+02
4.69E+04
3.55E+03
3.77E+03
1.28E+03
1.02E+04
4.43E+03
7.80E+02
4.31E+00
1.44E+03
1.91E+01
2.57E+04
5.76E+02
1.93E+02
1.12E+02
1.37E+04
3.84E+03
1.85E+03
1.04E+03
120
1.50E+03
3.78E+02
1.17E+04
5.81E+02
1.07E+03
1.93E+02
3.63E+04
1.09E+04
3.40E+03
1.03E+03
3.55E+04
5.38E+03
8.94E+01
1.41E+01
1.69E+03
5.72E+02
480
5.07E+03
5.13E+02
2.29E+04
1.49E+03
1.98E+03
3.12E+01
1.78E+03
1.81E+02
1.17E+03
2.64E+00
1.74E+03
8.49E+01
2.19E+04
3.46E+02
4.24E+01
2.34E+00
8.60E+03
1.21E+02
3.62E+03
1.22E+02
1
7.48E+03
1.62E+03
2.51E+03
3.50E+02
3.58E+03
8.06E+02
9.63E+04
2.12E+04
1.14E+03
2.38E+01
1.90E+03
4.14E+02
1.32E+05
2.69E+04
5.52E+03
1.28E+03
5
3.24E+03
5.43E+02
1.94E+05
4.08E+04
5.89E+03
1.48E+03
1.00E+05
7.20E+03
3.50E+03
2.67E+02
4.14E+03
5.52E+02
1.62E+05
6.42E+04
2.74E+03
1.02E+02
5.55E+03
5.55E+03
3.49E+03
4.59E+02
10
1.80E+03
3.73E+02
6.94E+04
8.73E+02
3.47E+03
8.36E+02
1.17E+04
2.71E+03
1.30E+03
3.74E+01
2.96E+03
2.54E+01
6.87E+03
1.83E+02
1.04E+03
3.07E+02
1.01E+04
3.72E+02
3.01E+03
5.56E+02
30
5.02E+02
8.10E+00
1.69E+05
5.99E+03
4.49E+03
7.37E+02
8.81E+03
6.56E+02
4.74E+03
2.05E+02
1.99E+03
2.52E+02
3.82E+04
2.59E+02
1.23E+03
4.48E+01
5.54E+03
5.60E+01
120
2.20E+02
1.83E+01
2.60E+04
1.53E+03
2.32E+03
1.77E+02
2.67E+04
1.63E+03
4.32E+02
4.16E+01
9.13E+02
1.20E+02
2.70E+04
1.37E+03
2.06E+02
9.26E+00
3.15E+03
1.05E+02
480
9.47E+02
7.53E+01
8.24E+04
7.17E+03
1.99E+03
9.92E+01
2.60E+03
5.46E+01
1.87E+03
1.21E+02
2.86E+04
1.90E+02
2.85E+02
1.86E+01
9.90E+03
8.74E+02
3.14E+03
4.88E+02
1
1.52E+03
7.81E+01
4.77E+04
3.35E+03
2.93E+03
5.55E+01
1.10E+04
3.13E+02
3.12E+03
4.16E+02
4.34E+03
2.25E+02
2.90E+04
5.06E+02
2.00E+02
2.25E+01
3.09E+03
4.53E+02
5.11E+03
5.24E+02
5
1.95E+04
1.58E+03
7.29E+04
2.97E+03
5.60E+03
4.03E+02
1.21E+04
3.89E+02
1.12E+03
5.72E+01
3.89E+03
7.06E+01
2.11E+04
4.54E+02
4.61E+02
3.38E+01
2.42E+03
1.34E+02
8.32E+03
2.31E+02
10
1.96E+04
4.55E+02
1.70E+05
1.80E+03
5.20E+03
2.41E+02
1.00E+05
2.72E+03
1.42E+03
9.41E+00
2.38E+03
2.85E+02
6.19E+04
2.37E+03
8.54E+01
9.14E+00
1.43E+04
6.55E+02
3.57E+03
5.69E+02
30
6.56E+03
3.37E+03
7.74E+04
4.93E+03
1.32E+04
6.95E+03
3.85E+04
2.51E+04
9.56E+02
1.24E+01
3.11E+03
6.54E+02
3.22E+04
1.52E+03
7.18E+02
4.72E+02
2.40E+04
1.21E+04
6.54E+03
3.67E+03
120
1.87E+03
2.83E+02
1.81E+04
9.11E+02
1.26E+03
3.29E+02
3.87E+04
7.10E+03
3.88E+03
6.62E+02
4.45E+04
2.25E+03
1.47E+02
1.12E+01
2.99E+03
8.04E+02
480
4.71E+03
4.31E+02
2.34E+04
1.82E+03
1.90E+03
2.35E+01
1.70E+03
1.20E+02
1.13E+03
2.03E+01
1.58E+03
6.82E+01
2.13E+04
1.12E+02
3.96E+01
8.46E-01
6.83E+03
4.38E+01
3.31E+03
6.12E+01
Ligand
Unt.
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
AREG
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
EGF
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
TGFa
IRS2 pY675m
IRS2 pY742
IRS2 pY823
Shp2 pY62
Src pY418
Shc pY240m
Shc pY239pY240m
Shc pY317
Dose (nM) Time (min.) Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
Mean
Range
0
0
8.24E+03
1.47E+03
1.06E+03
1.00E+02
1.78E+05
1.74E+04
4.46E+03
4.46E+02
1.39E+03
2.33E+02
2.10E+03
3.85E+02
2.08E+03
4.75E+02
1.65E+03
1.83E+02
1
1
1.43E+05
2.96E+03
2.61E+03
2.97E+02
3.36E+04
8.55E+03
1.05E+04
2.67E+03
4.45E+03
3.99E+02
5.02E+03
3.52E+02
1.89E+04
4.53E+03
4.23E+04
3.19E+03
1
5
5.71E+04
3.66E+03
8.39E+02
6.43E+01
3.95E+04
4.50E+03
5.92E+03
1.09E+03
2.37E+03
3.13E+01
2.21E+03
1.82E+02
3.20E+04
1.64E+03
4.18E+04
2.31E+03
1
10
3.29E+04
4.73E+03
3.51E+02
4.76E+01
1.67E+04
2.13E+03
7.07E+03
4.04E+02
2.53E+03
1.32E+02
3.97E+03
4.07E+02
4.20E+04
4.66E+02
4.03E+04
5.45E+01
1
30
6.89E+04
5.69E+03
2.43E+02
2.26E+01
2.95E+05
4.79E+03
4.83E+03
9.66E+02
1.39E+03
2.37E+01
5.95E+03
2.36E+02
1.59E+04
1.58E+03
2.11E+04
1.10E+03
1
120
3.80E+03
1.67E+02
2.01E+02
1.70E+01
1.56E+05
2.28E+04
2.75E+03
4.69E+01
1.24E+03
7.96E+01
8.10E+03
4.71E+01
1.06E+04
1.35E+03
1.21E+04
8.18E+02
1
480
1.03E+05
7.01E+03
2.70E+02
3.84E+01
8.15E+03
1.10E+03
3.61E+03
2.95E+02
1.42E+03
7.18E+00
3.58E+03
2.75E+02
1.00E+04
1.77E+02
3.92E+03
2.22E+02
20
1
1.58E+05
3.79E+03
2.49E+03
9.61E+01
3.39E+04
2.93E+03
1.04E+04
1.55E+03
3.83E+03
2.78E+02
8.41E+03
5.32E+02
4.59E+04
1.15E+04
1.98E+05
2.36E+04
20
5
5.71E+04
1.17E+03
1.05E+03
2.69E+01
4.92E+04
7.87E+03
4.88E+03
1.12E+03
2.18E+03
1.80E+02
3.55E+03
3.94E+02
3.78E+04
9.46E+02
8.34E+04
1.15E+04
20
10
6.68E+04
2.23E+02
4.68E+02
3.85E+01
2.99E+04
3.66E+03
8.23E+03
5.98E+02
3.00E+03
1.26E+02
8.11E+03
5.89E+02
1.47E+05
1.15E+04
8.46E+04
1.34E+03
20
30
5.64E+04
1.75E+03
2.03E+02
7.28E+00
2.19E+05
6.07E+04
4.52E+03
7.91E+02
1.12E+03
1.48E+02
7.20E+03
9.06E+01
1.83E+04
4.43E+03
3.99E+04
4.57E+03
20
120
5.36E+03
1.95E+02
2.89E+02
2.05E+01
2.00E+05
4.21E+03
2.63E+03
1.64E+02
1.34E+03
3.42E+01
1.51E+04
5.59E+02
1.44E+04
2.12E+03
2.05E+04
1.82E+03
20
480
9.66E+04
3.10E+03
2.38E+02
3.18E+01
7.34E+03
3.20E+02
3.68E+03
6.30E+01
1.70E+03
3.21E+01
6.15E+03
8.84E+02
2.04E+04
1.55E+03
1.45E+04
6.52E+02
1
1
2.60E+04
4.94E+03
9.06E+02
4.77E+02
3.47E+04
7.44E+02
2.60E+03
2.60E+02
2.33E+03
4.90E+02
1.04E+04
1.18E+03
3.33E+04
4.27E+03
1
5
3.05E+04
1.46E+03
7.62E+04
7.51E+03
4.15E+03
5.14E+02
1.09E+03
1.78E+02
7.80E+03
1.43E+03
2.46E+04
2.35E+03
5.15E+04
4.24E+03
1
10
6.03E+04
6.70E+03
9.19E+00
7.27E-01
2.90E+05
1.12E+04
4.09E+03
6.04E+02
1.41E+03
1.08E+02
4.48E+03
8.81E+02
4.02E+04
1.08E+04
6.67E+04
2.33E+03
1
30
4.45E+03
2.85E+02
2.96E+05
4.48E+04
4.47E+03
6.83E+02
1.66E+03
2.58E+02
7.54E+03
1.30E+03
1.32E+04
7.26E+02
8.70E+04
4.41E+03
1
120
5.31E+04
5.49E+03
4.23E+02
1.94E+01
6.20E+03
2.93E+02
2.44E+03
6.61E+01
9.66E+03
3.36E+02
2.62E+04
4.68E+01
1.10E+04
2.01E+03
1
480
1.64E+04
1.79E+02
8.08E+01
6.53E+00
1.03E+05
1.63E+03
4.09E+03
5.20E+01
1.50E+03
3.57E+01
1.32E+04
7.00E+02
2.71E+04
5.34E+03
1.53E+04
1.46E+03
20
1
6.93E+04
5.82E+03
2.80E+02
2.03E+01
2.87E+05
2.62E+04
5.89E+03
3.04E+02
1.22E+03
6.01E+01
1.49E+04
1.88E+03
2.45E+04
2.75E+03
1.75E+05
1.57E+04
20
5
3.94E+04
2.79E+03
2.97E+01
1.43E+00
6.56E+04
1.37E+03
3.51E+03
1.99E+02
1.82E+03
7.81E+01
1.09E+04
4.60E+02
4.10E+04
2.03E+03
1.92E+05
9.21E+03
20
10
7.48E+04
1.61E+04
1.66E+05
4.97E+03
5.22E+03
4.33E+01
1.93E+03
2.56E+02
5.84E+03
4.59E+02
5.46E+04
2.72E+03
1.00E+05
1.91E+03
20
30
3.37E+04
5.91E+03
4.66E+02
1.29E+02
2.51E+04
2.53E+03
5.74E+03
5.31E+02
1.30E+03
3.17E+01
1.76E+04
5.37E+03
2.08E+04
2.23E+03
7.92E+04
1.64E+03
20
120
3.00E+03
4.03E+02
6.78E+01
1.95E+01
3.24E+03
3.59E+02
2.75E+03
2.88E+02
5.88E+03
3.25E+02
2.15E+04
1.48E+03
2.17E+04
1.44E+03
20
480
3.93E+04
2.23E+03
2.66E+02
4.40E+00
2.15E+04
1.02E+03
8.69E+03
6.19E+02
1.39E+03
6.60E+01
1.12E+04
2.91E+02
2.82E+04
3.23E+03
2.88E+04
2.41E+03
1
1
3.51E+04
1.55E+03
7.03E+02
7.03E+02
4.81E+04
3.37E+03
3.77E+03
4.20E+02
2.40E+03
3.35E+02
1.84E+04
2.18E+03
3.90E+04
4.48E+03
1
5
5.27E+04
1.13E+03
1.92E+05
3.60E+04
4.98E+03
2.06E+02
1.64E+03
1.91E+02
9.98E+03
1.16E+03
3.02E+04
3.50E+03
5.36E+04
8.69E+03
1
10
6.16E+04
7.31E+03
9.25E+00
1.73E+00
3.09E+05
3.91E+04
4.07E+03
2.40E+02
1.57E+03
9.98E+01
4.96E+03
7.76E+02
3.89E+04
1.17E+04
4.90E+04
1.94E+03
1
30
4.84E+03
3.34E+02
2.12E+05
2.91E+04
4.50E+03
6.64E+02
1.71E+03
2.89E+02
9.76E+03
4.19E+02
2.07E+04
2.25E+02
5.50E+04
9.68E+02
1
120
4.51E+04
2.61E+03
3.41E+02
1.57E+01
6.13E+03
3.11E+02
2.25E+03
1.15E+02
8.26E+03
9.19E+02
1.94E+04
2.20E+02
1.24E+04
1.99E+03
1
480
1.78E+04
3.57E+02
8.51E+01
2.88E+00
7.62E+04
1.50E+02
4.65E+03
7.37E+00
1.29E+03
3.55E+01
6.82E+03
3.77E+02
1.43E+04
2.56E+02
1.62E+04
1.25E+03
20
1
7.59E+04
6.75E+03
2.41E+02
2.04E+01
2.96E+05
3.79E+04
6.39E+03
1.51E+02
1.26E+03
1.17E+02
1.42E+04
9.74E+02
2.57E+04
3.30E+03
1.66E+05
1.76E+04
20
5
4.03E+04
1.13E+03
3.12E+01
2.93E+00
6.95E+04
3.41E+03
3.94E+03
9.59E+01
1.77E+03
3.57E+01
9.99E+03
1.28E+02
4.27E+04
2.20E+02
1.89E+05
2.94E+03
20
10
8.00E+04
1.43E+04
1.67E+05
1.12E+04
5.22E+03
2.40E+02
2.36E+03
6.68E+01
9.19E+03
2.87E+02
4.81E+04
4.98E+02
8.15E+04
3.20E+03
20
30
4.67E+04
6.75E+03
6.00E+02
2.61E+02
5.78E+04
2.76E+04
9.18E+03
1.71E+03
1.50E+03
1.18E+02
4.01E+04
1.26E+04
2.52E+04
2.59E+03
9.14E+04
2.09E+03
20
120
3.32E+03
4.14E+02
7.45E+01
9.67E+00
4.14E+03
4.25E+02
2.40E+03
4.43E+01
1.32E+04
3.26E+02
4.08E+04
4.30E+03
2.43E+04
5.86E+02
20
480
3.70E+04
1.65E+03
2.56E+02
1.74E+01
2.27E+04
9.60E+02
8.69E+03
5.01E+02
1.31E+03
6.74E+01
1.13E+04
2.08E+02
3.47E+04
2.78E+03
3.38E+04
1.83E+03
4.7 Bibliography
1. Bhargava, R. et al. EGFR gene amplification in breast cancer: correlation with
epidermal growth factor receptor mRNA and protein expression and HER-2 status
and absence of EGFR-activating mutations. Mod Pathol 18, 1027–1033 (2005).
2. Rimawi, M. F. et al. Epidermal growth factor receptor expression in breast cancer
association with biologic phenotype and clinical outcomes. Cancer 116, 1234–1242
(2010).
3. Citri, A. & Yarden, Y. EGF–ERBB signalling: towards the systems level. Nat Rev
Mol Cell Biol 7, 505–516 (2006).
4. Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting of
the Receptor. Traffic 10, 1115–1127 (2009).
5. Wilson, K. J., Gilmore, J. L., Foley, J., Lemmon, M. A. & Riese, D. J. Functional
selectivity of EGF family peptide growth factors: implications for cancer.
Pharmacol. Ther. 122, 1–8 (2009).
6. Lemmon, M. A. & Schlessinger, J. Cell signaling by receptor tyrosine kinases.
Cell 141, 1117–1134 (2010).
7. Lemmon, M. A., Schlessinger, J. & Ferguson, K. M. The EGFR Family: Not So
Prototypical Receptor Tyrosine Kinases. Cold Spring Harbor Perspectives in
Biology 6, a020768–a020768 (2014).
8. Pawson, T. & Nash, P. Protein-protein interactions define specificity in signal
transduction. Genes Dev. 14, 1027–1047 (2000).
9. Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of
paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
10. Hibi, M. & Hirano, T. Gab-family adapter molecules in signal transduction of
cytokine and growth factor receptors, and T and B cell antigen receptors. Leuk.
Lymphoma 37, 299–307 (2000).
11. Adams, S. J., Aydin, I. T. & Celebi, J. T. GAB2--a scaffolding protein in cancer.
Mol. Cancer Res. 10, 1265–1270 (2012).
12. Sakaguchi, K. et al. Shc phosphotyrosine-binding domain dominantly interacts
with epidermal growth factor receptors and mediates Ras activation in intact cells.
Mol. Endocrinol. 12, 536–543 (1998).
13. Myers, M. G. et al. IRS-1 activates phosphatidylinositol 3'-kinase by associating
with src homology 2 domains of p85. Proc. Natl. Acad. Sci. U.S.A. 89, 10350–10354
(1992).
14. Scheck, R. A., Lowder, M. A., Appelbaum, J. S. & Schepartz, A. Bipartite
Tetracysteine Display Reveals Allosteric Control of Ligand-Specific EGFR
Activation. ACS Chem. Biol. 7, 1367–1376 (2012).
15. Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative
Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579
(2010).
16. Brugge Lab Protocols. at
<http://brugge.med.harvard.edu/Protocols/Protocols.html>
17. Roskoski, R. MEK1/2 dual-specificity protein kinases: structure and regulation.
Biochem. Biophys. Res. Commun. 417, 5–10 (2012).
18. Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation.
Pharmacol. Res. 66, 105–143 (2012).
19. Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein
interaction network for the ErbB receptors using protein microarrays. Nature 439,
168–174 (2005).
5 Conclusions and Future Directions
The work in this thesis has focused on improving mass spectrometry-based proteomic
data sets in two key ways. The first is to enhance the quality of data generated with
exploratory MS/MS analysis – a crucial first step in any successful proteomics endeavor.
These experiments seek the full range of proteins or sites of post-translational
modification in the samples of interest. With recent advancements in throughput and
analysis it has become infeasible for users to fully validate each scan. Instead, databases
are used to pre-match spectra to known peptides in the proteome and discard the rest.
Such database searches have become commonplace, but their role as a pre-check is being
forgotten; instead, they are being used as the sole mode of identification. A trained mass
spectrometrist rarely double-checks the vast majority of identifications put forth by these
algorithms. In place of human judgment, statistical methods based on analytically
calculated false discovery rates (FDRs) provide estimated accuracy of the data set as a
whole. Applications exist where this may be appropriate, but time-intensive biological
follow-up experiments need the added certainty that a particular match is indeed correct.
It is with this in mind that the Computer-Assisted Manual Validation (CAMV) package
was developed. The goal was to remove the time-consuming tasks of filtering or
matching peaks in a mass spectrum and engage the trained user only at the final decision-making stage; the tasks are divided so that each entity performs those for which it is best
suited. The computer rapidly harvests the relevant information and presents it in a
comprehensive way, and the user takes all of this information into account to make the
best decision about whether the pre-matched assignment is correct. The division of labor
has produced drastic decreases in the time required to validate a proteomic mass
spectrometry experiment.
The second method for improving data quality focused on the development of a high-throughput, reliable absolute quantification pipeline for proteomics by mass spectrometry.
In general, throughput improvements for these experiments come from differential
labeling techniques that permit multiple samples to be analyzed simultaneously and the
data to be deconvolved afterward. Such methods involve heavy-isotope labeling of either
covalently attached tags or the peptides themselves. Absolute quantification relies on the
calibration of observed signal intensity to the amount of peptide present in the original
sample. Using standard peptides that are included directly in the sample provides the
safest way to account for sequence-specific purification biases. The method introduced
includes such standard peptides while simultaneously employing a multiplex labeling
technique. The added benefit of this strategy is that multiple doses of standard peptide
may be included in a single analysis, providing a more comprehensive calibration
curve. This method was tested on two instrument platforms. The presumed “gold
standard” triple quadrupole instrument utilized for multiple reaction monitoring provided
a very thorough quality control against which the orbitrap-based full scan method could
be validated. The two platforms both performed exceedingly well in the comparison
studies, providing assurance that the full scan method could be utilized for further
analysis. The full scan method was preferable for several reasons. First, it allows more
comprehensive after-the-fact confirmation of sequence identity. Second, due to the
instrument’s popularity, the technique can be readily reproduced by a large number of
research groups.
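The calibration idea summarized above can be illustrated with a minimal Python sketch. It is not the pipeline developed in this thesis: it assumes a linear detector response, and the function names, the femtomole units, and every number in it are hypothetical values chosen only to show how several doses of a standard peptide yield a calibration line that converts an observed intensity into an absolute amount.

```python
def fit_calibration(amounts, intensities):
    """Ordinary least-squares fit of intensity = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(intensities):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(amounts) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(intensity, slope, intercept):
    """Invert the calibration line to convert intensity into an amount."""
    return (intensity - intercept) / slope


# Hypothetical doses of one standard peptide (fmol) spiked into a single
# multiplexed analysis, with made-up signal intensities for each dose.
standard_fmol = [1.0, 10.0, 100.0]
standard_signal = [250.0, 2050.0, 20050.0]

slope, intercept = fit_calibration(standard_fmol, standard_signal)

# Hypothetical intensity observed for the endogenous peptide.
endogenous_fmol = estimate_amount(10050.0, slope, intercept)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, "
      f"estimated amount={endogenous_fmol:.1f} fmol")
```

Including more than one dose of standard is what allows both a slope and an intercept to be estimated; with a single dose, only a ratio to that one point would be available.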
As a test case, ligand-induced EGFR tyrosine phosphorylation was used to show the
power of absolute quantification in deciphering modalities of information transfer in
signaling networks. Different ligands are known to produce varied phenotypic responses
despite binding to the same receptor. Hence, ligand identity and/or concentration must be
transmitted through the plasma membrane by the receptor. The interface between the
receptor and intracellular proteins is the set of phosphorylation sites at the C-terminal tail.
Once phosphorylated, these provide docking sites for intracellular proteins, recruiting
them into signaling complexes to further relay information. The encoding of the
information may occur in one of two ways. Either the pattern or the magnitude of
phosphorylation may be altered depending on the identity of the ligand. To better
understand which of these mechanisms was at play, phosphorylation sites on EGFR,
known and putative interacting proteins, and downstream effectors were quantified using
this technique. Initially, a saturating dose of ligand was presented to cells and the site-specific phosphorylation was measured. Surprisingly, the data were consistent with a
magnitude modulation modality. All three ligands presented indistinguishable patterns of
EGFR phosphorylation, but the response to AREG was noticeably weaker than that to TGFα or EGF.
Interestingly, the pattern of phosphorylation on adaptor proteins was different depending
on ligand. This suggested that amplitude differences produced sufficiently different
binding site availability at the receptor to affect adaptor protein recruitment and
phosphorylation. A follow-up experiment with sub-saturating dose ligand treatment told a
different story. These low doses permitted the receptor to establish different patterns of
phosphorylation, which naturally also led to differential phosphorylation patterns on
adaptor proteins.
The applicability of these tools is not limited to the cases described above. Rapid manual
validation of proteomic mass spectrometry data will provide more reliable conclusions
from initial exploratory experiments. Any application of proteomics will benefit from
more effective initial screening of target proteins or potential pharmaceutical compounds.
Moving forward to targeted experiments or biological follow-up is a time-consuming and
expensive process, so decreasing the false positive rate in proteomic screens will aid
academic and industry researchers alike.
Absolute quantification paves the way for proteomics to address a whole new class of
biological questions. First, simulation-based computational modeling efforts are much
better informed by abundance measurements than they are by relative changes. Natural
representations of signaling networks choose concentrations as variables and rate
constants as parameters. Second, stoichiometric understanding of cellular processes is not
possible without absolute quantification. This plays a very important role when
considering the fraction of molecules that are affected by a treatment condition or
potential therapeutic compound. Third, more experimental conditions can be accurately
compared. Absolute quantification provides a very straightforward way to account for
more conditions than a multiplex technique would support. Data from multiple
experiments can be directly combined and compared without fear of batch effects.
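To make the first two points concrete, the following Python sketch treats a single phosphorylation site as a two-state mass-action system, with concentration as the state variable and two first-order rate constants as parameters. The model, the function names, and all numerical values are illustrative assumptions rather than results or methods from this thesis. The steady-state fraction it returns is the kind of stoichiometric quantity that requires absolute rather than relative measurements.

```python
def simulate_phosphorylation(total, k_phos, k_dephos, t_end, dt=0.001):
    """Forward-Euler integration of d[pR]/dt = k_phos*[R] - k_dephos*[pR].

    total    : total site concentration, [R] + [pR] (arbitrary units)
    k_phos   : first-order phosphorylation rate constant (1/time)
    k_dephos : first-order dephosphorylation rate constant (1/time)
    Returns the phosphorylated concentration [pR] at time t_end,
    starting from a fully unphosphorylated pool.
    """
    phospho = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        unphos = total - phospho  # conservation of total site abundance
        phospho += dt * (k_phos * unphos - k_dephos * phospho)
    return phospho


def steady_state_fraction(k_phos, k_dephos):
    """Fraction of sites phosphorylated once d[pR]/dt = 0."""
    return k_phos / (k_phos + k_dephos)


# Hypothetical parameters: 100 units of receptor, rate constants per minute.
total_receptor = 100.0
phospho_final = simulate_phosphorylation(total_receptor, 0.3, 0.7, t_end=20.0)
occupancy = phospho_final / total_receptor
print(f"simulated occupancy: {occupancy:.3f}")
print(f"analytical steady state: {steady_state_fraction(0.3, 0.7):.3f}")
```

A relative measurement would report only how the phosphorylated signal changes between conditions; fitting the two rate constants, or stating what fraction of the receptor pool is modified, requires the absolute amounts of both forms.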
The tools developed in this thesis have set the stage for a number of applications in the
very near future. The first is analysis of signaling networks under the influence of
overexpressed oncogenes. For example, HER2 is a member of the EGFR family of
receptor tyrosine kinases and is routinely overexpressed in many types of cancer. The
increased kinase activity contributed by the overabundance of this protein presumably
amplifies the signaling events throughout the network and biases the cells towards more
extreme phenotypic responses. These phenotypic responses have been observed on the
large scale as increased tumor aggressiveness and invasiveness, contributing to decreased
patient survival. With absolute quantification, a richer picture of the signaling
abnormalities will be possible, which may inform more effective pharmaceutical
intervention. The second is analysis of the effects of pharmaceutical interventions on
cellular signaling. An ideal pharmaceutical compound would selectively destroy cells
with dominant aberrant signaling and leave normal cells untouched. This ideal is almost
never reached and instead researchers must settle for understanding how the compound
impinges on both normal and tumor cell populations. Studying the signaling in normal
cells paints a picture of how the rest of the patient’s tissues will respond, while cells with
altered signaling networks will simulate tumor tissues. A deeper level of analysis is also
afforded by the studies of altered cells. Oftentimes putative pharmaceutical compounds
do not attain the predicted efficacy once they are moved into animal models. In-depth
studies of the signaling responses for these altered cells give researchers clues as to how
the cells are compensating. The worst cases are those where the disease develops
resistance after an initial stage of promising remission. Even though the vast majority of
the original population of cells may be sensitive to the drug, a small resistant minority may then
grow out and take over, leaving predominantly resistant tumor cells.
The tools developed here will also aid in the analysis of more directed signaling
measurements following stimulation. Previously, the resolution of time points was
limited by technical variation inherent in the sample preparation protocol. Recent
improvements to the protocol involve flash-freezing samples to ensure more exact
endpoints. These techniques are ushering in experiments aimed at very short time points
following stimulation to permit very early dynamics of the system to be distinguished.
These dynamics will likely involve phosphorylation sites on the receptor itself and only
the most closely associated adaptor proteins. Secondary or tertiary signaling molecules
are left out of the picture. Furthermore, feedback loops responsible for downstream
regulation will also presumably take longer to initiate, thus distinguishing their
contributions. Such fine-scale measurements of the network have not been made
previously and hold significant promise of a more refined network model of early events
in the receptor-initiated signaling cascade.
Studying cellular signaling networks and their constituent proteins holds great potential
for understanding disease mechanisms. It is the hope of the author that the work in this
thesis will facilitate discoveries far beyond those discussed or even envisioned.