Tools for Investigating Cellular Signaling Networks by Mass Spectrometry

By Timothy Gordon Curran

B.S. Computer Science, California Institute of Technology (2008)
B.S. Mechanical Engineering, California Institute of Technology (2008)

Submitted to the Department of Biological Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2014

© Massachusetts Institute of Technology 2014. All Rights Reserved.

Author: Department of Biological Engineering, May 1, 2014
Certified by: Forest M. White, Department of Biological Engineering, Thesis Supervisor
Accepted by: Karl Dane Wittrup, Co-Chair, Graduate Program Committee

This Doctoral Thesis has been examined by the following Thesis Committee:

Forest M. White, Ph.D., Committee Member and Thesis Supervisor, Professor of Biological Engineering, Massachusetts Institute of Technology
Bruce Tidor, Ph.D., Committee Chair, Professor of Biological Engineering and Computer Science, Massachusetts Institute of Technology
Douglas Lauffenburger, Ph.D., Committee Member, Professor of Biological Engineering, Massachusetts Institute of Technology

Tools for Investigating Cellular Signaling Networks by Mass Spectrometry

By Timothy Gordon Curran

Submitted to the Department of Biological Engineering May 2014 in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Abstract

Mass spectrometry has become the tool of choice for proteomics. Its unrivaled coverage and reproducibility have positioned it head and shoulders above competing techniques for analyzing protein expression and post-translational modification. With the increased popularity comes a flood of new research applications, each with its own biological motivations and goals.
To ensure that mass spectrometry-based proteomics can be useful to as many biological questions as possible, it is of utmost importance to ensure high data quality. This research focuses on two general stages of the typical proteomics workflow and introduces tools to facilitate effective target screening, follow-up analysis, and more precise measurements. This new pipeline is then demonstrated in a case study of Epidermal Growth Factor Receptor (EGFR) signaling and phenotype prediction.

The quantity of proteomic mass spectrometry data available from a single analysis has increased exponentially as new generations of instruments become quicker and more sensitive. This deluge of data leaves many tempted to forgo time-intensive manual validation of database-identified targets in favor of global data set quality statistics. Particularly in the realm of post-translational modifications, long lists of putative matches are often reported with little or no scan-specific validation. Such practices no longer provide assurance that any single identified target is indeed correct, leaving researchers vulnerable to expending vast resources chasing false positives. The usual argument is that manual validation is too time-intensive to be carried out for each and every identification. To remedy this problem we have introduced the Computer Assisted Manual Validation (CAMV) software package, which expedites the procedure by preprocessing the database results so as to remove the tedious steps associated with the validation task and recruit human judgment only for the final quality decision. This approach has drastically decreased the time required for manual validation; a task that used to take weeks is now completed in hours.

Another focus of this research is the development of a multiplex, multisite absolute quantification method, which has improved the quality of quantitative proteomic mass spectrometry data.
Absolute site-specific data allow many more biological hypotheses, including questions of phosphorylation stoichiometry, to be tested directly with a single mass spectrometry experiment. This technique has been applied to the EGFR system to better understand signaling downstream of three distinct ligands. These ligands all bind the same receptor yet elicit different phenotypes, suggesting differential information processing. The analysis showed unique patterns of receptor phosphorylation following subsaturating ligand treatment. At saturating doses, however, the same pattern of phosphorylation is produced regardless of ligand, although the magnitude of that pattern is still ligand-dependent. In this regime, the adaptor proteins were still able to retain ligand-specific phosphorylation patterns, which are presumably responsible for the differential phenotypes. The data set also permitted the identification of signals important for the regulation of only one of the two phenotypes examined.

Thesis Supervisor: Forest M. White
Title: Professor of Biological Engineering

Acknowledgments

I am very fortunate to have found such a great fit in the Department of Biological Engineering. This is in no small part due to the mentorship of my advisor Forest White. In his lab, I have progressed from a pure engineer with a computational focus into a more competent scientist capable of designing and executing my own experiments. His willingness to let me develop methods and explore new areas has made my time at MIT a fantastic learning experience. My thesis committee members, Douglas Lauffenburger and Bruce Tidor, have provided invaluable advice throughout my graduate career. I consider myself very fortunate to have such accomplished mentors guiding my scientific growth.

Members of the White Lab, past and present, have been instrumental at many stages of this work. Their insight, experience, and advice along the way have positively influenced every aspect of my projects.
When I first joined the lab, senior graduate students and post-docs took the time to train me in laboratory techniques ranging from basic to sophisticated. During the formulation stages of my project, their experience and knowledge helped me avoid pitfalls and develop more informative experiments. Members of the White Lab were also the test subjects for early versions of the spectral validation software. Their constructive feedback has helped to improve CAMV to the level at which it exists today.

The MIT community has provided a fantastic environment in which to learn and live. The BE class of 2008 were some of the first people that I met at MIT and they continue to be among my closest friends. The Sidney-Pacific Graduate Community has been another rich source of inspiration throughout my graduate career. The strong sense of community fostered by its student government provided me the opportunity to grow as a leader and make lifelong friends along the way. The success of this community is in no small part due to the wonderful housemasters (Roger and Dottie Mark) and associate housemasters (Roland Tang and Annette Kim), whom I got to know very well through my involvement in SPEC.

Finally, my friends and family have provided strong support and inspiration. My family ignited my interest in science and engineering from a very young age and gave me the opportunity to pursue it at the highest levels. My friends, from MIT, Caltech, and beyond, have always been there to supply encouragement and much-needed perspective; for that I am supremely grateful. Last, but not least, my fiancée Chelsea has been an invaluable source of love and support throughout our time together.

Table of Contents

Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
1.1 General Cancer Background
1.2 Breast Cancer Specific Background
1.2.1 TKI Inhibitors as Breast Cancer Treatments
1.3 EGFR/HER2 as Target Signaling Pathway
1.3.1 Phenotypes
1.3.2 Ligands
1.3.3 Key Downstream Signaling Nodes
1.4 Mass Spectrometry
1.4.1 MSMS Peptide Analysis
1.4.2 Sample Purification (IP/IMAC)
1.4.3 Instrument Types
1.4.4 Analysis Types
1.4.5 Labeling Techniques
1.5 Absolute Site-Specific Quantification
1.5.1 Motivation for Absolute Quantification
1.5.2 Previous Methods
1.6 Bibliography
2 Computer Assisted Manual Validation (CAMV)
2.1 Introduction
2.1.1 Statistical Approaches to Assessing Quality
2.2 Manual Validation of MS/MS Data
2.3 Computer Aided Manual Validation Package
2.3.2 Exemplary Application to Tyrosine Phosphorylation Analysis
2.4 Conclusion
2.5 Appendices
2.5.1 Package Availability
2.5.2 Version Information
2.5.3 Mascot Search Parameters
2.6 Bibliography
3 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
3.1 Introduction
3.2 Methods
3.2.1 Multiple Reaction Monitoring (MRM):
3.2.2 Parallel Reaction Monitoring (PRM):
3.3 Results
3.3.1 Assessing the requirement for standard peptides for absolute quantification of tyrosine phosphorylation
3.3.2 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)
3.3.3 Performance of the method
3.3.4 Isolation Width Effects
3.3.5 Absolute Quantification of Tyrosine Phosphorylation in the EGFR signaling network
3.4 Conclusion
3.5 Acknowledgments
3.6 Appendix Tables
3.7 Bibliography
4 EGFR Signaling Pathway Downstream of Ligands
4.1 Introduction
4.2 Methods
4.2.1 Migration Assay:
4.2.2 Proliferation Assay:
4.2.3 Mass Spectrometry
4.3 Results
4.3.1 MCF10A 24-hour Phenotypes
4.3.2 LC-MS Absolute Quantification
4.3.3 EGFR Pattern of Phosphorylation
4.3.4 Adaptor Protein Patterns of Phosphorylation
4.3.5 PLSR Predicts TGFα Phenotypic Responses from AREG and EGF Data
4.3.6 PLSR to Model Phenotypes Predicts Stronger ERK2 pY187 Contribution to Proliferation
4.4 Conclusion
4.5 Appendix Figures
4.6 Appendix Tables
4.7 Bibliography
5 Conclusions and Future Directions

List of Figures

Figure 1-1: Quadrupole Potential and Ion Dynamics.
Figure 1-2: a) Stable regions of U-V space for a particular m/z.
Figure 2-1: Sample complexity prevents identification of peptides from complex samples based solely on accurate precursor mass measurements.
Figure 2-2: MASCOT score only provides a rough estimate of the quality of the spectral assignment.
Figure 2-3: Variation in percentage of unassigned peaks for a given MASCOT score can also be visualized at the dataset level.
Figure 2-4: Examples of spectra that were ultimately accepted into the data set.
Figure 2-5: Exemplar yellow identification where the user changed the identification made by the computer.
Figure 2-6: Exemplar red identification where the user rejected an identification initially included by the computer.
Figure 2-7: The Graphical User Interface (GUI) for the MATLAB implementation of CAMV.
Figure 2-8: Peptide Sequence Identification and Validation Workflow.
Figure 2-9: All peptide matches following user validation of the dataset.
Figure 2-10: Distribution of user and CAMV decisions versus MASCOT score.
Figure 3-1: Variability of Signal Intensity from technical replicates.
Figure 3-2: Peptide Specific Signal Intensity.
Figure 3-3: MARQUIS Method for Absolute Quantification.
Figure 3-4: MRM Quantification of endogenous tyrosine phosphorylated peptides.
Figure 3-5: Details of PRM Data Collection Method.
Figure 3-6: Quality Comparisons.
Figure 3-7: Q1 isolation Width Comparison.
Figure 3-8: Stoichiometric comparison of site-specific phosphorylation.
Figure 3-9: Ligand Comparison.
Figure 4-1: Ligand-Specific Phenotypic Response.
Figure 4-2: Full-Scan MARQUIS method for absolute quantification.
Figure 4-3: Ligand-Specific EGFR Phosphorylation.
Figure 4-4: Ligand-Specific Adaptor Phosphorylation.
Figure 4-5: Partial Least Squares Regression Predicts TGFα-induced Migration and Proliferation.
Figure 4-6: Proliferation is more sensitive to UO126 treatment than migration.
Figure 4-7: Similar EGFR Phosphorylation Pattern at High Ligand Dose.
Figure 4-8: Phenotypic prediction PLSR models.
Figure 4-9: ERK2 pY187 Contributes More Strongly to Predicting Proliferation than Migration.

List of Tables

Table 3-1: MRM Quantification of endogenous tyrosine phosphorylated peptides.
Table 3-2: PRM Quantification of endogenous tyrosine phosphorylated peptides.
Table 3-3: Comparison of MRM and PRM quantification for 13 tyrosine phosphorylated peptides.
Table 3-4: Ligand-Specific EGFR Absolute Quantification.
Table 4-1: iTRAQ labeling scheme.
Table 4-2: Site Specific Absolute Quantification Data.

1 Introduction

Engineering design principles teach that “form follows function.” However, evolution does not directly have that luxury. Instead, particular design iterations are arrived at by chance, and the success of each design in self-replication determines its longevity. The results of this method of design are seen today in every aspect of biology. An example relevant to this thesis is that of environmental sensing and decision-making based on what is sensed. Inherently, this requires the concept of outside and inside, which already skips over the many iterations of blind-search optimization required to effectively separate space for an organism’s own purposes.
Once this partitioning has been accomplished, the organism’s ability to taste the waters of its environment requires a molecule capable of transmitting this information across the barrier. This is where cellular receptors come into play. There are many varieties that sense anything from chemicals to neighboring cells. Many times, the information about the extracellular environment is registered through a binding event at the extracellular portion of the receptor molecule, which drives a change in the intracellular portion. For the purpose of this thesis, the intracellular modification that occurs will be the addition of a group to the side chain of the protein receptor: in particular, the addition of a charged phosphate group to the previously uncharged side chain of a tyrosine residue1. The trick that biology must figure out is how to use this capability to create an advantage. In this case, the advantage comes in the form of information transfer that, when properly read, can clue the cell in to the best course of action. Such an adduct can serve a number of functions including activation/deactivation of the protein’s function, change in the protein’s localization2, or completion of a binding site for a protein complex3. Protein phosphorylation is the most common type of post-translational modification (PTM) used for signal transduction4.

The problem with any engineering advance is that no good deed goes unpunished. The clever receptor system that the cell has developed to sense the extracellular environment and inform its decisions is susceptible to the same random alterations from which its functions were first created. While the cell has compensatory mechanisms with which to combat some changes, others are just too much for it to contain. The carefully tuned levels of different proteins can be thrown askew by a virus that overexpresses a component in the signal transduction pathway (v-src)5.
The optimized receptor, which is only fully functional when ligand is present, can be truncated to a constitutively active form (EGFRvIII)6. It is these sorts of examples that make cancer a disease of entropy. A beautifully orchestrated system can be corrupted from the inside by random mutations to the very same constituents that have served the cell so well. In some sense, it is merely a continuation of the evolutionary process. The one cell that gains the ability to divide without bound will reproduce more than its neighbor. The clonal population can then overrun normal cells to the detriment of the tissue and organism to which it belonged.

It is precisely this scenario that makes cancer so hard to treat. The offending growth is the result of mutations to otherwise normal cells from the very same tissue. The growth initially maintains many characteristics identical to the original tissue, making it difficult to remove selectively. As more and more controls are lifted, the growth can spread and even send out landing parties to colonize distant sites in the body. Many researchers seek to understand the steps of this process so as to develop pharmaceutical inhibitors. The complication is that each cancer is different. Different subclasses of tumors in the same tissue may behave very differently depending on the cell type from which they originated, which mutation they exploit, and a host of environmental factors not completely understood.

The work in this thesis falls at the very early stage of this whole process. It works from the idea that in order to effectively treat such a complex disease, one must first thoroughly understand the system. Only then can an effective therapy be envisioned to treat the corrupted system.
The system of choice is the Epidermal Growth Factor Receptor (EGFR) signaling pathway, which senses the extracellular presence of growth factor ligands, translates those signals into phosphorylation events, and recruits and activates signaling effectors that ultimately alter transcription. The newly activated genes are responsible for cancer-relevant phenotypes including growth, differentiation, and migration.

Studying the proteins in this network at such a detailed level requires tools capable of specific and reliable measurement. Developments in mass spectrometry have provided just such a tool, which is capable of analyzing protein expression levels and, more recently, post-translational modifications simultaneously for many proteins of interest. These snapshots of cellular proteomic state provide a broad view, giving investigators insight into many coincident processes while avoiding a myopic focus on a single aspect of cellular function. Development of these tools has deluged researchers with new data for which verification and analysis are time consuming and often non-existent.

To take full advantage of the capabilities of mass spectrometry-based proteomics, data quality must be a top priority. This work approaches data quality on two fronts: 1) ensuring the quality of proteomic identifications made from exploratory mass spectrometry experiments, and 2) designing more informative targeted experiments for answering biological questions and building generalizable models. The first is motivated by the fact that, to be ultimately useful, large data sets are often shared between researchers through massive online repositories7. Widespread adoption of such a resource will only occur if the data are reliable and useful. Towards this end, database search engines provide heuristic quality measures along with their identifications8,9, but subsequent verification of those identifications can reveal underpredicted error rates, especially for identifications involving post-translational modifications10.
The second is motivated by utility maximization. Many mass spectrometry-based studies set out to catalogue rather than to explore biological implications. The utility of such a catalogue is limited, because little information about biological involvement can be directly derived. Multiplex labeling techniques have spurred comparison studies between biological conditions, giving researchers more flexibility in their experimental design. Data from such experiments have informed computational models that effectively predict cellular behavior11. Adding absolute measurement to mass spectrometry-based proteomics has been attempted to varying degrees12,13 with the overall goal of better understanding the stoichiometry present between proteins and PTMs. High throughput abundance measurements open the door to a whole new class of pertinent comparisons, further extending the biological applications for mass spectrometry-based proteomics.

1.1 General Cancer Background

In many ways, cancer is a disease of entropy. Slowly but surely throughout a lifetime, many small random genetic and epigenetic alterations occur in cells throughout the body: a slow descent into disorder. Biology has many mechanisms for correcting such mistakes, so most of these changes are caught early and rectified. Many of the others are fatal, effectively selecting against the cells that harbor them. However, every once in a while, one of these changes is not fatal and persists through to future generations of cells. When one of these changes confers a selective advantage, the harboring cell may grow faster or proliferate more readily. These cells take resources from neighboring normal tissue, ultimately leading to hyperplasia and, eventually, the formation of a cancerous lesion and the potential for metastasis to other sites in the body.
Populations in the developed world live longer by eradicating other preventable causes of death, but have little or no way to prevent the inevitable accumulation of changes that ultimately lead to cancer. According to 2008 GLOBOCAN data, there were 12.7 million cancer cases and 7.6 million cancer deaths during that year14. Of these, 56% of cases and 64% of deaths occurred in the developing world. This leaves 44% to occur in populations of the developed world, making their cancer incidence nearly twice as high15. Breaking the diagnosis data down, lung cancer is the most commonly diagnosed cancer worldwide with 1.61 million new cases (12.7% of total), perhaps due to behavioral factors such as tobacco use. Second most commonly diagnosed is breast cancer with 1.38 million (10.9% of total), making it far and away the most commonly diagnosed cancer in women (23% of all diagnoses), more than colorectal and cervical cancer combined. It is not surprising that breast cancer is also the leading cause of cancer death in women (14% of deaths). Third most common is colorectal cancer with 1.23 million cases (9.7% of total). Combining both sexes, the top three most common causes of cancer death are lung cancer (1.38 million, 18.2%), stomach cancer (738,000, 9.7%), and liver cancer (696,000, 9.2%)14.

These sorts of population-level statistics convey the pervasive nature of cancer as a global disease, but do little to explain the underlying mechanisms at work. A very comprehensive review of our understanding of cancer development was compiled in 2000 by Hanahan and Weinberg in their landmark paper detailing the “hallmarks of cancer”16, and they have recently released an updated version that expands upon their original work17. They revisit the six original hallmarks (1. Sustaining proliferative signaling, 2. Evading growth suppressors, 3. Activating invasion and metastasis, 4. Enabling replicative immortality, 5. Inducing angiogenesis, and 6.
Resisting cell death) and expand the list to include two new emerging hallmarks (1. Avoiding immune destruction, and 2. Deregulating cellular energetics) and two enabling characteristics (1. Tumor-promoting inflammation, and 2. Genome instability and mutation). As the title suggests, the authors do a very thorough job of outlining how each of these individual characteristics plays a crucial role in permitting the unchecked proliferation and survival of tumors. Their work provides a fantastic cellular and molecular biological view of cancer that is so often lost in population-level statistics.

1.2 Breast Cancer Specific Background

Breast cancer is the most common cancer in women, both in the developed and developing world, accounting for 10% of all newly diagnosed cancers: 1.1 million cases are diagnosed and 410,000 patients die each year18. These staggering numbers hide the fact that breast cancer is an intensely heterogeneous disease. In a 2011 review, five classifications were presented: Luminal A (ER+, PR+/-, HER2-), Luminal B (ER+, PR+/-, HER2+), Basal (a.k.a. “Triple Negative” ER-, PR-, HER2-), Claudin Low (ER-, PR-, HER2-), and HER2 (ER-, PR-, HER2+); along with traditional models that fall into these classes19. Each class comes complete with its own unique treatment susceptibility and patient prognosis. A 2006 survey of 51 breast cancer cell lines sought to catalogue their genomic and molecular characteristics20. The work showed that, although breast cancer can be a remarkably heterogeneous disease, common themes in genomic and proteomic abnormalities do arise. The authors concluded that genomic heterogeneity and copy number abnormalities were similar between established cell lines and primary tumors, but that cell lines tend to carry more aberrations and higher levels of overexpression than their primary tumor counterparts20, perhaps due to bias in cell line establishment towards late stage disease20,21.
Basal cell lines also show segregation not seen in primary tumors22, which may be due to contamination in primary tumor samples or selection bias during culture20. To further complicate the picture, some cell lines previously characterized as breast cancer are now of questionable lineage. For example, MDA-MB-435, previously labeled a spontaneously metastatic breast cancer cell line, may actually be derived from a melanoma metastasis19. This work considers the disease at its earliest stages, when molecular signaling events reflect cells that are almost normal, having only undergone the initial steps toward cancer. After multiple genetic changes have led cells too far astray, the hope of treating their signaling abnormalities becomes unlikely. For this reason, the MCF10A cell line has been chosen. MCF10A cells are epithelial in nature, diploid, and derived from spontaneously immortalized cells in long term culture of fibrocystic cells removed during a mastectomy23. They have high EGFR surface expression (~100,000 copies/cell), making them receptive to a family of EGFR-specific ligands known to drive phenotypically cancerous behavior. MCF10A cells are known to migrate differentially in response to these ligands24, which provides a convenient downstream readout on cellular behavior. Numerous studies have argued for25 and against26,27 the prognostic value of EGFR status in breast cancers. EGFR gene amplification has been reported to occur in 6-18% of breast tumors28,29 and co-occur with EGFR protein overexpression28. EGFR overexpression also alters the chances that a breast tumor will overexpress HER2, raising the likelihood from 16% to 26%29. The MCF10A cell line also does not overexpress HER2, which provides an attractive background to investigate the effects of this particular genetic manipulation common to many breast cancers.
1.2.1 TKI Inhibitors as Breast Cancer Treatments

Small molecule tyrosine kinase inhibitors are the cornerstone of intracellular pharmaceuticals targeted at decreasing EGFR/HER2 driven signaling30,31. Molecules such as Gefitinib are ATP analogs that inhibit EGFR activity by blocking the kinase’s ATP-binding pocket, thus preventing phosphate transfer to the substrate32. Gefitinib (a.k.a. Iressa, ZD1839) was designed to bind specifically to EGFR with high affinity (IC50 ~30nM), but not to HER2 (IC50 ~10µM) or other kinases in the EGFR signaling pathway (Raf, MEK-1, or MAPK where IC50 >10µM)33. Preclinical studies with tamoxifen-resistant cells show that 1µM Gefitinib treatment corresponded to loss of EGFR phosphorylation as well as decreased HER2 phosphorylation32. The same cells also show drastic decreases in MAPK phosphorylation and cell growth when treated with Gefitinib34. Lapatinib (a.k.a. Tykerb, GW 2016) is a dual specificity tyrosine kinase inhibitor designed to bind both EGFR (IC50 10.2nM) and HER2 (IC50 9.8nM) with growth inhibition IC50 values <160nM for EGFR- and HER2-overexpressing cell lines35.

1.3 EGFR/HER2 as Target Signaling Pathway

In 1962, Stanley Cohen reported that purified extracts from mouse salivary glands could induce gross anatomical changes far beyond those previously observed for nerve cells. Treatment with the extract promoted precocious eye opening and tooth eruption in newborn mice36. He was able to isolate a factor with demonstrable biological effects that he termed “tooth-lid factor”. In the decades that followed, tooth-lid factor was renamed Epidermal Growth Factor (EGF) and its cellular membrane receptor was identified37,38. Upon treatment of A-431 cells with EGF, the EGF Receptor (EGFR) was observed to become tyrosine phosphorylated, and the kinase competent EGFR complex was shown capable of phosphorylating the purified receptor39.
Putting these facts together, Cohen and King proposed that EGFR was capable of both extracellular ligand binding and intracellular kinase activity37, giving EGFR the distinction of being the first Receptor Tyrosine Kinase (RTK). Not long after, Ullrich et al. confirmed close sequence homology between EGFR cDNA and v-erb-B by cloning the complete EGFR cDNA40, building upon prior partial sequencing work41. In the same year, the C-terminal tyrosines were confirmed as autophosphorylation sites and some were shown to be lacking in the kinase-dead v-erb-B due to truncation42. In 1989, Honegger et al. produced evidence that receptors phosphorylated each other following ligand stimulation. Using a series of kinase-inactive and ligand-binding-incompetent EGFR mutants, the authors showed that such mutants still became phosphorylated – implicating them as substrates of competent endogenous EGFR molecules in the same cells43. The line of work leading to these discoveries provides several key insights for what is now known about EGFR activation. A model emerged of a surface receptor capable of transmitting information about the cell’s environment through the membrane by phosphorylating key residues on its own intracellular domain.

By 1990, the family of growth factors that bound to EGFR had grown to include, among others, Transforming Growth Factor α (TGFα) and Amphiregulin (AREG)44. EGF45 and TGFα46 had both been synthesized completely, facilitating large scale production and thereby accelerating further study. NMR studies of these ligands confirmed very similar structures, including placement of six key cysteine residues forming three disulfide bonds, between the two molecules EGF47,48 and TGFα49,50. The exact order of ligand binding and dimerization events is still in debate. It is easiest to picture a situation where a ligand binds a receptor, taking it to an active conformation capable of dimerization with another ligand-bound receptor.
This simplistic image is challenged by concave-up Scatchard plots, which suggest low- and high-affinity binding sites on EGFR, and by the computational models that attempt to explain them51. Recent work provides a mechanistic explanation: negative cooperativity in EGF binding52, which requires that trimers (one ligand, two receptors) exist. These trimers do not bind an additional EGF ligand with the same affinity as the first. Structural evidence of trimers as well as asymmetric tetramers (two ligands, two receptors) exists for Drosophila EGFR (dEGFR) binding its ligand (Spitz)53. The authors posit that differential ligand affinities may stabilize different receptor-ligand configurations, which ultimately explain the phenotypic differences observed following ligand treatment53. The juxtamembrane helices of EGFR also play a role in the dimerization process. Initial motivation for this connection came from the observation that a single HER2 point mutation was associated with cancer54, but mutating the analogous residues in EGFR has not been shown to affect EGFR function55. Recent work with bipartite tetracysteine reagents has detected ligand-specific structural differences in these juxtamembrane helices56, suggesting a mechanism for differential network activation and phenotype manifestation.

The complexity of EGFR signaling is not constrained to the ErbB family. As early as 1986, studies were under way to test the cross-reactivity of receptor tyrosine kinases, including EGFR. The studies showed that EGF-treated EGFR was capable of strong and specific tyrosine phosphorylation of progesterone receptor (PR), while platelet-derived growth factor receptor (PDGFR) was not57.
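The concave-up Scatchard behavior mentioned above can be illustrated with a short numerical sketch. It assumes the simplest possible picture, two independent site classes with different affinities, which is not the negative-cooperativity model of the cited work; both produce an upward-curving plot. The site parameters (`HYPOTHETICAL_SITES`) and the concentration grid are invented for illustration and are not measurements from this thesis.

```python
# Scatchard analysis plots bound/free (y) against bound (x).
# A single site class gives a straight line with slope -1/Kd; a mixture of
# high- and low-affinity sites gives a curve whose slope becomes less
# negative as binding increases (concave up).

# Hypothetical (Bmax, Kd) pairs in arbitrary units: not measured values.
HYPOTHETICAL_SITES = [(10.0, 0.1), (90.0, 5.0)]
FREE_LIGAND = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]


def bound(free, sites):
    """Total bound ligand for independent site classes [(Bmax, Kd), ...]."""
    return sum(bmax * free / (kd + free) for bmax, kd in sites)


def scatchard_points(free_concs, sites):
    """Return (bound, bound/free) pairs, i.e. the Scatchard axes."""
    return [(bound(f, sites), bound(f, sites) / f) for f in free_concs]


def slopes(points):
    """Slopes of the segments joining consecutive Scatchard points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


if __name__ == "__main__":
    for b, ratio in scatchard_points(FREE_LIGAND, HYPOTHETICAL_SITES):
        print(f"bound={b:7.2f}  bound/free={ratio:7.2f}")
    print("segment slopes:", [round(s, 3) for s in slopes(scatchard_points(FREE_LIGAND, HYPOTHETICAL_SITES))])
```

With these assumed parameters the segment slopes start steep and flatten toward the low-affinity limit, which is the curvature that motivates the two-affinity interpretation. Distinguishing that interpretation from negative cooperativity requires more than the shape of the plot.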
Further results of the same 1986 cross-reactivity study showed that the insulin receptor (IR) was also capable of phosphorylating only one of the two PR subunits, thus lending additional weight to the hypothesis that cross-talk between receptors was indeed a controlled process that carried the potential for additional information encoding.

In the years since EGFR’s initial characterization, a more comprehensive view of signal transduction initiation has been constructed. Ligand binding to the receptor stabilizes the ‘open’ conformation of the extracellular domain of the receptor and facilitates receptor-mediated dimerization58-60. In the absence of ligand, the receptor extracellular domain is in a ‘tethered’ conformation prohibiting dimerization58,61,62. Once dimerization has occurred, the intracellular kinase domains of the two receptors are in close enough proximity to reliably transphosphorylate the C-terminal tails58. The completed pTyr motifs then provide docking sites for proteins containing SH2-domains. Jones et al. extensively tested known SH2-domains from putative interacting adaptor proteins to better understand the complexity of the second-layer signaling events63. These recruited adaptor proteins provide their own set of motifs that can further recruit additional proteins such as kinases, phosphatases, and other critical signaling proteins indirectly to the phosphorylated receptor.

1.3.1 Phenotypes

EGFR sits atop a signaling network that ultimately controls several cancer-relevant phenotypes such as proliferation64,65, migration66,67, and differentiation68. These key phenotypes are all the result of ligand binding events at the plasma membrane, yet the intermediate signal processing leads to ligand-specific phenotypic responses24,66. A study elucidating similar ligand-dependent behavior in the context of receptor trafficking was carried out by Roepstorff et al. and showed that different EGFR ligands were capable of driving differential fates for EGFR69.
The connections they make between ligand properties and upstream signaling events provide the tip of the iceberg for understanding the ultimate behavior of the overall cellular phenotypic response.

1.3.2 Ligands

Of the eight ligands known to bind EGFR, this work focuses on three ligands that have been shown to bind exclusively to EGFR and no other HER-family member: EGF, TGFα, and AREG70,71. These ligands share a common cysteine pattern in their primary structure (CX7 CX4–5 CX10–13 CXCX8 C) and form identical cysteine bridges C1-C3, C2-C4, C5-C671. Although some of the ErbB family members have been shown capable of juxtacrine signaling72, most ligand signaling is, presumably, the result of autocrine and paracrine signals after shedding from cells by cleavage of transmembrane protein precursors71. The shedding process is mediated by members of the ADAM family of matrix metalloproteases; ADAM 10 being responsible for the release of EGF and ADAM 17 for AREG and TGFα73. More recently, other soluble metalloproteinases have been shown to cleave EGFR ligands in organ specific contexts74,75.

Despite using the same cellular receptor, these ligands have varied effects on cellular phenotypes. This is true even when doses are saturating, suggesting intrinsic differences in ligands beyond affinity70. At the protein level, higher levels of EGFR Y1045 phosphorylation have been observed with EGF treatment as compared to AREG, and the Y1045F mutation potentiates EGF’s effect more than AREG’s76. EGFR pY1045 is a putative binding site for Cbl (a ubiquitin ligase)77, so it is not surprising that the rate of receptor internalization and the recycling decision depend strongly on the identity of the ligand. It has been shown that EGF leads to greater degradation following internalization whereas TGFα favors recycling and AREG never gets internalized in the first place69.
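The shared cysteine pattern quoted earlier in this section (CX7 CX4–5 CX10–13 CXCX8 C, with bridges C1-C3, C2-C4, C5-C6) translates directly into a regular expression. The sketch below is illustrative only: the function names are hypothetical, and the example sequence is the commonly cited 53-residue mature human EGF sequence, reproduced from memory as an assumption rather than taken from this thesis.

```python
import re

# One capture group per cysteine so each position can be reported.
# X7, X4-5, X10-13, X1 and X8 spacers follow the pattern quoted in the text.
EGF_LIKE = re.compile(r"(C).{7}(C).{4,5}(C).{10,13}(C).(C).{8}(C)")

# Disulfide connectivity stated in the text: C1-C3, C2-C4, C5-C6.
BRIDGES = ((1, 3), (2, 4), (5, 6))


def cysteine_positions(seq):
    """Return 1-based positions of the six motif cysteines, or None if absent."""
    match = EGF_LIKE.search(seq)
    if match is None:
        return None
    return [match.start(i) + 1 for i in range(1, 7)]


def disulfide_pairs(seq):
    """Map the C1-C3, C2-C4, C5-C6 connectivity onto sequence positions."""
    positions = cysteine_positions(seq)
    if positions is None:
        return None
    return [(positions[a - 1], positions[b - 1]) for a, b in BRIDGES]


if __name__ == "__main__":
    # Assumed mature human EGF sequence (illustrative input).
    egf = "NSDSECPLSHDGYCLHDGVCMYIEALDKYACNCVVGYIGERCQYRDLKWWELR"
    print(cysteine_positions(egf))
    print(disulfide_pairs(egf))
```

A match only shows that a sequence has the right cysteine spacing; it says nothing about whether the peptide folds correctly or binds EGFR.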
In MDCK cells, AREG is associated with a redistribution of E-cadherin and induction of a spindle-like morphology whereas TGFα treatment causes no such changes78.

1.3.2.1 EGF

When EGF binds EGFR it stabilizes an ‘open’ conformation of the extracellular domains that permits dimerization with another ErbB family member in the open conformation. The binding event also induces changes in the intracellular portion of the receptor molecule, activating the kinase domain, which phosphorylates the partner molecule of the dimer79. These phosphorylation sites complete binding motifs for a wide range of SH2- and PTB-domain containing proteins, which are subsequently recruited to the receptor3,80. For instance, Cbl (an E3 ubiquitin ligase) is most efficiently recruited to a motif containing EGFR pY104577. These adaptor proteins serve as the first step in a number of signal transduction cascades downstream of EGFR, including Ras, MAPK, Src, STAT 3/5, PLCγ/PKC, and PI3K/AKT81. Each of these pathways has a role to play in cellular phenotypes including proliferation, apoptosis, differentiation, and migration82. Following stimulation with EGF, EGFR is endocytosed and a fraction is degraded. After 10 minutes of EGF stimulation, almost 40% of EGFR colocalizes with EEA1 (an endosomal marker) and after 30 minutes surface levels have decreased by 50%69. It is thought that the rapid downregulation through trafficking limits EGF’s ability to potentiate phenotypic response83,84.

1.3.2.2 TGFα

TGFα was first reported originating from three different cultured human tumor lines and identified by its ability to confer a transformed phenotype to untransformed fibroblasts85. TGFα expression has since been mapped to the epithelial cells of many tissues, including: digestive tract, liver, pancreas, kidney, thyroid, adrenal, skin, mammary gland, and genital organs86. TGFα has many similarities with EGF; it is very close in size and structure49,59 and binds EGFR with similar affinity87.
Early studies suggested a different binding conformation for TGFα, because TGFα-binding could be inhibited by antibodies raised against EGFR, but EGF-binding could not88. Twenty years later, it was shown that there exist subtle structural differences in the extracellular domain II of TGFα-bound EGFR when compared with EGF-bound EGFR70, inviting the assumption that structural differences are also present in the intracellular domain of the molecule. Such differences were confirmed in the juxtamembrane helical domains of EGFR using bipartite tetracysteine display56 and lend credence to the hypothesis. TGFα’s difference is most notable in the trafficking patterns it induces in EGFR. TGFα-bound EGFR is endocytosed similarly to EGF-bound EGFR69. Unlike EGF, TGFα’s pH-dependent affinity for EGFR89 permits the ligand-receptor complex to dissociate in the acidic environment of the endosome, ultimately recycling EGFR to the surface instead of destining it for degradation69,89,90. Three additional histidine residues present in the receptor-binding portion are suspected to be responsible for this additional pH-dependence of TGFα91. Mutant EGF containing these histidines shows similar recycling and mitogenic potential to TGFα90,92.

1.3.2.3 AREG

AREG is a much larger molecule than either TGFα or EGF with 78 amino acids in its truncated form (84 in a non-truncated form). The C-terminal portion of the molecule bears close resemblance to EGF and shares its pH-insensitive affinity for EGFR. The additional N-terminal portion is very hydrophilic and contains a disproportionate number of arginine, lysine, and asparagine residues93. Initial studies placed the EGFR affinity of AREG lower than that of EGF94; subsequent studies showed comparable affinity for recombinant ligands95,96, and further work showed that AREG does not induce dimerization as efficiently97.
The heparin-binding domain (residues 26-44) is homologous to that of HB-EGF and has been shown to be crucial to its mitogenic capacity98,99. These potential differences translate into numerous reports that AREG is not as potent at inducing EGFR phosphorylation in cell lines69,100,101.

1.3.3 Key Downstream Signaling Nodes

The EGFR signaling network consists of many proteins that serve to interpret the extracellular ligand identity and concentration. Some of these proteins are additional kinases responsible for phosphorylating a unique set of substrates to orchestrate an arm of the network and some are phosphatases responsible for removing those modifications. Others are adaptor proteins that contain phosphotyrosine-specific binding sites that serve to bring multiple signaling components into close proximity.

1.3.3.1 ERK1 and 2

Extracellular Regulated Kinase (ERK) 1 and 2 are members of the Mitogen Activated Protein Kinase (MAPK) family of cytosolic kinases that are activated through a signaling cascade (Ras-Raf-MEK-ERK) downstream of RTKs102; EGF-family ligand activation of EGFR is one example. Upon stimulation, the receptor recruits Shc, Grb2, and finally Sos1/2, which serves as the Ras guanine-nucleotide exchange factor (GEF)4. Ras activates Raf (A-, B-, C-Raf) through a dimerization process103, which then activates MEK1/2: the dual-specificity kinases whose only known physiological targets are ERK1/2104. The presumed purpose of this three-tiered architecture is rapid signal amplification4. MEK concentration has been recorded to be 100-fold higher than that of Raf, while ERK is 2/3 that of MEK105. The amplification procedure is able to rapidly induce phosphorylation of 60% of endogenous ERK105. ERKs’ broad substrate specificity gives them the ability to phosphorylate over 175 targets106 in many protein classes, including: transcription factors, kinases, and other enzymes107,108.
This wide selection of cellular targets links ERKs’ activities to many phenotypes, such as proliferation109, migration67, differentiation, metabolism and cell survival4. These phenotypes are critical to cancer progression and ERKs’ importance is apparent since many members of the signaling cascade are subject to mutation. Looking upstream, overall 30% of cancers carry a Ras mutation110 and 7% a Raf mutation111. The statistics are even more jarring when subtypes of cancers are considered. KRas is mutated in an overwhelming 58% of pancreatic cancer, 33% of colorectal cancer, and 31% of biliary cancer, while NRas is mutated in 18% of melanomas112. Raf mutations are also concentrated in specific subtypes of cancer, specifically 40-60% of melanomas, 40% of thyroid cancers, 30% of ovarian cancers, and 20% of colorectal cancers111,113. Even when no mutation is present in these core proteins, overactivation of the cascade is common in many cancers4. ERK1 (379 aa) and ERK2 (360 aa) are very similar in that they share 84% of their sequence114, are ubiquitously expressed102, and are activated in parallel under all known conditions109. Differential viability following individual knockdown suggests separate functions of each ERK, but may be due to higher expression of ERK2 in many cells, permitting it to compensate for a loss of ERK1114. Relative activation of the two isoforms is proportional to their expression level115 and whether isoform-specific functions truly exist is still an open question4. Activation loop phosphorylation has proven to be a common regulatory method for protein kinases116. Not surprisingly, phosphorylation of the activation loop of ERK by MEK is required for activity. All MAPKs contain the TxY sequence in their activation loop and for both ERK1 and ERK2 the sequence is TGY. The tyrosine is phosphorylated first and the production of the dual phosphorylated form of ERK is non-processive117,118.
Tyrosine phosphorylated ERK dissociates from MEK and then reassociates so that MEK can phosphorylate the neighboring threonine119. Studies with site-specific phosphorylation inhibitors claim that MAPK is only active when both the threonine and the tyrosine are phosphorylated. The kcat of ERK2 has been reported to increase 50,000-fold upon activation and specificity increases 600,000-700,000 fold120.

1.3.3.2 Akt

Akt, also known as Protein Kinase B (PKB), was identified in a cloning screen intended to find proteins with kinase domains similar to PKA and PKC121,122. The Akt protein family consists of three known members; the latter members AKT-2123 and AKT-3124 were identified in similar subsequent studies. PKB isoforms β (AKT-2) and γ (AKT-3) share 82% sequence identity with the α (AKT-1) isoform124 and are differentially expressed at both the mRNA level and the protein level125-127. PKB α and β isoforms are widely expressed in brain, thymus, heart, and lung121,122,125 whereas the γ isoform is more restricted with high expression limited to brain, testes, lower heart, spleen, lung, and skeletal muscle126. Orthologs have been found in flies128 and worms129, which share similar structure with the exception of the C-terminal tail that is only found in some species and isoforms130. Akt’s key role is as a serine/threonine kinase121,131 responsible for phosphorylating the consensus motif RXRXXS/T132. The phosphorylation motif identified for Akt is present in many apoptotic machinery proteins, as well as transcription factors, allowing it to suppress apoptosis133 and promote survival in concert with PI3K130. c-Akt is recruited through its N-terminal PH domain130 to 3’-phosphorylated phospholipids134, the kind produced by activated PI3K135. This binding event has been shown to be critical for activation of Akt136,137, and it has been suggested that binding induces an intramolecular conformational change that permits activating phosphorylation138.
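As a concrete reading of the RXRXXS/T consensus quoted above, the sketch below scans a protein sequence for candidate Akt sites. The function name is hypothetical, and the example input is the N-terminus of GSK3β as commonly cited, included here as an assumption. A consensus match is only a candidate, not evidence that Akt phosphorylates the site in cells.

```python
import re

# R-x-R-x-x-[S/T]; the lookahead lets overlapping candidates all be reported.
AKT_CONSENSUS = re.compile(r"(?=(R.R..[ST]))")


def akt_motif_sites(seq):
    """Return (1-based acceptor position, acceptor residue) for each motif match."""
    sites = []
    for match in AKT_CONSENSUS.finditer(seq):
        acceptor_index = match.start() + 5  # sixth residue of the motif
        sites.append((acceptor_index + 1, seq[acceptor_index]))
    return sites


if __name__ == "__main__":
    # Assumed N-terminal residues of GSK3-beta (illustrative input).
    print(akt_motif_sites("MSGRPRTTSFAES"))
```

A real analysis would combine such a scan with experimental phosphorylation data, since many sequences match the consensus by chance.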
Further support for the hypothesis that lipid binding induces a conformational change permitting activating phosphorylation comes from the fact that Akt truncations lacking the PH domain are still capable of becoming phosphorylated130. The PH domain also serves to recruit Akt to the cellular membrane139 and mediate intermolecular interaction in multimers140, both of which are important for regulation. These interactions bring Akt into close proximity with kinases responsible for phosphorylating Thr-308 and Ser-473 (whereas other phosphorylation sites including Ser-124 and Thr-405 are basally phosphorylated)132. Thr-308 and Ser-473 phosphorylation events are critical to Akt activation131,139,141 and it has been shown that phosphomimetic mutagenesis at these sites partially activates Akt132. It was soon realized that Akt was the cellular homolog to the transforming oncogene produced by the AKT8 retrovirus – a fusion protein between Akt and a viral gag sequence125,131,142,143. Spatial control is lost with v-Akt, because it is constitutively targeted to the membrane through the viral gag sequence, which indirectly bestows constitutive kinase activity137. The importance of this targeting was confirmed when c-Akt was also found to be constitutively active when targeted to the membrane by myristoylation144. It is not surprising that induction of Akt’s kinase activity was first shown in response to growth factors including EGF137,144. EGFR activation leads to activation of PI3K, which phosphorylates lipids that recruit Akt130. Hence v-Akt shows significant decreases in ligand-dependence of activation144, as it is able to circumvent the upstream control in the pathway. Akt’s phenotypic implications have been extensively explored and heavily involve survival signaling130. Akt has been shown to be necessary for survival, because a dominant-negative form is able to abrogate IGF-1’s ability to promote survival. In the same cells, wild-type or constitutively active Akt permitted survival even without IGF-1145. As Datta et.
al point out in their review, constitutively active Akt does more than just promote survival; it has been shown to counteract many apoptotic signals130. The high level goals of survival and evasion of apoptosis are likely composed of many lower level responses, such as metabolic processes where Akt is often implicated146.

1.3.3.3 PI3K

The first indirect evidence for a connection between EGFR and phosphoinositide 3-kinase (PI3K) was observed when EGF was shown to increase radioactive phosphate labeling of an unusual phosphatidylinositol147. The phosphatidylinositol product turned out to be phosphatidylinositol-3,4,5-trisphosphate (PIP3) and has been shown to be a critical second messenger that recruits AKT and governs growth, proliferation, and survival148. The enzyme responsible for catalyzing this reaction is one of a large family collectively known as phosphoinositide 3-kinases (PI3Ks). They are heterodimers consisting of a regulatory subunit responsible for mediating interactions with upstream RTKs paired with a catalytic subunit containing the kinase domain148. The PI3Ks are broken down into classes depending upon which constituent subunits they contain. Class 1A PI3Ks are most strongly implicated in cancer149. Their regulatory subunit (p85α, p55α, p50α, p85γ, p55γ) contains the obligatory p110-binding domain as well as two SH2 domains and an optional SH3 domain, which permit interaction with phosphotyrosines contained in the pYxxM motif as well as proline-rich domains. The catalytic subunit contains a p85-binding domain as well as a Ras-binding domain, a PIK-homology domain, a C2 domain, and a catalytic domain150. Class 1A PI3Ks are subject to frequent genetic alterations, so much so that they have been identified as one of the most dysregulated pathways in cancer149. Amplification of PI3K is not the only way that the PI3K/Akt pathway is dysregulated.
Other modalities include: PTEN (one of the phosphatases responsible for negative regulation of PIP3) loss, Akt mutation, and RTK amplification149. These pathway alterations may occur alone or in concert to promote tumorigenesis. In particular, PI3K amplification has been shown to coexist with PTEN loss in 8.7% of breast cancers, as well as with HER2 overexpression149. These data have made PI3K a common target for pharmaceutical intervention with ATP analogues151, including pan-specific Wortmannin, which forms a covalent bond with a lysine residue in p110α152, as well as LY294002, which utilizes a key hydrogen bond to mimic the ATP interaction153.

1.3.3.4 Shc

The first member of the Shc (Src homology and collagen) protein family was discovered while screening cDNA libraries with sequences complementary to known SH2-domains154. More detailed analysis revealed three isoforms (p46, p52, and p66) of what is now known as ShcA155. Isoforms p46 and p52 are widely expressed, while p66 is more controlled and conspicuously absent from certain lineages, such as haematopoietic cells154,156. Simultaneous disruption of p66, p52, and p46 is embryonically lethal155,157. A conserved core structure containing PTB-CH1-SH2 domains is common to all isoforms, with differences occurring in the N-terminal section. Isoforms p46 and p52 are produced from alternate start codons within the same transcript, whereas p66 is the result of an alternative splice adding an additional 110 amino acids to the N-terminus. These additional amino acids are rich in glycine and proline, providing potential SH3-binding motifs158. Shc was one of the first cellular adaptors to be characterized and serves as a prototypical example of this class of proteins159. Consistent with its role as an adaptor protein, each of the identified domains of Shc is responsible for mediating interactions with other cellular proteins and structures160.
SH2- and PTB-domains are responsible for binding phosphorylated tyrosine motifs, providing Shc the ability to interface with a number of key nodes involved in signal transduction pathways including: growth factor signaling, especially Ras/MAPK and PI3K/AKT cascades downstream of RTKs, integrins and cytosolic kinases, as well as cytokine, antigen, and G-protein-coupled receptors159. In addition, the collagen homology (CH) domain is rich in glycine and proline158,161, providing SH3-binding motifs for proteins such as the Src-family kinases159. This domain also contains important tyrosine sites (Tyr-317 as well as the tandem Tyr-239/240) that become phosphorylated upon RTK activation and provide binding sites for other key signaling proteins such as the adaptor Grb2162. Shc is not limited to recruiting kinases and their substrates. There are plenty of phosphatases among its known interacting partners, including: PTP-PEST (which binds through the PTB domain)163,164; PTEN (implicated strongly in migration)165,166; as well as PP2A and PTPε, which are known to act on Shc by impeding Grb2 interaction and ultimately MAPK activation167,168. In addition, the PTB-domain of Shc has been reported to perform a similar function to a PH-domain: binding phospholipid169,170. Interestingly, once phosphotyrosine binding is complete, the capacity to bind phospholipid is diminished169. These data paint a picture where ShcA is preemptively bound to the membrane, making it readily accessible to activated receptor155,170. Such a working hypothesis is strengthened by studies showing very rapid induction of Shc phosphorylation following ligand treatment. A stopped-flow apparatus was able to capture very short time points and show that EGFR pTyr-1173 phosphorylation occurs within 5s with near immediate recruitment and phosphorylation of Shc171. Shc also binds to other ErbB-family members including HER2. Studies by Dankort et.
al investigated the importance of specific sites on HER2 for Shc and Grb2 recruitment and their tumorigenic capacity172,173. Not surprisingly, when the analogous terminal phosphorylation sites of HER2 are removed, HER2 loses the ability to transform cells. Against this background, reconstitution of only one site (Tyr 1227, known to bind Shc) is sufficient to restore transformative capacity, increase MAPK and PI3K signal, and stimulate the development of tumors172,173. In support of ShcA’s important role in the transformation process, ablation of ShcA prevented tumor development155,174. The complex interplay does not end there. To speak to its downregulatory activities, Shc has been linked to EGFR internalization and degradation following ligand stimulation175-177. More specifically, ShcA has been linked to clathrin-coated pit formation through its ability to recruit α-adaptin and ultimately AP-2176. Consistent with this hypothesis, data show that ShcA and EGFR co-localize to EEA1-positive vesicles, likely because Shc-binding sites (pTyr-992 and pTyr-1173) remain phosphorylated during the internalization process175. By the time that early endosomes progress to LAMP-1-positive late endosomes and lysosomes, only a subset of endosomes retain EGFR pTyr-1173 and none co-localize with ShcA175. Less dramatically, phosphorylation of Tyr-317 has been shown to be required for PP2A dissociation167. A similar hypothesis has been posed by a series of molecular dynamics simulations carried out by Suenaga et al. They implicate pTyr-317 in decoupling concerted movements of distal PTB and SH2 domains by increasing the rigidity of the CH1 domain178. This decoupling can lead to the disruption of interactions between those domains and cognate motifs on EGFR, removing protection of EGFR phosphorylation sites, and facilitating the eventual termination of the transduction process179. This model does not account for concomitant pTyr 239/240155.
These types of computational simulations build upon much experimental work that has been done to link Shc to EGFR signal transduction. As early as 1994, binding studies were performed on the SH2-domain of Shc to show that pTyr-1148 and pTyr-1173 were binding sites on the activated receptor180. The work also implicates pTyr-992, which may be necessary to explain why phosphorylation of Shc is induced upon EGF stimulation even when Tyr-1148 and Tyr-1173 have been removed181,182. Further studies suggested that it was indeed pTyr-1148 responsible for dominant engagement of the Shc PTB-domain, while pTyr-1173 provided supplementary binding capability to the SH2- as well as the PTB-domain of ShcA183-185. Co-immunoprecipitation experiments between Shc and EGFR supported the initial hypothesis that these binding events were redundant186, but subsequent work indicated that more robust signaling occurred when the PTB-domain was involved184, supporting the hypothesis that cooperation leads to optimal MAPK activation155,187.

The Shc-family contains four members: ShcA, B, C, and most recently D155, and likely arose from a common ancestor161. These homologs all share the general domain structure described above. Although the presence of multiple homologs may complicate the picture of Shc signaling, the ShcA cascade is thought to be the ‘redundant but dominant’ path employed by cells to activate MAPK from EGFR155,188. The model used to generate this hypothesis also suggests that indirect Grb2 recruitment through ShcA is preferred to direct recruitment to EGFR188. Experimental verification of the dominance of the ShcA cascade comes from MEF studies where activation of MAPK is possible in the absence of ShcA, but ShcA is crucial for sensitization to low growth factor concentrations157.

1.3.3.5 Gab1

Gab1 (Grb2-Associated Binder 1) is a scaffolding adaptor protein and, as the name suggests, was originally discovered in a glial tumor expression library for its ability to bind Grb2189.
Accordingly, most of Gab1’s interactions with receptor tyrosine kinases occur indirectly through intermediates such as Grb2, whose SH2-domain targets Grb2-Gab complexes to RTKs containing the proper binding motif [YXNX]190-192. Even in an instance where Gab1 has been shown to be capable of direct binding to the c-Met receptor, indirect interactions have been shown to be physiologically relevant193,194. Once Gab1 has associated with a receptor, it is equipped to facilitate a host of other protein-protein interactions. Like other Gab-family proteins, Gab1 contains an amino-terminal PH-domain which recruits it to the cell membrane, tyrosine-containing motifs for the recruitment of SH2- and PTB-domain-containing proteins, and proline-rich domains for recruiting SH3-domain-containing proteins190,195-199 including PLCγ189, Crk200, CrkL201,202, Shc, SHP2, Ship, and PI3K203. The many phosphorylation sites of Gab-family proteins are often phosphorylated by the RTKs to which they are recruited, but can also be the target of cellular kinases190,204-206, further increasing the potential complexity of signaling dynamics.

Gab1 is part of a family of binding proteins. Gab1, Gab2, and Gab3 have a global homology of 40-50% with higher conservation in the PH-domain and in the SH2- and SH3-binding motifs199. In simpler organisms the family consists of one member (DOS in Drosophila and Soc1 in C. elegans)207, which presumably diversified to provide additional function as organismal complexity increased, thus speaking to its importance in signal transduction. Gab1, in particular, is expressed in many tissues throughout the body with especially high levels observed in the brain, kidney, lung, heart, testis, and ovary189,199.

1.3.3.6 Grb2

Grb2 (Growth Factor Receptor-bound Protein 2) is an adaptor protein initially discovered to link EGFR signaling to the Ras-MAPK pathway and has since been shown to be involved downstream of many distinct RTKs208-210.
Grb2 contains one SH2 domain and two SH3 domains208, allowing it to act as a puzzle piece capable of linking many other proteins. Grb2’s best-studied interactions are those surrounding Sos, to which Grb2 is constitutively associated by way of its SH3 domain. Sos is a guanine nucleotide exchange factor responsible for swapping GDP for GTP on Ras. This task is more efficiently accomplished when the Grb2 SH2-domain recruits the Grb2-Sos complex to RTKs, thus bringing it into close proximity with membrane-associated Ras211,212. Grb2 has also been reported to bind Shp2, an important interaction in the context of FGF signaling213, as well as Shc and IRS1, important interactions for insulin signaling through the insulin receptor214.

More recently, the multiple binding domains of Grb2 have been invoked to explain the rapid activation properties of FGF receptor 2 (FGFR2). Lin et al. provided evidence to support the hypothesis that Grb2 association with FGFR2 prior to stimulation serves to preform dimers and allow phosphorylation of the receptor activation loop necessary for rapid response to ligand stimulation. Once FGF is presented, FGFR2 phosphorylates Grb2, inducing dissociation and removing the inhibition215. The hypothesis presents an interesting scenario lending weight to the “preformed dimers” school over the “random association” school of receptor dimerization and activation216.

1.3.3.7 IRS2

IRS (Insulin Receptor Substrate) proteins were discovered in the context of insulin signaling and were found to be multifunctional interfaces217,218. They have subsequently been linked to the signaling of many cytokine receptors80 (including prolactin, growth hormone, leptin, vascular endothelial growth factor (VEGF), TrkB, ALK, IL-2, IL-4, IL-7, IL-9, IL-13, IL-15, interferon and integrins219) and binding to various cellular proteins (including PI3K, SHP2, Fyn, Grb2, Nck, and Crk220).
In the case of receptor cross-talk with EGFR, this is not surprising given the similarities between the intracellular portions of these molecules, specifically the many C-terminal phosphorylation sites completing motifs for SH2-domain-containing proteins such as the IRS-proteins221 and Shc, which is also known to interact with both EGFR and insulin receptor80.

IRS1 and IRS2 are 43% identical221. They contain a PH domain for membrane association and a PTB domain similar to that of Shc222. This similarity is particularly important in the C-terminal regions of the protein (where they share approximately 35% identity), where downstream effectors are recruited to phosphotyrosine-binding motifs. Many of these motifs are conserved between the two members and both have been reported to activate PI3K223,224 and ERK1/2 in a variety of model systems including breast cancer225,226.

Despite their ubiquitous expression and structural similarities, IRS1 and IRS2 have nonredundant functions, as evidenced by knockout experiments in mice. IRS1 knockout mice exhibit stunted growth and insulin resistance that does not proceed to diabetes227,228. In contrast, IRS2 knockout mice are normal size, yet show brain defects due to reduced neural proliferation227,228, develop early-onset diabetes227,229, and produce infertile females230. These differences extend into cancer models where mammary metastasis is promoted by IRS2 or the suppression of IRS1231,232.

1.3.3.8 Shp

Shp proteins are a subclass of ‘classic’ protein tyrosine phosphatases (PTPs) whose binding localization is governed by SH2 domains233. Such ‘classic’ PTPs have a 240-250 amino acid PTP domain centered among regulatory sequences directing binding and only catalyze the dephosphorylation of phosphotyrosine. In vertebrates, there are two known Shps (Shp1 and Shp2), which have single orthologs in Drosophila (Csw) and C. elegans (Ptp-2)234.
Shp1 expression is limited, with the highest levels observed in hematopoietic cells234, and recent studies have shown nuclear localization due to a basic nuclear localization sequence in place of an SH3-domain235,236. In contrast, Shp2 and its orthologs are widely expressed234 and found uniformly distributed through the cytosol235. Shp proteins have been shown to bind a wide range of signaling proteins including RTKs (such as EGFR, SCFR, PDGFR, and EPOR) as well as adaptor proteins (such as Grb2, FRS2, Jak2, p85 subunit of PI3K, IRS2, and Gab1-2237), which are central to transducing extracellular information across the membrane.

Although Shp1 fits the stereotypical role of reducing phosphorylation and quenching signaling, Shp2 has been implicated in diverse roles. For instance, Shp2 has been shown to help activate downstream signal transduction by serving as an adaptor protein for Grb2 binding to RTKs238,239. Shp2 has been implicated in Erk signaling downstream of many RTKs. Without proper Shp2 involvement, Erk activation is not sustained or even initiated240,241, which may link Shp2 to cell cycle progression238,239.

Multiple methods of regulation of Shp activity have been hypothesized. Shp proteins contain two known tyrosine phosphorylation sites in the C-terminal tails, which are phosphorylated differentially by receptor and intracellular kinases234. The SH2-domains have also been shown to regulate phosphatase activity242. Experimentally, phosphotyrosine-containing peptides that bind the N-SH2-domain increase phosphatase activity. A structural explanation posits that in the absence of a binding site, the N-SH2-domain instead binds the phosphatase domain, blocking the active site243. It is less clear whether the same holds true for the C-SH2-domain, which may simply serve to increase target-binding specificity234,241.

1.3.3.9 Cbl

Cbl (short for Casitas B-lineage lymphoma) was first discovered as part of a gag-Cbl fusion protein driving B-lineage lymphomas244.
c-Cbl is a cytosolic protein ubiquitously expressed throughout the body with known homologs in mammals, worms, fish, birds and frogs. It serves as an E3 ubiquitin ligase245,246 responsible for recognizing phosphorylated proteins destined for ubiquitin-mediated degradation247. Cbl interactions have been reported with a wide range of proteins including receptors, intracellular kinases and phosphatases, adaptors, cytoskeletal proteins, as well as ubiquitin247-249.

Cbl’s N-terminal portion contains a calcium-binding EF hand as well as an SH2-domain with consensus sequence (N/D)XpY(S/T)XXP247,250,251, but c-Cbl has also been shown to interact with Met, where it binds DpYR252. The RING finger domain mediates c-Cbl’s ubiquitin ligase activity by connecting Cbl (the E3 ubiquitin ligase) to the E2 ubiquitin-conjugating enzyme245,253. The short linker connecting the SH2-domain and the RING finger domain has been shown to be important for ubiquitination, and mutations can be transforming254. Taken together, these domains provide the baseline function of Cbl: the SH2-domain recruits Cbl to a phosphorylated protein and RING facilitates ubiquitination247. Additional proline-rich domains in the C-terminus provide putative SH3-binding motifs255 and vary between isoforms and species247.

The C-terminal portion of Cbl also contains sites for tyrosine phosphorylation, a leucine zipper homology domain, and a ubiquitin-associated domain247. Initial studies showed that Bcr-Abl, v-Abl, Abl, and v-Src were capable of phosphorylating Tyr-700 and Tyr-774, providing putative binding for the CrkL SH2-domain256. This list expanded when Fyn was shown to phosphorylate Tyr-731 and provide a binding motif for PI3K257. These results have since been recapitulated in vitro to show that the kinase domains of Syk, Fyn, and Abl phosphorylate c-Cbl differentially258.
Tyrosine phosphorylation in the linker region was hypothesized to modulate E3-activity253 and was later confirmed in the EGFR system with phosphomimetic mutagenesis259.

Cbl participation in EGFR signaling has been extensively studied and serves as a model for understanding the roles that ubiquitin ligases can play in signal transduction. Cbl has been shown to ubiquitinate EGFR, leading to its downregulation by endocytosis253,260. Swaminathan et al. comprehensively outline this process in their review247. Briefly, activated EGFR autophosphorylates leading to pTyr-1045, which recruits Cbl to the receptor253. Cbl mono-ubiquitinates EGFR261 while it is at the plasma membrane262. Eps15 is thought to play a role in recruiting EGFR to clathrin-coated pits and adaptor proteins, including AP2263,264, which develop into clathrin-coated vesicles and merge with internal vesicles to become multivesicular bodies that are ultimately sorted for recycling or lysosomal degradation265. Cbl remains bound to EGFR through much of this process261,266. Mounting evidence suggests that Cbl-mediated ubiquitination is important for sorting rather than initial internalization77,266.

Cbl has also been shown to be recruited to EGFR indirectly through Grb2, as seen in EGFR Y1045F mutants that are still endocytosed and degraded77,267. Mutations to EGFR, Grb2, or Cbl that interfere with their interactions reduce EGFR endocytosis268,269. Evidence suggests that Grb2-mediated and pTyr-1045-mediated Cbl recruitment may result in ubiquitination of unique lysine residues of EGFR77, a potential explanation for differential fate247.

Although EGFR is the best-studied ubiquitin-mediated degradation target of Cbl, many other receptors and intracellular proteins have also been shown to be targets (for an extensive list see Swaminathan et al. 2006). Furthermore, Cbl has been implicated as an adaptor protein in Erk activation in response to Met activation200 and G-CSF stimulation270.
1.4 Mass Spectrometry

Mass spectrometry is an attractive technology for proteomic analyses, because it allows data to be collected simultaneously for hundreds or thousands of proteins from a single biological sample. Most applications digest proteins into peptides with sequence-specific endopeptidases, such as trypsin, to produce predictable subsequences that can be matched back to the contributing protein by way of database comparison to a species’ proteome. With proper purification techniques, peptides containing particular post-translational modifications (PTMs) such as phosphorylation or acetylation can be isolated and analyzed to determine signature profiles following differential treatment271. The most successful sample introduction method for peptide MS experiments has been liquid chromatography (LC), which provides real-time separation of peptides to simplify mixtures.

1.4.1 MSMS Peptide Analysis

During a liquid chromatography mass spectrometry (LCMS) experiment, peptides are ordered based on their elution from a column. At any one time, only a small fraction of all peptides will be introduced to the instrument. The complete peptide is charged by a voltage source connected through the liquid supply (electrospray ionization) and accelerated through a potential gradient before arriving at a detector. In the full scan mass spectrum (MS1), the mass-to-charge ratio (m/z) and the abundance of each peptide are recorded. If a proper ion series is observed and the intensity of the signal surpasses a threshold, the ion may be selected for fragmentation by collision with an inert gas. The resulting fragments are collected and again analyzed to generate an MS/MS spectrum (MS2)272. Abundant fragment ions from the isolated precursor will produce peaks that can then be compared against lists of predicted fragments for peptides of known sequence.
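The comparison of observed MS2 peaks against predicted fragments can be illustrated with a minimal sketch. It assumes an unmodified peptide, singly charged b- and y-type fragment ions, standard monoisotopic residue masses, and a fixed absolute matching tolerance. The function names `predict_fragments` and `match_peaks` and the 0.02 m/z default tolerance are illustrative choices, not taken from any particular search engine.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to y-type fragments
PROTON = 1.007276   # mass of the charge-carrying proton


def predict_fragments(peptide):
    """Return singly charged b- and y-ion m/z values keyed by ion label."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    fragments = {}
    for i in range(1, n):
        # b_i holds the first i residues; y_(n-i) holds the remaining ones.
        fragments[f"b{i}"] = sum(masses[:i]) + PROTON
        fragments[f"y{n - i}"] = sum(masses[i:]) + WATER + PROTON
    return fragments


def match_peaks(observed_mz, fragments, tol=0.02):
    """Map each predicted ion label to the closest observed peak within tol."""
    matches = {}
    for label, mz in fragments.items():
        hits = [obs for obs in observed_mz if abs(obs - mz) <= tol]
        if hits:
            matches[label] = min(hits, key=lambda obs: abs(obs - mz))
    return matches


if __name__ == "__main__":
    predicted = predict_fragments("GA")
    print(predicted)
    print(match_peaks([58.03, 200.0], predicted))
```

A real search engine would additionally handle modified residues (for example the mass added to tyrosine by phosphorylation), multiple charge states, and scoring of the matched peaks; none of that is attempted here.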
1.4.2 Sample Purification (IP/IMAC)

Mass spectrometry-based peptide identification and quantification is greatly aided by sample purity. During a chromatographic elution, peptide identification requires that a single sequence be isolated to ensure contributed fragments can be properly attributed. Of even greater concern is peptide quantification, especially by isotopic labeling techniques such as iTRAQ. Since the fragment ions used for quantification are identical for all peptides, it is critical that the donor sequence be the only one producing them. Liquid chromatography is very helpful in achieving this goal by reducing local sample complexity. Any additional pre-analysis purification to simplify the mixture of peptides will provide additional certainty to quantification. Important to the work in this thesis are immunoprecipitation (IP) and immobilized metal affinity chromatography (IMAC), utilized to isolate tyrosine-phosphorylated peptides.

1.4.2.1 Immunoprecipitation

Immunoprecipitation (IP) is a very useful technique to reduce the complexity of samples derived from whole cell lysates. At the protein level, an antibody can be used to purify a sample to investigate changes that occur to the post-translational modification at all sites of a particular protein. At the peptide level, specific post-translational modifications can be targeted to investigate changes in a particular signaling moiety across many proteins. For the work in this thesis, phosphotyrosine-specific IPs were used to query signaling downstream of EGFR, because EGFR contains a tyrosine kinase domain, as do several of the proteins which it putatively activates. Phosphotyrosine antibodies are generally more specific than phosphoserine or phosphothreonine antibodies, because the aryl side-chain of tyrosine lends itself well as a comparatively unique antigen. Employing a cocktail of multiple phosphotyrosine-specific antibodies ensures increased coverage of surrounding sequences.
1.4.2.2 Immobilized Metal Affinity Chromatography

The phosphotyrosine IP does not clean up the sample completely, because phosphotyrosine-containing peptides are often orders of magnitude less abundant than peptides derived from highly expressed cytoskeletal and metabolic proteins. An iron-based immobilized metal affinity chromatography (IMAC) step further purifies the peptides collected by the IP prior to LC-MS analysis.

1.4.3 Instrument Types

Several types of mass analyzers are commonly used for proteomic mass spectrometry. Analyzer designs must balance sensitivity, analysis time, reliability, scan information, and dynamic range. Each has its strengths and weaknesses, making it best suited to specific proteomic applications. Instruments differ substantially, and results can vary greatly even between implementations of similar systems. In 2009, Bell et al. distributed identical protein digest samples to 27 labs and compared their reports273. Most fragments were found in the raw data when analyzed centrally, but variations in reports were due to different databases and contamination.

1.4.3.1 TOF

A time of flight (TOF) mass analyzer relies on the principle that more massive particles will move more slowly when given the same energy. A batch of ions in a TOF’s trap is suddenly subjected to a voltage difference, accelerating all of the ions with an energy based on their charge. For two identically charged ions with masses $m_1$ and $m_2$ and velocities $v_1$ and $v_2$, equal kinetic energy gives $m_1/m_2 = v_2^2/v_1^2$, which makes the arrival time at the detector a direct readout for the mass-to-charge ratio of the ion.

The TOF analyzer has several benefits. The first is that it is highly accurate, as the time resolution of the detector can be tuned to great precision. Second, the system is very fast at measuring all ions from a single batch. The time of flight of the highest mass-to-charge ratio ion dictates the analysis time.
Third, TOF analyzers are very sensitive274. Finally, the nature of the data collection method means that full scan information is available for both MS1 and MS2.

The TOF analyzer’s accuracy depends on ions’ traversal of a known path length. Environmental conditions that change the tube length can rapidly degrade the measurement accuracy. Over the course of 24 hours of inactivity, the accuracy may drift by tens of ppm. The newest TOFs on the market require daily calibration prior to use to account for small changes in room temperature. Beyond the need for calibration, TOFs are not immune to the dynamic range concerns that plague all trap-based instruments274. To prevent accuracy drift during continuous use, the instruments may continuously tune themselves based on peaks in the sample.

1.4.3.2 Orbitrap

The orbitrap is the newest of the mass analyzer types and is the brainchild of Alexander Makarov275. The oscillation time along the length of the electrode is detected and matched to a specific m/z in an analogous way to how a Fourier transform ion cyclotron resonance (FT-ICR) instrument would measure orbit times. The result is a very high resolution measurement over a large mass range276. These qualities have made it the tool of choice for many proteomics endeavors.

As with TOF analyzers, the orbitrap provides full scan information for peptide fragments, permitting high confidence identifications and de novo sequencing of novel peptides. The high resolution of these instruments increases certainty in identifications and improves label-based quantification strategies such as iTRAQ. iTRAQ fragment peaks were chosen to fall in a sparsely populated region of the m/z range, but contaminant ions still hinder quantification in some cases. The 116.11 peak of both commercially available iTRAQ kits can be infringed upon by a sequence-specific peak at 116.07, a difference that is easily seen by the orbitrap analyzer, permitting clean quantification of this label277.
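To put the 116.11 versus 116.07 example in numbers, the short sketch below computes the spacing of the two peaks in ppm and the nominal resolving power m/Δm at which the peak spacing equals the peak width. The function names are illustrative, and treating m/Δm as the requirement is a simplifying assumption: the resolving power actually needed depends on the peak-width definition, the peak shapes, and the relative intensities of the two ions.

```python
def ppm_difference(mz_a, mz_b):
    """Spacing between two m/z values in parts per million of their mean."""
    mean_mz = (mz_a + mz_b) / 2.0
    return abs(mz_a - mz_b) / mean_mz * 1e6


def nominal_resolving_power(mz_a, mz_b):
    """Resolving power m / delta-m at which the spacing equals the peak width."""
    mean_mz = (mz_a + mz_b) / 2.0
    return mean_mz / abs(mz_a - mz_b)


if __name__ == "__main__":
    reporter, contaminant = 116.11, 116.07
    print(f"spacing: {ppm_difference(reporter, contaminant):.1f} ppm")
    print(f"nominal m/dm: {nominal_resolving_power(reporter, contaminant):.0f}")
```

The spacing works out to a few hundred ppm and a nominal resolving power of a few thousand, which is consistent with the statement that a high-resolution analyzer separates the two peaks easily.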
The first drawback to FT-based instruments such as the orbitrap is that they are slower than their TOF-based counterparts. The same cycling time that increases the resolution of any one scan also means that fewer scans can be performed. The newest generation orbitrap instruments have decreased the size of the analyzer, permitting faster cycling and analysis. The second drawback is due to the trap-based nature of the ion collection. Since the batch-size of ions is fixed, lower abundance ions are less likely to contribute sufficient signal, hurting the overall in-scan dynamic range.

1.4.3.2.1 Theoretical Background

The orbitrap mass analyzer uses a complex potential consisting of a quadrupole field superimposed with a logarithmic field.

$$U(r, z) = \frac{k}{2}\left(z^2 - \frac{r^2}{2}\right) + \frac{k}{2} R_m^2 \ln\left(\frac{r}{R_m}\right) + C$$

where $R_m$ is a characteristic radius below which ions can be trapped. This potential is achieved by a very careful choice of electrode geometries specified by:

$$z_{1,2}(r) = \sqrt{\frac{r^2}{2} - \frac{R_{1,2}^2}{2} + R_m^2 \ln\left(\frac{R_{1,2}}{r}\right)}$$

where $R_1$ and $R_2$ are the maximum radii of the inner and outer electrodes. The equations of motion resulting from this geometry are quite complex and the radial and angular components cannot be solved analytically. However, the motion along the z-axis results in a simple harmonic oscillator equation for which an exact solution is known:

$$z(t) = z_0 \cos(\omega t) + \sqrt{\frac{2E_z}{k}} \sin(\omega t)$$

where $E_z$ is the energy along the z-axis, and the fundamental frequency is

$$\omega = \sqrt{\frac{kq}{m}}$$

The remarkable part of the trap design is that the axial oscillation frequency now provides a direct measure of the mass-to-charge ratio275.

1.4.3.3 Triple Quadrupole

A triple quadrupole (TQ) instrument is quite different from both TOFs and orbitraps. During an MS1 analysis, the first quadrupole (Q1) is used to isolate one precursor window at a time so that its intensity can be measured. During MS2 analysis, Q1 is tuned to pass a single precursor window into Q2 for fragmentation. A single m/z window for the fragments is then isolated in Q3 and the intensity is measured.
This pair of Q1 isolation window and Q3 isolation window is called a “transition”.

The TQ instruments are tailored to applications where dynamic range is of the utmost importance. Since batches of ions are never held statically in a trap, stochastic variation between batches is never produced. Instead a continuous stream is passed through the quadrupoles for a fixed amount of time and the intensity recorded. In this way, the abundance of other ions outside the isolated m/z windows does not affect the intensity of the recorded signal except in competition for charge. This feature is ideal for confirming the presence and signal intensity of a handful of known peptides in a complex sample278.

The drawback of the TQ system arises from the use of the quadrupole itself. In practice, the quadrupole is tuned to pass a very narrow range of ions. To produce a full scan mass spectrum the quadrupole must sequentially select narrow mass windows across the entire mass range. The result is a very slow process with a poor duty cycle, which wastes the majority of ions of any particular m/z. This is in stark contrast to both TOF instruments and orbitraps, which simultaneously analyze all m/z values. For these reasons, scanning quadrupoles are rarely used to collect full scan mass spectra. Instead, they can be used in series to very accurately isolate particular windows in the precursor and fragment scans. Operating in this way is known as selected reaction monitoring (SRM), or multiple reaction monitoring (MRM) when many such pairs are sequentially collected.

1.4.3.3.1 Theoretical Background

The equations of motion for a charged particle in a quadrupole field are the result of a force balance and can be derived as follows:

$$qE = F = ma$$

$$a = \frac{qE}{m} = \frac{q}{m}\frac{\partial \Phi}{\partial x_i}$$

The electric potential inside the hyperbolic quadrupole field (Figure 1-1a) can be expressed as

$$\Phi = \Phi_0 \frac{x^2 - y^2}{r_0^2}$$

where $2r_0$ is the separation of the rods of the quadrupole.
Plugging this into the force balance expression yields

$$\frac{\partial^2 x}{\partial t^2} = \frac{2q}{m r_0^2}\Phi_0 x$$

$$\frac{\partial^2 y}{\partial t^2} = -\frac{2q}{m r_0^2}\Phi_0 y$$

There are two components to the potential $\Phi_0$: an unchanging portion (U) and a variable portion (V),

$$\Phi_0 = U + V\cos(\omega t)$$

If the constant portion of the potential is zero ($U = 0$), then all ions above a critical mass are stable in the field. The critical mass is a function of the frequency of the field’s oscillation; ions below it are so light that they reach the rods before the field has switched polarity. When operated in this manner, the quadrupole becomes an ion trap.

The constant portion of the potential drives positive ions away from the positive rods and towards the negative rods (Figure 1-1b). In the absence of a variable component to the field, no positive ions resist this force and eventually all will crash into the negative rods. However, the superposition of a variable voltage can rescue some ions from collisions with the rods. If the magnitude of the variable voltage is larger than the unchanging voltage ($V > U$), then some ions will be stabilized. To picture the field under this condition, imagine the equipotential lines (Figure 1-1a) rotating. In the x-direction the baseline potential of the rods is positive, driving all positive ions away for most of the cycle. During the window of time when the net potential in the x-direction is negative, smaller positive ions will have enough time to accelerate and crash into the rods before the net potential switches back to positive. In this way, the x-direction only permits masses above a critical value to remain stable, making it a high-pass mass filter. In the y-direction, the baseline potential of the rods is negative, attracting all positive ions for most of the cycle. During the window of time when the net potential in the y-direction is positive, smaller positive ions with less inertia will be more effectively repelled, leaving the larger positive ions to crash into the rods.
Hence, the y-direction only permits ions below a critical value to remain stable, making it a low-pass filter. It is the combination of the high-pass x-direction and low-pass y-direction that provides the filtering capability of the quadrupole.

The final piece of the quadrupole used in practice is the field applied in the z-direction. The length of the quadrupole and the strength of the field applied determine the transit time for ions to experience the x-y fields already described. To trap positive ions in the quadrupole, both ends are supplied a positive potential. During use as a mass filter in a triple quadrupole instrument, there is a decrease in potential applied in the z-direction to accelerate positive ions279.

Figure 1-1: Quadrupole Potential and Ion Dynamics. a) Equipotential lines for a quadrupole electric field. Hyperbolic rods produce hyperbolic equipotential lines. b) The path of a positively charged ion in a quadrupole field.

In practice the quadrupole is operated at a range of U and V values. For any particular m/z value the stable region looks like the one shown in Figure 1-2a, with the parameter values of interest being those which produce both x-stability and y-stability. This overlap region is a function of the m/z and can be visualized across a range (Figure 1-2b). To produce a full scan mass spectrum, a scan consists of a range of U-V values queried in series. The slope of the U-V scan path determines the resolution; as the slope increases, the path intersects a smaller section of the stable region of each mass, effectively reducing the window during which any one mass is stable and resulting in improved resolution. Unfortunately, this also decreases the signal intensity. Tracing along a linear U-V scan path produces a mass spectrum at a specific resolution (Figure 1-2c).
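The U-V stability regions described above are conventionally summarized with two dimensionless Mathieu parameters, a (proportional to U) and q (proportional to V), which this section does not introduce. The sketch below assumes the standard textbook definitions for a potential of the form $\Phi_0 = U + V\cos(\omega t)$ with rod separation $2r_0$, namely $a = 8zeU/(m r_0^2 \omega^2)$ and $q = 4zeV/(m r_0^2 \omega^2)$, and the commonly quoted approximate apex of the first stability region at q ≈ 0.706, a ≈ 0.237. The function names and the example rod radius and drive frequency are hypothetical illustrations, not instrument specifications.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def _scale(mass_da, charge, r0_m, freq_hz):
    """Common factor z*e / (m * r0^2 * omega^2) shared by a and q."""
    omega = 2.0 * math.pi * freq_hz
    return charge * ELEMENTARY_CHARGE / (mass_da * DALTON * r0_m ** 2 * omega ** 2)


def mathieu_parameters(mass_da, charge, u_volts, v_volts, r0_m, freq_hz):
    """Return (a, q) for an ion under the assumed textbook definitions."""
    scale = _scale(mass_da, charge, r0_m, freq_hz)
    return 8.0 * u_volts * scale, 4.0 * v_volts * scale


def voltages_for_apex(mass_da, charge, r0_m, freq_hz, q_apex=0.706, a_apex=0.237):
    """Return (U, V) that place the given ion at the assumed stability apex."""
    scale = _scale(mass_da, charge, r0_m, freq_hz)
    return a_apex / (8.0 * scale), q_apex / (4.0 * scale)


if __name__ == "__main__":
    # Hypothetical quadrupole: r0 = 4 mm, 1 MHz drive, singly charged 500 Da ion.
    u, v = voltages_for_apex(500.0, 1, 0.004, 1.0e6)
    print(f"U = {u:.1f} V, V = {v:.1f} V, U/V = {u / v:.4f}")
    print(mathieu_parameters(1000.0, 1, u, v, 0.004, 1.0e6))
```

Because a/q = 2U/V is independent of mass, every ion lies on the same line through the origin of the a-q plane for a fixed U/V ratio. Under these assumptions, raising the ratio toward the apex value narrows the range of masses that are simultaneously stable, which is the resolution and intensity trade-off of the scan path slope described above.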
Figure 1-2: a) Stable regions of U-V space for a particular m/z. The overlap of x-direction stable and y-direction stable gives the U-V pairs that can adequately isolate the m/z of interest. b) The overlap stable region for three unique masses for U > 0. Lines represent scan path in U-V settings to produce full spectra of varied resolution (R). c) Mass spectra produced at each resolution setting (R).

1.4.4 Analysis Types

When designing a mass spectrometry-based workflow, the type of data to be collected will be determined by the ultimate goal of the analysis. Many studies seek to identify as many proteins280 or sites of post-translational modification as possible281, for which full scan information is most useful. More recently, in-depth follow-up studies to tease out mechanism and information flow through already identified PTMs are becoming more popular282. It is these targeted quantitative studies that will benefit the most from multiple reaction monitoring (MRM)283.

1.4.4.1 Data Dependent Tandem Mass Spectrometry

In a typical full scan MS2 experiment, target precursors are selected by their relative abundance in the MS1 scan. Upon fragmentation, all ions produced are collected and analyzed so that the entire MS2 spectrum for each selected precursor is recorded. Since target selection is based on abundance, user bias towards particular proteins and pathways is removed, inspiring databases and statistical methods to speed up data processing and identification.

With the full scan information in hand, the user has more options on how to proceed. The MS2 peaks can be sequenced de novo or matched to those predicted for peptides in a database, drastically reducing the time required to process data.
Database search methods have also been used to calculate false discovery rates (FDRs) by comparing results for true databases and random ones8, along with other more inventive statistical and machine learning-based approaches to dealing with the wealth of information provided284,285.

The sub-discipline of proteomics focused on identifying sites of post-translational modification would not be possible without full scan information and the aforementioned database processing. Side-chains of key residues may be modified, thus changing the mass of the peptide and its fragments in a predictable manner. Many sites of modification can then be identified from the same sample and their biological significance investigated.

The tradeoffs that must be made to attain this full scan information are not trivial. First and foremost, the instruments used for these analyses are prohibitively expensive. Few academic labs have the resources to lease or buy the high-end trap-based instruments used in cutting edge proteomics.

On a computational note, the file size associated with full-scan data collection methods is large. It is not uncommon for 2-3 hour experiments to generate files in the hundreds of MB or GB range. Parsing and processing these large files requires more computing resources and time. In addition, the number of potential identifications produced in discovery runs leads many groups to avoid the time-consuming step of thorough validation of each individual identification, instead opting for dataset-scale statistics like false discovery rate (FDR)286. While this approach may prove sufficient for large-scale statistical modeling, it raises concern for single-hit follow-up biological studies based on the findings.

The final drawback of full scan analyses run on trap-based instruments is the issue of dynamic range, especially in regard to quantitative data collection methods aimed at comparing signals of drastically different scale.
Consider an isotopically-labeled sample where the abundance of a particular species is orders of magnitude larger in one condition than in the rest. Peptides derived from the abundant condition will make up an overwhelming majority of the peptide pool passed to the trap. The remaining conditions may be of such low abundance that they contribute only a handful of ions, a count small enough to be subject to significant stochastic variation.

1.4.4.2 MRM

Multiple reaction monitoring (MRM) differs from full scan data collection in that only very specific data is collected. Usually, the goal of an MRM experiment is to confirm the presence of a sufficient quantity of a small set of targets, which is why MRM finds a perfect niche in examining athlete blood samples for illegal compounds [287]. In an MRM analysis, instead of collecting full MS2 sequence data for any abundant signal, a diagnostic set of precursor/fragment mass pairs, termed “transitions”, is selected and tracked throughout the course of data collection [278,288]. Sufficient signal on the transitions derived from a particular target is taken as evidence of the target’s presence in the sample.

One of the largest benefits of MRM experimental pipelines is the reproducibility of target detection, with recovery rates as high as 99% when combined across multiple analyses [283], which facilitates straightforward data collection and analysis. Once a vetted set of transitions has been developed, the analysis of additional samples is quite simple. Data processing consists of scanning the resulting data for sufficient signal on each of the desired targets. The file size is generally small, as only the intensities of key transitions are stored. The dynamic range permitted by an MRM analysis on a triple quadrupole (TQ) mass spectrometer is considerably larger than that of trap-based instruments, reported as high as 10^5 [278].
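The dynamic range limitation of trap-based instruments can be illustrated with simple counting statistics. The sketch below assumes that the trap is filled to a fixed total number of ions and that the number of ions from any one species follows Poisson statistics, so that the coefficient of variation (CV) of a count with mean N is 1/sqrt(N). Both assumptions, along with the capacity of one million ions, are illustrative and are not taken from any instrument specification.

```python
import math


def ion_count_cv(total_ions, fraction):
    """Return (expected ion count, Poisson CV) for one species in the trap.

    total_ions: assumed number of ions held in the trap (hypothetical).
    fraction:   share of the trapped pool contributed by the species.
    """
    expected = total_ions * fraction
    cv = 1.0 / math.sqrt(expected)  # Poisson: sd = sqrt(mean), so CV = 1/sqrt(mean)
    return expected, cv


if __name__ == "__main__":
    TRAP_CAPACITY = 1_000_000  # hypothetical fill target, for illustration only
    for fraction in (0.5, 1e-3, 1e-5):
        n, cv = ion_count_cv(TRAP_CAPACITY, fraction)
        print(f"fraction={fraction:g}  ions={n:g}  CV={cv:.1%}")
```

Under these assumptions, a species five orders of magnitude below the pool total is represented by roughly ten ions and carries a counting CV of about 32%, while the dominant species is measured with a CV well under 1%.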
At no time during an ion’s lifetime in the instrument is it stored in a pool of other ions, meaning that the reported intensity of any ion is dependent on its own abundance in the sample (as well as its ionization efficiency), which is critically important for low abundance species. A TQ accomplishes this by measuring ions in a constant stream for fixed periods of time. The first quadrupole (Q1) is tuned to isolate the target precursor. The second quadrupole (Q2) is used as a collision chamber to fragment the single precursor stream produced by Q1. The third quadrupole (Q3) isolates a target fragment mass from the output of Q2 and directs it to the detector [288]. Once data has been collected, the system shifts to the next transition.

The ability of the MRM system to focus on specific transitions also has its drawbacks, the first of which is the speed of acquisition. Because the quadrupoles must be specifically tuned to measure each transition, the duty cycle grows quickly with the number of transitions [278]. The duty cycle is the combined time spent scanning all transitions in the target list before returning to scan again. When coupled to liquid chromatography, peaks corresponding to targets may only elute for short periods. To ensure multiple collections during the elution peak, the duty cycle must be minimized while still permitting adequate collection for each transition. This constraint limits the number of transitions that may realistically be targeted in a single analysis. For complex samples, the list of desired targets may quickly exceed the capabilities of the instrument.

In addition, full MS2 sequence information is not collected. This limitation poses a problem for confirming the identity of peaks in a transition’s elution profile. Multiple species in the sample may produce a similar fragment mass from a similar precursor mass. Such co-occurrences make it difficult to decide which peak corresponds to the species of interest.
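The duty cycle constraint described above reduces to simple arithmetic, sketched here under stated assumptions. The dwell time, the switching overhead between transitions, the peak width, and the function names are all hypothetical values chosen for illustration, and the model ignores any scheduling of transitions by retention time.

```python
def cycle_time_s(n_transitions, dwell_ms, switch_ms=0.0):
    """Time (s) to measure every transition in the list once."""
    return n_transitions * (dwell_ms + switch_ms) / 1000.0


def points_per_peak(peak_width_s, n_transitions, dwell_ms, switch_ms=0.0):
    """Number of measurements of each transition across one elution peak."""
    return peak_width_s / cycle_time_s(n_transitions, dwell_ms, switch_ms)


def max_transitions(peak_width_s, min_points, dwell_ms, switch_ms=0.0):
    """Largest transition list that still gives min_points per peak."""
    return int(peak_width_s * 1000 // (min_points * (dwell_ms + switch_ms)))


if __name__ == "__main__":
    # Hypothetical method: 100 transitions, 20 ms dwell, 5 ms switching overhead.
    print(cycle_time_s(100, 20, 5))          # seconds per cycle
    print(points_per_peak(30, 100, 20, 5))   # measurements across a 30 s peak
    print(max_transitions(30, 10, 20, 5))    # list size allowing 10 points/peak
```

Under these assumed settings, a 100-transition list yields a 2.5 s cycle and 12 measurements across a 30 s peak. Every extra transition spent on confirming a peak's identity, which becomes necessary when several species share a similar precursor/fragment pair, draws on this same budget.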
Ambiguity in peak assignment is mitigated by tracking coelution of multiple fragments from the same precursor [288], at the expense of increasing the duty cycle. To further increase certainty of chromatographic peak identification, heavy-labeled standards are often included to prove coelution [278].

1.4.5 Labeling Techniques

The standard LC-MS protocol does not allow for concurrent comparison between samples. Peak intensities are often recorded and compared between analyses, but are subject to complicating factors (including, but not limited to, altered chromatography, sample-specific processing losses, and instrument cleanliness). Instead, several methods have been developed to allow combination of multiple samples into a single analysis for better direct comparison.

1.4.5.1 Covalent Isotopic Labeling (iTRAQ, TMT)

Covalent isotopic labeling techniques (such as iTRAQ [289] and TMT [290]) provide one method to directly analyze multiple samples simultaneously [291]. Both strategies rely upon attaching a unique tag to peptides derived from each sample. Each sample-specific tag belongs to a family of reagents, which add identical mass to all peptides from all samples. However, upon fragmentation, each tag of the family produces a distinct fragment mass depending on the balance of heavy/light isotopes contained in the jettisoned fragment. The abundance of each member of the tag family reflects the relative abundance of the peptide in the corresponding sample. One of the major drawbacks of fragment-based quantification techniques is the possibility of contamination. If multiple peptides are included in the MS1 isolation window, then the corresponding quantification will be their combination. In many cases, the contamination is a constant amount adding to all signals equally, which compresses the observed relative ratio toward one [292].

1.4.5.2 SILAC

Stable isotope labeling by amino acids in cell culture (SILAC) is an alternative labeling technique which installs isotopic signatures while cells are being grown [293].
Heavy-labeled versions of essential amino acids are substituted into the media of one treatment condition. The sample ultimately derived from these cells will have incorporated heavy-labeled amino acids into all of its proteins. Combining lysate from a heavy-labeled culture and a light-labeled culture allows the contribution of each to be determined at the MS1 level based on the differential precursor mass. The benefit of SILAC-based quantification is the early stage in the processing pipeline at which samples are combined, which mitigates sample-specific processing losses. The drawback is the upfront cost of properly labeling cells by growing them in expensive isotope-enriched media.

1.5 Absolute Site-Specific Quantification

1.5.1 Motivation for Absolute Quantification

When trying to understand a signal-processing network, the most informative quantity is the absolute amount of peptide present under different sample conditions. This quantity is well suited for a wide range of mechanistic and statistical modeling because of the wealth of comparisons it permits between nodes in the network. Not only can the amount of a signal be compared to other treatment conditions within the same analysis, but also between multiple analyses. Inter-experiment comparison of a single signal alone is not sufficient motivation for absolute quantification, as a properly utilized reference sample can recapitulate the same information. Absolute quantification is indispensable when comparing between multiple signals across many biological conditions.

The technical issue is that measured signal intensity does not provide the absolute amount of the peptide. Many sources of peptide-specific losses exist throughout the sample processing procedure and ultimately contribute to sequence-specific signal intensity/amount conversions.
Some of these complications include peptide ionizability, losses during sample processing, and differential affinities for immunoprecipitation antibodies and metal ions. For these and other reasons, sequence-specific calibration must be employed to provide absolute quantification data to enhance modeling efforts.

1.5.2 Previous Methods

All absolute quantification techniques are based on comparing the integrated area under the signal intensity trace corresponding to specific ions. Some work at the MS1 level by collecting ion intensity information for the intact endogenous peptide for comparison to that of the standard peptide. Others that involve isotopic labeling rely upon comparison between fragment ions correlated to peptide abundance. Each has its associated strengths and weaknesses.

1.5.2.1 Single Point Internal Methods

Single point internal methods add a known amount of a standard to the sample. Differential labeling is required to avoid obscuring the endogenous signal and can be accomplished by a number of methods. The added internal standard can be heavy-labeled so as to shift the mass and avoid conflict with the endogenous version. Alternatively, the standard peptides can be distinctly labeled with an isotopic tag.

1.5.2.1.1 iTRAQ Internal Standard

In the seminal paper for the iTRAQ labeling reagent [289], Ross and Pappin proposed that their new technique could be used not only for relative but also for absolute quantification. They envisioned reserving a single tag of the family to label a known amount of a peptide assumed to be present in the samples labeled with the remaining tags. This way, the amount of peptide present in each sample could be calculated from the relative peak intensities resulting from the tags. The strength of this method over traditional area under the curve (AUC) calibration is that the standard peptide is partially in the sample context.
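The single point calculation described above can be written out as a short sketch. It assumes that reporter ion intensity is directly proportional to peptide amount (a linear response through the origin), which is the implicit assumption of any single point calibration. The channel labels, intensities, and the 100 fmol standard amount are hypothetical numbers used only for illustration.

```python
def single_point_amounts(reporter_intensities, standard_channel, standard_amount):
    """Estimate the peptide amount in each sample channel from one standard channel.

    reporter_intensities: dict mapping channel label -> reporter ion intensity.
    standard_channel:     label of the channel carrying the known standard.
    standard_amount:      known amount of standard peptide (e.g. in fmol).
    """
    standard_signal = reporter_intensities[standard_channel]
    if standard_signal <= 0:
        raise ValueError("standard channel has no signal")
    # Amount per unit intensity, taken from the single calibration point.
    response = standard_amount / standard_signal
    return {
        channel: intensity * response
        for channel, intensity in reporter_intensities.items()
        if channel != standard_channel
    }


if __name__ == "__main__":
    # Hypothetical 4-plex experiment with channel 117 reserved for the standard.
    intensities = {"114": 2.0e4, "115": 5.0e4, "116": 1.0e5, "117": 4.0e4}
    print(single_point_amounts(intensities, "117", standard_amount=100.0))
```

With these invented numbers the three sample channels come out at roughly 50, 125, and 250 fmol. The calculation relies on the standard peptide being labeled and pooled with the samples.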
The standard is analyzed simultaneously with the samples, allowing compensation for run-specific variation in chromatography or charge availability. Any peptide-specific losses that occur post-labeling will be accounted for by corresponding losses of the standard. Additionally, since the iTRAQ labeling technique is being used, the standard peptide can be synthesized using light amino acids instead of the heavy-labeled ones necessary for other related calibration techniques. By including the isotopically-labeled peptide in the sample, the identity can further be confirmed by coelution of the standard and endogenous peptides.

The major weakness of this method is that it is a single point calibration. A single amount of the standard peptide must be chosen a priori to be as close as possible to the expected endogenous amounts. If the expected amount is drastically different or varies widely between samples, then the quality of calibration will suffer at the extremes. A partial remedy to this problem is to reserve more channels of the iTRAQ experiment for standard peptides at different concentrations, a losing proposition as fewer samples may be analyzed simultaneously.

1.5.2.1.2 Heavy-Labeled

Heavy-labeled peptide standards match the sequence and post-translational modification state of the peptide of interest. The only change is that a residue of the sequence has had its C12 replaced with C13, N14 with N15, O16 with O18, and/or H1 with H2. The heavy-labeled standard peptide now has a slight mass increase that shifts it beyond the isotope series of its endogenous counterpart. A known amount of the standard can then be included in the sample prior to any processing without fear of contaminating the ultimate endogenous signal intensity.

There are a number of benefits to using a heavy-labeled standard approach for absolute quantification.
The first benefit is that the standard may be included prior to any sample processing and will experience the same sequence-specific losses as the endogenous peptide, other than those from proteolytic enzyme cleavage (which have been shown to be insignificant under the proper digestion conditions [12]). Second, since the peptide being included is heavy-labeled, it is already distinguishable from the endogenous version and no further labeling technique is required. Third, the endogenous and standard peptides share the same amino acid sequence, so they interact with the chromatographic packing material identically. Hence, coelution of the two is expected and can be used as further verification of endogenous peptide identity.

The heavy-labeled peptide standard technique suffers from the same single-point calibration concerns as the covalent modification techniques. As before, there is a way to circumvent them: include more than one type of heavy-labeled standard at measured amounts. The multiple-point calibration can then be carried out in the MS1 scan within the dynamic range limits of trap-based instruments. The issue is the production cost of heavy-labeled standards, which are orders of magnitude more expensive to synthesize than regular peptides due to the purification necessary to ensure fully C13-incorporated amino acids.

1.5.2.2 External Standard

Using an external standard is the most straightforward way to do absolute quantification. Known amounts of synthetic peptide are loaded and analyzed. This technique is widely used for Amino Acid Analysis (AAA), where peptides are hydrolyzed so that their constituent amino acids can be calibrated against signal intensity from known amounts. The success of the methodology in measuring intact peptides has been established [294].

The major strength of this method is that it is a multiple-point calibration. Amounts spanning many orders of magnitude can be measured to better understand the relationship between signal intensity and amount.
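A multiple-point standard curve of this kind can be sketched as a least-squares line fitted in log-log space, which keeps amounts spanning several orders of magnitude on an equal footing. The choice of a log-log linear model, the function names, and the synthetic calibration data below are assumptions made for illustration, not a description of any specific published procedure.

```python
import math


def fit_log_log(amounts, intensities):
    """Least-squares fit of log10(intensity) = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in amounts]
    ys = [math.log10(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_intensity(intensity, slope, intercept):
    """Invert the fitted curve to estimate an amount from a measured intensity."""
    return 10 ** ((math.log10(intensity) - intercept) / slope)


if __name__ == "__main__":
    # Synthetic standard curve (hypothetical): intensity = 2000 x amount in fmol.
    amounts = [1, 10, 100, 1000]
    intensities = [2e3, 2e4, 2e5, 2e6]
    slope, intercept = fit_log_log(amounts, intensities)
    print(amount_from_intensity(6e4, slope, intercept))
```

For the synthetic data above, an intensity of 6e4 maps back to about 30 fmol. The inversion is only meaningful for intensities that fall within the range covered by the standards.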
The noise floor and the saturation point can be individually determined for each peptide. Moreover, multiple-point calibration is achieved without the need for expensive isotopic labeling reagents or isotopically pure amino acid preparations.

The technical drawbacks of using external standard intensities for absolute quantification arise from the lack of sample context. When analyzing complex biological samples, many factors may influence a peptide’s signal intensity. Competition for charge with coeluting peptides, cleanliness of the instrument, and chromatographic variations may all influence the observed intensity and introduce error in calibration.

The other consideration with external calibration techniques is the total experimental time required. Before absolute data can be collected, a standard curve for each peptide of interest must be produced. Some parallelization is possible, in that multiple peptide sequences can be measured in the same analysis. However, each amount must be run in a separate analysis to properly determine its individual contribution. In its typical manifestation, the external standard calibration method is used to measure a single sample, which means that examining a wide range of experimental conditions in series takes considerably more time than multiplex labeling techniques.

1.6 Bibliography

1. Taylor, S. S. & Kornev, A. P. Protein kinases: evolution of dynamic regulatory proteins. Trends Biochem. Sci. 36, 65–77 (2011).
2. Ozlu, N. et al. Phosphoproteomics. Wiley Interdiscip Rev Syst Biol Med 2, 255–276 (2010).
3. Pawson, T. & Nash, P. Protein-protein interactions define specificity in signal transduction. Genes Dev. 14, 1027–1047 (2000).
4. Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation. Pharmacol. Res. 66, 105–143 (2012).
5. Wheeler, D. L., Iida, M. & Dunn, E. F. The role of Src in solid tumors. Oncologist 14, 667–678 (2009).
6. Gan, H. K., Cvrljevic, A. N. & Johns, T. G.
The epidermal growth factor receptor variant III (EGFRvIII): where wild things are altered. FEBS J. 280, 5350–5370 (2013).
7. Naegle, K. M. et al. PTMScout, a Web resource for analysis of high throughput post-translational proteomics studies. Molecular & Cellular Proteomics 9, 2558–2570 (2010).
8. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).
9. Yates, J. R., Eng, J. K., McCormack, A. L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal. Chem. 67, 1426–1436 (1995).
10. Curran, T. G., Bryson, B. D., Reigelhaupt, M., Johnson, H. & White, F. M. Computer aided manual validation of mass spectrometry-based proteomic data. Methods 61, 219–226 (2013).
11. Chen, W. W. et al. Input-output behavior of ErbB signaling pathways as revealed by a mass action model trained against dynamic data. Mol Syst Biol 5, 239 (2009).
12. Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U.S.A. 100, 6940–6945 (2003).
13. Schmidt, A. et al. Absolute quantification of microbial proteomes at different states by directed mass spectrometry. Mol Syst Biol 7, 510 (2011).
14. Ferlay, J. et al. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int. J. Cancer 127, 2893–2917 (2010).
15. Jemal, A. et al. Global cancer statistics. CA: A Cancer Journal for Clinicians 61, 69–90 (2011).
16. Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57–70 (2000).
17. Hanahan, D. & Weinberg, R. A. Hallmarks of Cancer: The Next Generation. Cell 144, 646–674 (2011).
18. Ferlay, J., Héry, C., Autier, P. & Sankaranarayanan, R. in 1–19 (Springer New York, 2009). doi:10.1007/978-1-4419-0685-4_1
19. Holliday, D. L. & Speirs, V.
Choosing the right cell line for breast cancer research. Breast Cancer Res. 13, 215 (2011).
20. Neve, R. M. et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell 10, 515–527 (2006).
21. Fridlyand, J. et al. Breast tumor copy number aberration phenotypes and genomic instability. BMC Cancer 6, 96 (2006).
22. Perou, C. M. et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc. Natl. Acad. Sci. U.S.A. 96, 9212–9217 (1999).
23. Soule, H. D. et al. Isolation and characterization of a spontaneously immortalized human breast epithelial cell line, MCF-10. Cancer Research 50, 6075–6086 (1990).
24. Beerli, R. R. & Hynes, N. E. Epidermal growth factor-related peptides activate distinct subsets of ErbB receptors and differ in their biological activities. The Journal of Biological Chemistry 271, 6071–6076 (1996).
25. Tsutsui, S., Ohno, S., Murakami, S., Hachitanda, Y. & Oda, S. Prognostic value of epidermal growth factor receptor (EGFR) and its relationship to the estrogen receptor status in 1029 patients with breast cancer. Breast Cancer Res. Treat. 71, 67–75 (2002).
26. Ferrero, J. M. et al. Epidermal growth factor receptor expression in 780 breast cancer patients: a reappraisal of the prognostic value based on an eight-year median follow-up. Ann. Oncol. 12, 841–846 (2001).
27. Rampaul, R. S. et al. Epidermal growth factor receptor status in operable invasive breast cancer: is it of any prognostic value? Clin. Cancer Res. 10, 2578 (2004).
28. Bhargava, R. et al. EGFR gene amplification in breast cancer: correlation with epidermal growth factor receptor mRNA and protein expression and HER-2 status and absence of EGFR-activating mutations. Mod Pathol 18, 1027–1033 (2005).
29. Rimawi, M. F. et al. Epidermal growth factor receptor expression in breast cancer association with biologic phenotype and clinical outcomes. Cancer 116, 1234–1242 (2010).
30. Ciardiello, F.
Epidermal growth factor receptor tyrosine kinase inhibitors as anticancer agents. Drugs 60 Suppl 1, 25–32; discussion 41–2 (2000).
31. Mendelsohn, J. & Baselga, J. The EGF receptor family as targets for cancer therapy. Oncogene 19, 6550–6565 (2000).
32. Agrawal, A. Overview of tyrosine kinase inhibitors in clinical breast cancer. Endocr. Relat. Cancer 12, S135–S144 (2005).
33. Wakeling, A. E. et al. ZD1839 (Iressa): An Orally Active Inhibitor of Epidermal Growth Factor Signaling with Potential for Cancer Therapy. Cancer Research 62, 5749–5754 (2002).
34. Knowlden, J. M. et al. Elevated levels of epidermal growth factor receptor/c-erbB2 heterodimers mediate an autocrine growth regulatory pathway in tamoxifen-resistant MCF-7 cells. Endocrinology 144, 1032–1044 (2003).
35. Rusnak, D. W. et al. The Effects of the Novel, Reversible Epidermal Growth Factor Receptor/ErbB-2 Tyrosine Kinase Inhibitor, GW2016, on the Growth of Human Normal and Tumor-derived Cell Lines. Molecular Cancer Therapeutics 85–94 (2001).
36. Cohen, S. Isolation of a Mouse Submaxillary Gland Protein Accelerating Incisor Eruption and Eyelid Opening in the New-born Animal. The Journal of Biological Chemistry 237, 1555–1562 (1962).
37. King, L. E., Carpenter, G. & Cohen, S. Characterization by electrophoresis of epidermal growth factor stimulated phosphorylation using A-431 membranes. Biochemistry 19, 1524–1528 (1980).
38. Carpenter, G., King, L. & Cohen, S. Epidermal growth factor stimulates phosphorylation in membrane preparations in vitro. Nature 276, 409–410 (1978).
39. Ushiro, H. & Cohen, S. Identification of phosphotyrosine as a product of epidermal growth factor-activated protein kinase in A-431 cell membranes. The Journal of Biological Chemistry 255, 8363–8365 (1980).
40. Ullrich, A. et al. Human epidermal growth factor receptor cDNA sequence and aberrant expression of the amplified gene in A431 epidermoid carcinoma cells. Nature 309, 418–425 (1984).
41. Downward, J. et al.
Close similarity of epidermal growth factor receptor and v-erb-B oncogene protein sequences. Nature 307, 521–527 (1984).
42. Downward, J., Parker, P. & Waterfield, M. D. Autophosphorylation sites on the epidermal growth factor receptor. Nature 311, 483–485 (1984).
43. Honnegger, A. M., Kris, M. G., Ullrich, A. & Schlessinger, J. Evidence that autophosphorylation of solubilized receptors for epidermal growth factor is mediated by intermolecular cross-phosphorylation. 1–5 (1989).
44. Carpenter, G. & Cohen, S. Epidermal growth factor. The Journal of Biological Chemistry 265, 7709–7712 (1990).
45. Heath, W. F. & Merrifield, R. B. A synthetic approach to structure-function relationships in the murine epidermal growth factor molecule. Proc. Natl. Acad. Sci. U.S.A. 83, 6367–6371 (1986).
46. Tam, J. P., Sheikh, M. A., Solomon, D. S. & Ossowski, L. Efficient synthesis of human type alpha transforming growth factor: its physical and biological characterization. Proc. Natl. Acad. Sci. U.S.A. 83, 8082–8086 (1986).
47. Cooke, R. M. et al. The solution structure of human epidermal growth factor. Nature 327, 339–341 (1987).
48. Montelione, G. T., Wüthrich, K., Nice, E. C., Burgess, A. W. & Scheraga, H. A. Solution structure of murine epidermal growth factor: determination of the polypeptide backbone chain-fold by nuclear magnetic resonance and distance geometry. Proc. Natl. Acad. Sci. U.S.A. 84, 5226–5230 (1987).
49. Montelione, G. T. et al. Sequence-specific 1H-NMR assignments and identification of two small antiparallel beta-sheets in the solution structure of recombinant human transforming growth factor alpha. Proc. Natl. Acad. Sci. U.S.A. 86, 1519–1523 (1989).
50. Tappin, M. J., Cooke, R. M., Fitton, J. E. & Campbell, I. D. A high-resolution 1H-NMR study of human transforming growth factor alpha. Structure and pH-dependent conformational interconversion. Eur. J. Biochem. 179, 629–637 (1989).
51. Klein, P., Mattoon, D., Lemmon, M. A. & Schlessinger, J.
A structure-based model for ligand binding and dimerization of EGF receptors. Proc. Natl. Acad. Sci. U.S.A. 101, 929–934 (2004).
52. Macdonald, J. L. & Pike, L. J. Heterogeneity in EGF-binding affinities arises from negative cooperativity in an aggregating system. Proc. Natl. Acad. Sci. U.S.A. 105, 112–117 (2008).
53. Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579 (2010).
54. Bargmann, C. I., Hung, M. C. & Weinberg, R. A. Multiple independent activations of the neu oncogene by a point mutation altering the transmembrane domain of p185. Cell 45, 649–657 (1986).
55. Lu, C. et al. Structural evidence for loose linkage between ligand binding and kinase activation in the epidermal growth factor receptor. Mol. Cell. Biol. 30, 5432–5443 (2010).
56. Scheck, R. A., Lowder, M. A., Appelbaum, J. S. & Schepartz, A. Bipartite Tetracysteine Display Reveals Allosteric Control of Ligand-Specific EGFR Activation. ACS Chem. Biol. 7, 1367–1376 (2012).
57. Woo, D. D. et al. Differential phosphorylation of the progesterone receptor by insulin, epidermal growth factor, and platelet-derived growth factor receptor tyrosine protein kinases. The Journal of Biological Chemistry 261, 460–467 (1986).
58. Lemmon, M. A. & Schlessinger, J. Cell signaling by receptor tyrosine kinases. Cell 141, 1117–1134 (2010).
59. Garrett, T. P. J. et al. Crystal structure of a truncated epidermal growth factor receptor extracellular domain bound to transforming growth factor alpha. Cell 110, 763–773 (2002).
60. Schlessinger, J. Ligand-induced, receptor-mediated dimerization and activation of EGF receptor. Cell 110, 669–672 (2002).
61. Burgess, A. W. et al. An open-and-shut case? Recent insights into the activation of EGF/ErbB receptors. Molecular Cell 12, 541–552 (2003).
62. Ferguson, K. M. et al. EGF activates its receptor by removing interactions that autoinhibit ectodomain dimerization.
Molecular Cell 11, 507–517 (2003).
63. Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 439, 168–174 (2005).
64. Schlessinger, J. et al. Regulation of cell proliferation by epidermal growth factor. CRC Crit. Rev. Biochem. 14, 93–111 (1983).
65. Sewell, J., Smyth, J. & Langdon, S. Role of TGFα stimulation of the ERK, PI3 kinase and PLCγ pathways in ovarian cancer growth and migration. Experimental Cell Research 304, 305–316 (2005).
66. McClintock, J. L. & Ceresa, B. P. Transforming Growth Factor-α Enhances Corneal Epithelial Cell Migration by Promoting EGFR Recycling. Investigative Ophthalmology & Visual Science 51, 3455–3461 (2010).
67. Joslin, E. J., Opresko, L. K., Wells, A., Wiley, H. S. & Lauffenburger, D. A. EGF-receptor-mediated mammary epithelial cell migration is driven by sustained ERK signaling from autocrine stimulation. Journal of Cell Science 120, 3688–3699 (2007).
68. Mukhopadhyay, C., Zhao, X., Maroni, D., Band, V. & Naramura, M. Distinct Effects of EGFR Ligands on Human Mammary Epithelial Cell Differentiation. PLoS ONE 8, e75907 (2013).
69. Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting of the Receptor. Traffic 10, 1115–1127 (2009).
70. Wilson, K. J., Gilmore, J. L., Foley, J., Lemmon, M. A. & Riese, D. J. Functional selectivity of EGF family peptide growth factors: implications for cancer. Pharmacol. Ther. 122, 1–8 (2009).
71. Harris, R. C., Chung, E. & Coffey, R. J. EGF receptor ligands. Experimental Cell Research 284, 2–13 (2003).
72. Iwamoto, R. & Mekada, E. Heparin-binding EGF-like growth factor: a juxtacrine growth factor. Cytokine Growth Factor Rev. 11, 335–344 (2000).
73. Sahin, U. et al. Distinct roles for ADAM10 and ADAM17 in ectodomain shedding of six EGFR ligands. J. Cell Biol. 164, 769–779 (2004).
74. Lu, X. et al.
ADAMTS1 and MMP1 proteolytically engage EGF-like ligands in an osteolytic signaling cascade for bone metastasis. Genes Dev. 23, 1882–1894 (2009).
75. Chokki, M., Eguchi, H., Hamamura, I., Mitsuhashi, H. & Kamimura, T. Human airway trypsin-like protease induces amphiregulin release through a mechanism involving protease-activated receptor-2-mediated ERK activation and TNF alpha-converting enzyme activity in airway epithelial cells. FEBS J. 272, 6387–6399 (2005).
76. Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
77. Grøvdal, L. M., Stang, E., Sorkin, A. & Madshus, I. H. Direct interaction of Cbl with pTyr 1045 of the EGF receptor (EGFR) is required to sort the EGFR to lysosomes for degradation. Experimental Cell Research 300, 388–395 (2004).
78. Chung, E., Graves-Deal, R., Franklin, J. L. & Coffey, R. J. Differential effects of amphiregulin and TGF-α on the morphology of MDCK cells. Experimental Cell Research 309, 149–160 (2005).
79. Citri, A. & Yarden, Y. EGF–ERBB signalling: towards the systems level. Nat Rev Mol Cell Biol 7, 505–516 (2006).
80. Pawson, T. Protein modules and signalling networks. Nature 373, 573–580 (1995).
81. Nickerson, N. K. et al. in (Gunduz, M. & Gunduz, E.) 1–31 (InTech, 2014).
82. Witsch, E., Sela, M. & Yarden, Y. Roles for growth factors in cancer progression. Physiology (Bethesda) 25, 85–101 (2010).
83. Yarden, Y. & Sliwkowski, M. X. Untangling the ErbB signalling network. Nat Rev Mol Cell Biol 2, 127–137 (2001).
84. Avraham, R. & Yarden, Y. Feedback regulation of EGFR signalling: decision making by early and delayed loops. Nat Rev Mol Cell Biol 12, 104–117 (2011).
85. Todaro, G. J., Fryling, C. & De Larco, J. E. Transforming growth factors produced by certain human tumor cells: polypeptides that interact with epidermal growth factor receptors. Proc. Natl. Acad. Sci. U.S.A. 77, 5258–5262 (1980).
86. Yasui, W. et al.
Expression of transforming growth factor alpha in human tissues: immunohistochemical study and northern blot analysis. Virchows Arch A Pathol Anat Histopathol 421, 513–519 (1992).
87. Jones, J. T., Akita, R. W. & Sliwkowski, M. X. Binding specificities and affinities of egf domains for ErbB receptors. FEBS Letters 447, 227–231 (1999).
88. Winkler, M. E., O'Connor, L., Winget, M. & Fendly, B. Epidermal growth factor and transforming growth factor alpha bind differently to the epidermal growth factor receptor. Biochemistry 28, 6373–6378 (1989).
89. Ebner, R. & Derynck, R. Epidermal growth factor and transforming growth factor-alpha: differential intracellular routing and processing of ligand-receptor complexes. Cell Regul. 2, 599–612 (1991).
90. French, A. R., Tadaki, D. K., Niyogi, S. K. & Lauffenburger, D. A. Intracellular trafficking of epidermal growth factor family ligands is directly influenced by the pH sensitivity of the receptor/ligand interaction. The Journal of Biological Chemistry 270, 4334–4340 (1995).
91. Waterman, H. Alternative Intracellular Routing of ErbB Receptors May Determine Signaling Potency. Journal of Biological Chemistry 273, 13819–13827 (1998).
92. Reddy, C. C., Niyogi, S. K., Wiley, H. S. & Lauffenburger, D. A. Engineering epidermal growth factor for enhanced mitogenic potency. Nature Biotechnology 14, 1696–1700 (1996).
93. Shoyab, M., Plowman, G. D., McDonald, V. L., Bradley, J. G. & Todaro, G. J. Structure and function of human amphiregulin: a member of the epidermal growth factor family. Science 243, 1074–1076 (1989).
94. Shoyab, M., McDonald, V. L., Bradley, J. G. & Todaro, G. J. Amphiregulin: A bifunctional growth-modulating glycoprotein produced by the phorbol 12-myristate 13-acetate-treated human breast adenocarcinoma cell line MCF-7. Proc. Natl. Acad. Sci. U.S.A. 85, 6528–6532 (1988).
95. Barnhard, J. A. et al. Auto- and Cross-induction within the Mammalian Epidermal Growth Factor-related Peptide Family.
The Journal of Biological Chemistry 269, 22817–22822 (1994).
96. Cook, P. W. et al. A heparin sulfate-regulated human keratinocyte autocrine factor is similar or identical to amphiregulin. 11, 2547–2557 (2003).
97. Neelam, B. et al. Structure-Function Studies of Ligand-Induced Epidermal Growth Factor Receptor Dimerization. Biochemistry 37, 4884–4891 (1998).
98. Johnson, G. R. & Wong, L. Heparan Sulfate Is Essential to Amphiregulin-induced Mitogenic Signaling by the Epidermal Growth Factor Receptor. The Journal of Biological Chemistry 169, 27149–27154 (1994).
99. Willmarth, N. E. & Ethier, S. P. Amphiregulin as a Novel Target for Breast Cancer Therapy. J Mammary Gland Biol Neoplasia 13, 171–179 (2008).
100. Willmarth, N. E. et al. Cellular Signalling. Cellular Signalling 21, 212–219 (2009).
101. Stern, K. A., Place, T. L. & Lill, N. L. EGF and amphiregulin differentially regulate Cbl recruitment to endosomes and EGF receptor fate. Biochem. J. 410, 585–594 (2008).
102. Wortzel, I. & Seger, R. The ERK Cascade: Distinct Functions within Various Subcellular Organelles. Genes Cancer 2, 195–209 (2011).
103. Roskoski, R. RAF protein-serine/threonine kinases: structure and regulation. Biochem. Biophys. Res. Commun. 399, 313–317 (2010).
104. Roskoski, R. MEK1/2 dual-specificity protein kinases: structure and regulation. Biochem. Biophys. Res. Commun. 417, 5–10 (2012).
105. Fujioka, A. et al. Dynamics of the Ras/ERK MAPK cascade as monitored by fluorescent probes. The Journal of Biological Chemistry 281, 8917–8926 (2006).
106. Yoon, S. & Seger, R. The extracellular signal-regulated kinase: multiple substrates regulate diverse cellular functions. Growth Factors 24, 21–44 (2006).
107. Kumar, A., Rajendran, V., Sethumadhavan, R. & Purohit, R. AKT kinase pathway: a leading target in cancer research. ScientificWorldJournal 2013, 756134 (2013).
108. Bardwell, L. & Shah, K. Analysis of mitogen-activated protein kinase activation and interactions with regulators and substrates.
Methods 40, 213–223 (2006). 109. Lefloch, R., Pouysségur, J. & Lenormand, P. Total ERK1/2 activity regulates cell proliferation. Cell Cycle (Georgetown, Tex.) 8, 705–711 (2009). 110. Bos, J. L. ras oncogenes in human cancer: a review. Cancer Research 49, 4682–4689 (1989). 111. Davies, H. et al. Mutations of the BRAF gene in human cancer. Nature 417, 949–954 (2002). 112. Vakiani, E. & Solit, D. B. KRAS and BRAF: drug targets and predictive biomarkers. J. Pathol. 223, 220–230 (2010). 113. Garnett, M. J. & Marais, R. Guilty as charged: B-RAF is a human oncogene. Cancer Cell 6, 313–319 (2004). 114. Lloyd, A. C. Distinct functions for ERKs? J. Biol. 5, 13 (2006). 115. Lefloch, R., Pouysségur, J. & Lenormand, P. Single and combined silencing of ERK1 and ERK2 reveals their positive contribution to growth signaling depending on their expression levels. Mol. Cell. Biol. 28, 511–527 (2008). 116. Adams, J. A. Activation Loop Phosphorylation and Catalysis in Protein Kinases: Is There Functional Evidence for the Autoinhibitor Model? Biochemistry 42, 601–607 (2003). 117. Haystead, T. A., Dent, P., Wu, J., Haystead, C. M. & Sturgill, T. W. Ordered phosphorylation of p42mapk by MAP kinase kinase. FEBS Letters 306, 17–22 (1992). 118. Burack, W. R. & Sturgill, T. W. The activating dual phosphorylation of MAPK by MEK is nonprocessive. Biochemistry 36, 5929–5933 (1997). 119. Ferrell, J. E. & Bhatt, R. R. Mechanistic studies of the dual phosphorylation of mitogen-activated protein kinase. The Journal of Biological Chemistry 272, 19008–19016 (1997). 120. Prowse, C. N. & Lew, J. Mechanism of activation of ERK2 by dual phosphorylation. The Journal of Biological Chemistry 276, 99–103 (2001). 121. Coffer, P. J. & Woodgett, J. R. Molecular cloning and characterisation of a novel putative protein-serine kinase related to the cAMP-dependent and protein kinase C families. Eur. J. Biochem. 201, 475–481 (1991). 122. Jones, P. F., Jakubowicz, T., Pitossi, F. J., Maurer, F. & Hemmings, B. A.
Molecular cloning and identification of a serine/threonine protein kinase of the second-messenger subfamily. Proc. Natl. Acad. Sci. U.S.A. 88, 4171–4175 (1991). 123. Cheng, J. Q. et al. AKT2, a putative oncogene encoding a member of a subfamily of protein-serine/threonine kinases, is amplified in human ovarian carcinomas. Proc. Natl. Acad. Sci. U.S.A. 89, 9267–9271 (1992). 124. Konishi, H. et al. Molecular cloning and characterization of a new member of the RAC protein kinase family: association of the pleckstrin homology domain of three types of RAC protein kinase with protein kinase C subspecies and beta gamma subunits of G proteins. Biochem. Biophys. Res. Commun. 216, 526–534 (1995). 125. Bellacosa, A. et al. Structure, expression and chromosomal mapping of c-akt: relationship to v-akt and its implications. Oncogene 8, 745–754 (1993). 126. Altomare, D. A. et al. Cloning, chromosomal localization and expression analysis of the mouse Akt2 oncogene. Oncogene 11, 1055–1060 (1995). 127. Altomare, D. A., Lyons, G. E., Mitsuuchi, Y., Cheng, J. Q. & Testa, J. R. Akt2 mRNA is highly expressed in embryonic brown fat and the AKT2 kinase is activated by insulin. Oncogene 16, 2407–2411 (1998). 128. Franke, T. F., Tartof, K. D. & Tsichlis, P. N. The SH2-like Akt homology (AH) domain of c-akt is present in multiple copies in the genome of vertebrate and invertebrate eucaryotes. Cloning and characterization of the Drosophila melanogaster c-akt homolog Dakt1. Oncogene 9, 141–148 (1994). 129. Paradis, S. & Ruvkun, G. Caenorhabditis elegans Akt/PKB transduces insulin receptor-like signals from AGE-1 PI3 kinase to the DAF-16 transcription factor. Genes Dev. 12, 2488–2498 (1998). 130. Datta, S. R., Brunet, A. & Greenberg, M. E. Cellular survival: a play in three Akts. Genes Dev. 13, 2905–2927 (1999). 131. Bellacosa, A., Testa, J. R., Staal, S. P. & Tsichlis, P. N. A retroviral oncogene, akt, encoding a serine-threonine kinase containing an SH2-like region.
Science 254, 274–277 (1991). 132. Alessi, D. R. et al. Mechanism of activation of protein kinase B by insulin and IGF-1. EMBO J. 15, 6541–6551 (1996). 133. Nishida, K., Kaziro, Y. & Satoh, T. Anti-apoptotic function of Rac in hematopoietic cells. Oncogene 18, 407–415 (1999). 134. Rameh, L. E. & Cantley, L. C. The role of phosphoinositide 3-kinase lipid products in cell function. The Journal of Biological Chemistry 274, 8347–8350 (1999). 135. Frech, M. et al. High affinity binding of inositol phosphates and phosphoinositides to the pleckstrin homology domain of RAC/protein kinase B and their influence on kinase activity. The Journal of Biological Chemistry 272, 8474–8481 (1997). 136. Franke, T. F., Kaplan, D. R., Cantley, L. C. & Toker, A. Direct regulation of the Akt proto-oncogene product by phosphatidylinositol-3,4-bisphosphate. Science 275, 665–668 (1997). 137. Franke, T. F. et al. The protein kinase encoded by the Akt proto-oncogene is a target of the PDGF-activated phosphatidylinositol 3-kinase. Cell 81, 727–736 (1995). 138. Alessi, D. R. et al. Characterization of a 3-phosphoinositide-dependent protein kinase which phosphorylates and activates protein kinase Balpha. Curr. Biol. 7, 261–269 (1997). 139. Andjelkovic, M. et al. Role of translocation in the activation and function of protein kinase B. The Journal of Biological Chemistry 272, 31515–31524 (1997). 140. Datta, K. et al. AH/PH domain-mediated interaction between Akt molecules and its potential role in Akt regulation. Mol. Cell. Biol. 15, 2304–2310 (1995). 141. Kohn, A. D., Takeuchi, F. & Roth, R. A. Akt, a pleckstrin homology domain containing kinase, is activated primarily by phosphorylation. The Journal of Biological Chemistry 271, 21920–21926 (1996). 142. Staal, S. P. Molecular cloning of the akt oncogene and its human homologues AKT1 and AKT2: amplification of AKT1 in a primary human gastric adenocarcinoma. Proc. Natl. Acad. Sci. U.S.A. 84, 5034–5037 (1987). 143. Staal, S. P., Hartley, J. W.
& Rowe, W. P. Isolation of transforming murine leukemia viruses from mice with a high incidence of spontaneous lymphoma. Proc. Natl. Acad. Sci. U.S.A. 74, 3065–3067 (1977). 144. Burgering, B. M. & Coffer, P. J. Protein kinase B (c-Akt) in phosphatidylinositol-3-OH kinase signal transduction. Nature 376, 599–602 (1995). 145. Dudek, H. et al. Regulation of neuronal survival by the serine-threonine protein kinase Akt. Science 275, 661–665 (1997). 146. Coffer, P. J., Jin, J. & Woodgett, J. R. Protein kinase B (c-Akt): a multifunctional mediator of phosphatidylinositol 3-kinase activation. Biochem. J. 335 (Pt 1), 1–13 (1998). 147. Pignataro, O. P. & Ascoli, M. Epidermal growth factor increases the labeling of phosphatidylinositol 3,4-bisphosphate in MA-10 Leydig tumor cells. The Journal of Biological Chemistry 265, 1718–1723 (1990). 148. Cantley, L. C. The phosphoinositide 3-kinase pathway. Science 296, 1655–1657 (2002). 149. Yuan, T. L. & Cantley, L. C. PI3K pathway alterations in cancer: variations on a theme. Oncogene 27, 5497–5510 (2008). 150. Engelman, J. A., Luo, J. & Cantley, L. C. The evolution of phosphatidylinositol 3-kinases as regulators of growth and metabolism. Nat. Rev. Genet. 7, 606–619 (2006). 151. Crabbe, T., Welham, M. J. & Ward, S. G. The PI3K inhibitor arsenal: choose your weapon! Trends Biochem. Sci. 32, 450–456 (2007). 152. Wymann, M. P. et al. Wortmannin inactivates phosphoinositide 3-kinase by covalent modification of Lys-802, a residue involved in the phosphate transfer reaction. Mol. Cell. Biol. 16, 1722–1733 (1996). 153. Walker, E. H. et al. Structural determinants of phosphoinositide 3-kinase inhibition by wortmannin, LY294002, quercetin, myricetin, and staurosporine. Molecular Cell 6, 909–919 (2000). 154. Pelicci, G. et al. A novel transforming protein (SHC) with an SH2 domain is implicated in mitogenic signal transduction. Cell 70, 93–104 (1992). 155. Wills, M. K. B. & Jones, N.
Teaching an old dogma new tricks: twenty years of Shc adaptor signalling. Biochem. J. 447, 1–16 (2012). 156. Ventura, A., Luzi, L., Pacini, S., Baldari, C. T. & Pelicci, P. G. The p66Shc longevity gene is silenced through epigenetic modifications of an alternative promoter. The Journal of Biological Chemistry 277, 22370–22376 (2002). 157. Lai, K. M. & Pawson, T. The ShcA phosphotyrosine docking protein sensitizes cardiovascular signaling in the mouse embryo. Genes Dev. 14, 1132–1145 (2000). 158. Migliaccio, E. et al. Opposite effects of the p52shc/p46shc and p66shc splicing isoforms on the EGF receptor-MAP kinase-fos signalling pathway. EMBO J. 16, 706–716 (1997). 159. Ravichandran, K. S. Signaling via Shc family adapter proteins. Oncogene 20, 6322–6330 (2001). 160. Bhattacharyya, R. P., Reményi, A., Yeh, B. J. & Lim, W. A. Domains, motifs, and scaffolds: the role of modular interactions in the evolution and wiring of cell signaling circuits. Annu. Rev. Biochem. 75, 655–680 (2006). 161. Luzi, L., Confalonieri, S., Di Fiore, P. P. & Pelicci, P. G. Evolution of Shc functions from nematode to human. Curr. Opin. Genet. Dev. 10, 668–674 (2000). 162. van der Geer, P., Wiley, S., Gish, G. D. & Pawson, T. The Shc adaptor protein is highly phosphorylated at conserved, twin tyrosine residues (Y239/240) that mediate protein-protein interactions. Curr. Biol. 6, 1435–1444 (1996). 163. Charest, A., Wagner, J., Jacob, S., McGlade, C. J. & Tremblay, M. L. Phosphotyrosine-independent binding of SHC to the NPLH sequence of murine protein-tyrosine phosphatase-PEST. Evidence for extended phosphotyrosine binding/phosphotyrosine interaction domain recognition specificity. The Journal of Biological Chemistry 271, 8424–8429 (1996). 164. Habib, T., Herrera, R. & Decker, S. J. Activators of protein kinase C stimulate association of Shc and the PEST tyrosine phosphatase. The Journal of Biological Chemistry 269, 25243–25246 (1994). 165. Gu, J. et al.
Shc and FAK differentially regulate cell motility and directionality modulated by PTEN. J. Cell Biol. 146, 389–403 (1999). 166. Schneider, E. et al. Migration of renal tumor cells depends on dephosphorylation of Shc by PTEN. Int. J. Oncol. 38, 823–831 (2011). 167. Ugi, S., Imamura, T., Ricketts, W. & Olefsky, J. M. Protein phosphatase 2A forms a molecular complex with Shc and regulates Shc tyrosine phosphorylation and downstream mitogenic signaling. Mol. Cell. Biol. 22, 2375–2387 (2002). 168. Kraut-Cohen, J., Muller, W. J. & Elson, A. Protein-tyrosine phosphatase epsilon regulates Shc signaling in a kinase-specific manner: increasing coherence in tyrosine phosphatase signaling. The Journal of Biological Chemistry 283, 4612–4621 (2008). 169. Zhou, M. M. et al. Structure and ligand recognition of the phosphotyrosine binding domain of Shc. Nature 378, 584–592 (1995). 170. Ravichandran, K. S. et al. Evidence for a requirement for both phospholipid and phosphotyrosine binding via the Shc phosphotyrosine-binding domain in vivo. Mol. Cell. Biol. 17, 5540–5549 (1997). 171. Dengjel, J. et al. Quantitative proteomic assessment of very early cellular signaling events. Nat. Biotechnol. 25, 566–568 (2007). 172. Dankort, D. L., Wang, Z., Blackmore, V., Moran, M. F. & Muller, W. J. Distinct tyrosine autophosphorylation sites negatively and positively modulate neu-mediated transformation. Mol. Cell. Biol. 17, 5410–5425 (1997). 173. Dankort, D. et al. Grb2 and Shc Adapter Proteins Play Distinct Roles in Neu (ErbB-2)-Induced Mammary Tumorigenesis: Implications for Human Breast Cancer. Mol. Cell. Biol. 21, 1540–1551 (2001). 174. Ursini-Siegel, J. et al. ShcA signalling is essential for tumour progression in mouse models of human breast cancer. EMBO J. 27, 910–920 (2008). 175. Oksvold, M. P., Skarpen, E., Lindeman, B., Roos, N. & Huitfeldt, H. S. Immunocytochemical localization of Shc and activated EGF receptor in early endosomes after EGF stimulation of HeLa cells. J. Histochem.
Cytochem. 48, 21–33 (2000). 176. Sakaguchi, K., Okabayashi, Y. & Kasuga, M. Shc mediates ligand-induced internalization of epidermal growth factor receptors. Biochem. Biophys. Res. Commun. 282, 1154–1160 (2001). 177. Sato, K. et al. Adaptor protein Shc undergoes translocation and mediates upregulation of the tyrosine kinase c-Src in EGF-stimulated A431 cells. Genes Cells 5, 749–764 (2000). 178. Suenaga, A. et al. Tyr-317 phosphorylation increases Shc structural rigidity and reduces coupling of domain motions remote from the phosphorylation site as revealed by molecular dynamics simulations. The Journal of Biological Chemistry 279, 4657–4662 (2004). 179. Suenaga, A. et al. Molecular dynamics simulations reveal that Tyr-317 phosphorylation reduces Shc binding affinity for phosphotyrosyl residues of epidermal growth factor receptor. Biophys. J. 96, 2278–2288 (2009). 180. Okabayashi, Y. et al. Tyrosines 1148 and 1173 of activated human epidermal growth factor receptors are binding sites of Shc in intact cells. The Journal of Biological Chemistry 269, 18674–18678 (1994). 181. Gotoh, N. et al. Epidermal growth factor-receptor mutant lacking the autophosphorylation sites induces phosphorylation of Shc protein and Shc-Grb2/ASH association and retains mitogenic activity. Proc. Natl. Acad. Sci. U.S.A. 91, 167–171 (1994). 182. Maegawa, M. et al. Epidermal growth factor receptor lacking C-terminal autophosphorylation sites retains signal transduction and high sensitivity to epidermal growth factor receptor tyrosine kinase inhibitor. Cancer Sci. 100, 552–557 (2009). 183. Batzer, A. G., Rotin, D., Ureña, J. M., Skolnik, E. Y. & Schlessinger, J. Hierarchy of binding sites for Grb2 and Shc on the epidermal growth factor receptor. Mol. Cell. Biol. 14, 5192–5201 (1994). 184. Sakaguchi, K. et al. Shc phosphotyrosine-binding domain dominantly interacts with epidermal growth factor receptors and mediates Ras activation in intact cells. Mol. Endocrinol. 12, 536–543 (1998). 185.
Batzer, A. G., Blaikie, P., Nelson, K., Schlessinger, J. & Margolis, B. The phosphotyrosine interaction domain of Shc binds an LXNPXY motif on the epidermal growth factor receptor. Mol. Cell. Biol. 15, 4403–4409 (1995). 186. Soler, C., Beguinot, L. & Carpenter, G. Individual epidermal growth factor receptor autophosphorylation sites do not stringently define association motifs for several SH2-containing proteins. The Journal of Biological Chemistry 269, 12320–12324 (1994). 187. Thomas, D. & Bradshaw, R. A. Differential utilization of ShcA tyrosine residues and functional domains in the transduction of epidermal growth factor-induced mitogen-activated protein kinase activation in 293T cells and nerve growth factor-induced neurite outgrowth in PC12 cells. Identification of a new Grb2.Sos1 binding site. The Journal of Biological Chemistry 272, 22293–22299 (1997). 188. Gong, Y. & Zhao, X. Shc-dependent pathway is redundant but dominant in MAPK cascade activation by EGF receptors: a modeling inference. FEBS Letters 554, 467–472 (2003). 189. Holgado-Madruga, M., Emlet, D. R., Moscatello, D. K., Godwin, A. K. & Wong, A. J. A Grb2-associated docking protein in EGF- and insulin-receptor signaling. Nature 378, 560–565 (1996). 190. Gu, H. & Neel, B. G. The ‘Gab’ in signal transduction. Trends Cell Biol. 13, 122–130 (2003). 191. Schaeper, U. et al. Coupling of Gab1 to c-Met, Grb2, and Shp2 Mediates Biological Responses. J. Cell Biol. 149, 1419–1432 (2000). 192. Lock, L. S., Royal, I., Naujokas, M. A. & Park, M. Identification of an atypical Grb2 carboxyl-terminal SH3 domain binding site in Gab docking proteins reveals Grb2-dependent and -independent recruitment of Gab1 to receptor tyrosine kinases. The Journal of Biological Chemistry 275, 31536–31545 (2000). 193. Nguyen, L. et al. Association of the Multisubstrate Docking Protein Gab1 with the Hepatocyte Growth Factor Receptor Requires a Functional Grb2 Binding Site Involving Tyrosine 1356.
Journal of Biological Chemistry 272, 20811–20819 (1997). 194. Bardelli, A., Longati, P., Gramaglia, D., Stella, M. C. & Comoglio, P. M. Gab1 coupling to the HGF/Met receptor multifunctional docking site requires binding of Grb2 and correlates with the transforming potential. Oncogene 15, 3103–3111 (1997). 195. Hibi, M. & Hirano, T. Gab-family adapter molecules in signal transduction of cytokine and growth factor receptors, and T and B cell antigen receptors. Leuk. Lymphoma 37, 299–307 (2000). 196. Liu, Y. & Rohrschneider, L. R. The gift of Gab. FEBS Letters 515, 1–7 (2002). 197. Schutzman, J. L. et al. The Caenorhabditis elegans EGL-15 Signaling Pathway Implicates a DOS-Like Multisubstrate Adaptor Protein in Fibroblast Growth Factor Signal Transduction. Mol. Cell. Biol. 21, 8104–8116 (2001). 198. Abbeyquaye, T., Riesgo-Escovar, J., Raabe, T. & Thackeray, J. R. Evolution of Gab family adaptor proteins. Gene 311, 43–50 (2003). 199. Nishida, K. et al. Gab-Family Adaptor Proteins Act Downstream of Cytokine and Growth Factor Receptors and T- and B-Cell Antigen Receptors. Blood 93, 1809–1816 (1999). 200. Garcia-Guzman, M., Dolfi, F., Zeh, K. & Vuori, K. Met-induced JNK activation is mediated by the adapter protein Crk and correlates with the Gab1-Crk signaling complex formation. Oncogene 7775–7786 (1999). 201. Crouin, C., Arnaud, M., Gesbert, F., Camonis, J. & Bertoglio, J. A yeast two-hybrid study of human p97/Gab2 interactions with its SH2 domain-containing binding partners. FEBS Letters 148–153 (2001). 202. Sakkab, D. et al. Signaling of Hepatocyte Growth Factor/Scatter Factor (HGF) to the Small GTPase Rap1 via the Large Docking Protein Gab1 and the Adapter Protein CRKL. Journal of Biological Chemistry 275, 10772–10778 (2000). 203. Lecoq-Lafon, C. et al. Erythropoietin Induces the Tyrosine Phosphorylation of GAB1 and Its Association With SHC, SHP2, SHIP, and Phosphatidylinositol 3-Kinase. Blood 2578–2585 (1999). 204. Lee, A. W. M. & States, D. J.
Both Src-Dependent and -Independent Mechanisms Mediate Phosphatidylinositol 3-Kinase Regulation of Colony-Stimulating Factor 1-Activated Mitogen-Activated Protein Kinases in Myeloid Progenitors. Mol. Cell. Biol. 20, 6779–6798 (2000). 205. Yamasaki, S. Docking Protein Gab2 Is Phosphorylated by ZAP-70 and Negatively Regulates T Cell Receptor Signaling by Recruitment of Inhibitory Molecules. Journal of Biological Chemistry 276, 45175–45183 (2001). 206. Guo, L. et al. Studies of ligand-induced site-specific phosphorylation of epidermal growth factor receptor. J. Am. Soc. Mass Spectrom. 14, 1022–1031 (2003). 207. Nishida, K. & Hirano, T. The role of Gab family scaffolding adapter proteins in the signal transduction of cytokine and growth factor receptors. Cancer Sci. 94, 1029–1033 (2003). 208. Lowenstein, E. J. et al. The SH2 and SH3 Domain-Containing Protein GRB2 Links Receptor Tyrosine Kinases to ras Signaling. Cell 70, 431–442 (1992). 209. Buday, L. & Downward, J. Epidermal Growth Factor Regulates p21ras through the Formation of a Complex of Receptor, Grb2 Adapter Protein. Cell 73, 611–620 (1993). 210. Gale, N. W., Kaplan, S., Lowenstein, E. J., Schlessinger, J. & Bar-Sagi, D. Grb2 mediates the EGF-dependent activation of guanine nucleotide exchange on Ras. Nature 363, 88–92 (1993). 211. Simon, M. A., Bowtell, D. D., Dodson, G. S., Laverty, T. R. & Rubin, G. M. Ras1 and a Putative Guanine Nucleotide Exchange Factor Perform Crucial Steps in Signaling by the Sevenless Protein Tyrosine Kinase. Cell 67, 701–716 (1991). 212. Li, N. et al. Guanine-nucleotide-releasing factor hSos1 binds to Grb2 and links receptor tyrosine kinases to Ras signalling. Nature 363, 85–87 (1993). 213. Agazie, Y. M., Movilla, N., Ischenko, I. & Hayman, M. J. The phosphotyrosine phosphatase SHP2 is a critical mediator of transformation induced by the oncogenic fibroblast growth factor receptor 3. Oncogene 22, 6909–6918 (2003). 214. Skolnik, E. Y. et al. The SH2/SH3 domain-containing protein. EMBO J.
12, 1929–1936 (1993). 215. Lin, C.-C. et al. Inhibition of Basal FGF Receptor Signaling by Dimeric Grb2. Cell 149, 1514–1524 (2012). 216. Belov, A. A. & Mohammadi, M. Grb2, a Double-Edged Sword of Receptor Tyrosine Kinase Signaling. Sci Signal 5, pe49–pe49 (2012). 217. Sun, X. J. et al. Structure of the insulin receptor substrate IRS-1 defines a unique signal transduction protein. Nature 352, 73–77 (1991). 218. Sun, X. J. et al. Role of IRS-2 in insulin and cytokine signalling. Nature 377, 173–177 (1995). 219. Gibson, S. L., Ma, Z. & Shaw, L. M. Divergent Roles for IRS-1 and IRS-2 in Breast Cancer Metastasis. Cell Cycle (Georgetown, Tex.) 6, 631–637 (2007). 220. White, M. F. The IRS-signalling system: a network of docking proteins that mediate insulin action. Mol. Cell. Biochem. 182, 3–11 (1998). 221. White, M. F. The IRS-signalling system in insulin and cytokine action. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 351, 181–189 (1996). 222. Gustafson, T. A., He, W., Craparo, A., Schaub, C. D. & O'Neill, T. J. Phosphotyrosine-dependent interaction of SHC and insulin receptor substrate 1 with the NPEY motif of the insulin receptor via a novel non-SH2 domain. Mol. Cell. Biol. 15, 2500–2508 (1995). 223. Shaw, L. M. Identification of insulin receptor substrate 1 (IRS-1) and IRS-2 as signaling intermediates in the alpha6beta4 integrin-dependent activation of phosphoinositide 3-OH kinase and promotion of invasion. Mol. Cell. Biol. 21, 5082–5093 (2001). 224. Myers, M. G. et al. IRS-1 activates phosphatidylinositol 3'-kinase by associating with src homology 2 domains of p85. Proc. Natl. Acad. Sci. U.S.A. 89, 10350–10354 (1992). 225. Myers, M. G. et al. Role of IRS-1-GRB-2 complexes in insulin signaling. Mol. Cell. Biol. 14, 3577–3587 (1994). 226. Zhang, X. et al. Multiple signaling pathways are activated during insulin-like growth factor-I (IGF-I) stimulated breast cancer cell migration. Breast Cancer Res. Treat. 93, 159–168 (2005). 227. Withers, D. J. et al.
Disruption of IRS-2 causes type 2 diabetes in mice. Nature 391, 900–904 (1998). 228. Schubert, M. et al. Insulin receptor substrate-2 deficiency impairs brain growth and promotes tau phosphorylation. J. Neurosci. 23, 7084–7092 (2003). 229. Withers, D. J. et al. Irs-2 coordinates Igf-1 receptor-mediated beta-cell development and peripheral insulin signalling. Nat. Genet. 23, 32–40 (1999). 230. Burks, D. J. et al. IRS-2 pathways integrate female reproduction and energy homeostasis. Nature 407, 377–382 (2000). 231. Nagle, J. A., Ma, Z., Byrne, M. A., White, M. F. & Shaw, L. M. Involvement of insulin receptor substrate 2 in mammary tumor metastasis. Mol. Cell. Biol. 24, 9726–9735 (2004). 232. Ma, Z. et al. Suppression of insulin receptor substrate 1 (IRS-1) promotes mammary tumor metastasis. Mol. Cell. Biol. 26, 9338–9351 (2006). 233. Freeman, R. M., Plutzky, J. & Neel, B. G. Identification of a human src homology 2-containing protein-tyrosine-phosphatase: a putative homolog of Drosophila corkscrew. Proc. Natl. Acad. Sci. U.S.A. 89, 11239–11243 (1992). 234. Neel, B. G., Gu, H. & Pao, L. The ‘Shp’ing news: SH2 domain-containing tyrosine phosphatases in cell signaling. Trends Biochem. Sci. 28, 284–293 (2003). 235. Craggs, G. & Kellie, S. A functional nuclear localization sequence in the C-terminal domain of SHP-1. The Journal of Biological Chemistry 276, 23719–23725 (2001). 236. Yang, W., Tabrizi, M. & Yi, T. A bipartite NLS at the SHP-1 C-terminus mediates cytokine-induced SHP-1 nuclear localization in cell growth control. Blood Cells Mol. Dis. 28, 63–74 (2002). 237. Qu, C. K. The SHP-2 tyrosine phosphatase: signaling mechanisms and biological functions. Cell Res. 10, 279–288 (2000). 238. Li, W. et al. A new function for a phosphotyrosine phosphatase: linking GRB2-Sos to a receptor tyrosine kinase. Mol. Cell. Biol. 14, 509–517 (1994). 239. Xiao, S. et al. Syp (SH-PTP2) is a positive mediator of growth factor-stimulated mitogenic signal transduction.
The Journal of Biological Chemistry 269, 21244–21248 (1994). 240. Van Vactor, D., O'Reilly, A. M. & Neel, B. G. Genetic analysis of protein tyrosine phosphatases. Curr. Opin. Genet. Dev. 8, 112–126 (1998). 241. Feng, G. S. Shp-2 tyrosine phosphatase: signaling one cell or many. Experimental Cell Research 253, 47–54 (1999). 242. Barford, D. & Neel, B. G. Revealing mechanisms for SH2 domain mediated regulation of the protein tyrosine phosphatase SHP-2. Structure 6, 249–254 (1998). 243. Hof, P., Pluskey, S., Dhe-Paganon, S., Eck, M. J. & Shoelson, S. E. Crystal structure of the tyrosine phosphatase SHP-2. Cell 92, 441–450 (1998). 244. Langdon, W. Y., Hartley, J. W., Klinken, S. P., Ruscetti, S. K. & Morse, H. C. v-cbl, an oncogene from a dual-recombinant murine retrovirus that induces early B-lineage lymphomas. Proc. Natl. Acad. Sci. U.S.A. 86, 1168–1172 (1989). 245. Joazeiro, C. A. et al. The tyrosine kinase negative regulator c-Cbl as a RING-type, E2-dependent ubiquitin-protein ligase. Science 286, 309–312 (1999). 246. Waterman, H., Levkowitz, G., Alroy, I. & Yarden, Y. The RING finger of c-Cbl mediates desensitization of the epidermal growth factor receptor. The Journal of Biological Chemistry 274, 22151–22154 (1999). 247. Swaminathan, G. & Tsygankov, A. Y. The Cbl family proteins: ring leaders in regulation of cell signaling. J. Cell. Physiol. 209, 21–43 (2006). 248. Feshchenko, E. A. et al. TULA: an SH3- and UBA-containing protein that binds to c-Cbl and ubiquitin. Oncogene 23, 4690–4706 (2004). 249. Kowanetz, K. et al. Suppressors of T-cell receptor signaling Sts-1 and Sts-2 bind to Cbl and inhibit endocytosis of receptor tyrosine kinases. The Journal of Biological Chemistry 279, 32786–32795 (2004). 250. Lupher, M. L. et al. Cbl-mediated negative regulation of the Syk tyrosine kinase. A critical role for Cbl phosphotyrosine-binding domain binding to Syk phosphotyrosine 323. The Journal of Biological Chemistry 273, 35273–35281 (1998). 251.
Meng, W., Sawasdikosol, S., Burakoff, S. J. & Eck, M. J. Structure of the amino-terminal domain of Cbl complexed to its binding site on ZAP-70 kinase. Nature 398, 84–90 (1999). 252. Peschard, P., Ishiyama, N., Lin, T., Lipkowitz, S. & Park, M. A conserved DpYR motif in the juxtamembrane domain of the Met receptor family forms an atypical c-Cbl/Cbl-b tyrosine kinase binding domain binding site required for suppression of oncogenic activation. The Journal of Biological Chemistry 279, 29565–29571 (2004). 253. Levkowitz, G. et al. Ubiquitin ligase activity and tyrosine phosphorylation underlie suppression of growth factor signaling by c-Cbl/Sli-1. Molecular Cell 4, 1029–1040 (1999). 254. Thien, C. B., Walker, F. & Langdon, W. Y. RING finger mutations that abolish c-Cbl-directed polyubiquitination and downregulation of the EGF receptor are insufficient for cell transformation. Molecular Cell 7, 355–365 (2001). 255. Keane, M. M., Rivero-Lezcano, O. M., Mitchell, J. A., Robbins, K. C. & Lipkowitz, S. Cloning and characterization of cbl-b: a SH3 binding protein with homology to the c-cbl proto-oncogene. Oncogene 10, 2367–2377 (1995). 256. Andoniou, C. E., Thien, C. B. & Langdon, W. Y. The two major sites of cbl tyrosine phosphorylation in abl-transformed cells select the crkL SH2 domain. Oncogene 12, 1981–1989 (1996). 257. Hunter, S., Burton, E. A., Wu, S. C. & Anderson, S. M. Fyn associates with Cbl and phosphorylates tyrosine 731 in Cbl, a binding site for phosphatidylinositol 3-kinase. The Journal of Biological Chemistry 274, 2097–2106 (1999). 258. Grossmann, A. H. et al. Catalytic domains of tyrosine kinases determine the phosphorylation sites within c-Cbl. FEBS Letters 577, 555–562 (2004). 259. Kassenbrock, C. K. & Anderson, S. M. Regulation of ubiquitin protein ligase activity in c-Cbl by phosphorylation-induced conformational change and constitutive activation by tyrosine to glutamate point mutations. The Journal of Biological Chemistry 279, 28017–28027 (2004).
260. Levkowitz, G. et al. c-Cbl/Sli-1 regulates endocytic sorting and ubiquitination of the epidermal growth factor receptor. Genes Dev. 12, 3663–3674 (1998). 261. Haglund, K. et al. Multiple monoubiquitination of RTKs is sufficient for their endocytosis and degradation. Nat. Cell Biol. 5, 461–466 (2003). 262. de Melker, A. A., van der Horst, G., Calafat, J., Jansen, H. & Borst, J. c-Cbl ubiquitinates the EGF receptor at the plasma membrane and remains receptor associated throughout the endocytic route. Journal of Cell Science 114, 2167–2178 (2001). 263. de Melker, A. A., van der Horst, G. & Borst, J. Ubiquitin ligase activity of c-Cbl guides the epidermal growth factor receptor into clathrin-coated pits by two distinct modes of Eps15 recruitment. The Journal of Biological Chemistry 279, 55465–55473 (2004). 264. de Melker, A. A., van der Horst, G. & Borst, J. c-Cbl directs EGF receptors into an endocytic pathway that involves the ubiquitin-interacting motif of Eps15. Journal of Cell Science 117, 5001–5012 (2004). 265. Katzmann, D. J., Odorizzi, G. & Emr, S. D. Receptor downregulation and multivesicular-body sorting. Nat Rev Mol Cell Biol 3, 893–905 (2002). 266. Longva, K. E. et al. Ubiquitination and proteasomal activity is required for transport of the EGF receptor to inner membranes of multivesicular bodies. J. Cell Biol. 156, 843–854 (2002). 267. Waterman, H. et al. A mutant EGF-receptor defective in ubiquitylation and endocytosis unveils a role for Grb2 in negative signaling. EMBO J. 21, 303–313 (2002). 268. Jiang, X. & Sorkin, A. Epidermal growth factor receptor internalization through clathrin-coated pits requires Cbl RING finger and proline-rich domains but not receptor polyubiquitylation. Traffic 4, 529–543 (2003). 269. Jiang, X., Huang, F., Marusyk, A. & Sorkin, A. Grb2 regulates internalization of EGF receptors through clathrin-coated pits. Mol. Biol. Cell 14, 858–870 (2003). 270. Grishin, A. et al.
Involvement of Shc and Cbl-PI 3-kinase in Lyn-dependent proliferative signaling pathways for G-CSF. Oncogene 19, 97–105 (2000). 271. Nita-Lazar, A., Saito-Benz, H. & White, F. M. Quantitative phosphoproteomics by mass spectrometry: past, present, and future. Proteomics 8, 4433–4443 (2008). 272. Hunt, D. F., Yates, J. R., Shabanowitz, J., Winston, S. & Hauer, C. R. Protein sequencing by tandem mass spectrometry. Proc. Natl. Acad. Sci. U.S.A. 83, 6233–6237 (1986). 273. Bell, A. W. et al. A HUPO test sample study reveals common problems in mass spectrometry–based proteomics. Nat Meth 6, 423–430 (2009). 274. Siuzdak, G. Mass Spectrometry for Biotechnology. (Academic Press, 1996). 275. Makarov, A. Electrostatic Axially Harmonic Orbital Trapping: A High-Performance Technique of Mass Analysis. Anal. Chem. 72, 1156–1162 (2000). 276. Hu, Q. et al. The Orbitrap: a new mass spectrometer. J. Mass Spectrom. 40, 430–443 (2005). 277. Zubarev, R. A. & Makarov, A. Orbitrap Mass Spectrometry. Anal. Chem. 85, 5288–5296 (2013). 278. Liebler, D. C. & Zimmerman, L. J. Targeted quantification of proteins by mass spectrometry. Biochemistry 52, 3797–3806 (2013). 279. de Hoffman, E. & Stroobant, V. Mass Spectrometry. (John Wiley & Sons, Inc.). 280. Mann, M., Kulak, N. A., Nagaraj, N. & Cox, J. The Coming Age of Complete, Accurate, and Ubiquitous Proteomes. Molecular Cell 49, 583–590 (2013). 281. Robles, M. S., Cox, J. & Mann, M. In-Vivo Quantitative Proteomics Reveals a Key Contribution of Post-Transcriptional Mechanisms to the Circadian Regulation of Liver Metabolism. PLoS Genet 10, e1004047 (2014). 282. White, F. M. The potential cost of high-throughput proteomics. Sci Signal 4, pe8 (2011). 283. Wolf-Yadlin, A., Hautaniemi, S., Lauffenburger, D. A. & White, F. M. Multiple reaction monitoring for robust quantitative proteomic analysis of cellular signaling networks. Proc. Natl. Acad. Sci. U.S.A. 104, 5860–5865 (2007). 284. Klammer, A. A., Reynolds, S. M., Bilmes, J. A., MacCoss, M. J.
& Noble, W. S. Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification. Bioinformatics 24, i348–56 (2008). 285. Elias, J. E., Haas, W., Faherty, B. K. & Gygi, S. P. Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat Meth 2, 667–675 (2005). 286. Käll, L., Storey, J. D., MacCoss, M. J. & Noble, W. S. Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J. Proteome Res. 7, 29–34 (2008). 287. Inge de Dobbeleer, S. E. J. G. H. J. H. A. M. D. C. T. F. S. D. G. The Determination of Anabolic Steroids in Human Urine using the TSQ Quantum XLS. 1–4 (2012). 288. Lange, V., Picotti, P., Domon, B. & Aebersold, R. Selected reaction monitoring for quantitative proteomics: a tutorial. Mol Syst Biol 4, (2008). 289. Ross, P. L. Multiplexed Protein Quantification in Saccharomyces cerevisiae Using Amine-reactive Isobaric Tagging Reagents. Molecular & Cellular Proteomics 3, 1154–1169 (2004). 290. Thompson, A. et al. Tandem Mass Tags: A Novel Quantification Strategy for Comparative Analysis of Complex Protein Mixtures by MS/MS. Anal. Chem. 75, 1895–1904 (2003). 291. Dayon, L. & Sanchez, J.-C. in Methods in Molecular Biology 893, 115–127 (Humana Press, 2012). 292. Ow, S. Y., Salim, M., Noirel, J., Evans, C. & Wright, P. C. Minimising iTRAQ ratio compression through understanding LC-MS elution dependence and high-resolution HILIC fractionation. Proteomics 11, 2341–2346 (2011). 293. Mann, M. Functional and quantitative proteomics using SILAC. Nat Rev Mol Cell Biol 7, 952–958 (2006). 294. Wang, W. et al. Quantification of proteins and metabolites by mass spectrometry without isotopic labeling or spiked standards. Anal. Chem. 75, 4818–4826 (2003).

2 Computer Assisted Manual Validation (CAMV)

2.1 Introduction

Recent advances in mass spectrometry technologies have ushered in a new era of high-content, high-resolution proteomic datasets.
Acquisition of hundreds of thousands of tandem mass spectra (MS/MS spectra) in a single analysis is now routine, and by coupling high-speed data acquisition to sample pre-fractionation, millions of MS/MS spectra can be generated from the analysis of a single biological sample. Despite the technological advances that have enabled acquisition of these massive datasets, tools to accurately identify the peptides and post-translational modification (PTM) sites defined by these tandem mass spectra have evolved less rapidly. In a typical workflow, MS/MS spectra are searched against protein databases with database search algorithms such as Sequest [1], MASCOT [2], X!Tandem [3], or Andromeda [4] to generate putative peptide and/or PTM identifications for each MS/MS spectrum. Each of these algorithms relies on a scoring system that weights a variety of parameters, including the mass accuracy of the precursor ion m/z and the percentage and/or sequence of fragment ions matching the theoretical mass and fragmentation pattern of the putative peptide identification. Owing to a variety of factors, including the intensity, peptide sequence (including PTMs), complexity of the sample, and fragmentation method, MS/MS spectra vary greatly in quality, as defined by their signal-to-noise ratio and complexity. This variation in quality leads to a wide spread in the search algorithm scores for putative peptide matches. Currently, there are no set 'thresholds' or 'rules' for determining whether a particular peptide identification is correct, and each database search algorithm weights aspects of the identification differently. With potentially millions of MS/MS spectra per sample, the challenge of sorting through the putative identifications to determine the accuracy of each assignment is monumental, yet it is of utmost importance for correctly determining the components within the biological sample.
Figure 2-1: Sample complexity prevents identification of peptides from complex samples based solely on accurate precursor mass measurements. Precursor masses from all tryptic fragments from the Human 2009 database with charge states +2 to +4 that fall between m/z 350 and 1500 were considered. Precursor masses were binned into windows generated at four different resolutions. As the number of peptides per bin increases the percentage of total peptides accounted for increases. At 10 ppm a vanishingly small percentage of peptides are the sole occupant of their bin, making it nearly impossible to accurately identify peptides based on their precursor mass alone. At 1 ppm this figure is improved to roughly 5%. This problem is exacerbated when post-translational modifications and missed or non-tryptic cleavages are included.

The difficulty of accurately identifying a peptide defined by a given tandem mass spectrum can be exemplified when one considers how much weight should be given to mass accuracy of the measured precursor ion m/z compared to the theoretical m/z of the putative peptide. As the accuracy of the measured mass improves, the number of potential peptides from a given database matching to that mass decreases significantly. However, mass accuracy alone is typically not sufficient for identification. Figure 2-1 illustrates the issue of relying solely on peptide precursor mass to confirm identification. All tryptic fragments from proteins in the Human 2009 proteome database were in-silico digested and binned based on different accuracies. At 10 ppm, very few peptides are the sole occupant of their m/z bin. At 1 ppm, roughly 5% of peptides are uniquely identifiable from their precursor m/z.
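The binning analysis behind Figure 2-1 can be illustrated with a minimal Python sketch. It is not the code used to produce the figure: the function names, the log-spaced bin construction, and the toy masses are assumptions made here for illustration. Only the charge states (+2 to +4), the m/z range (350 to 1500), and the idea of counting sole occupants of ppm-wide bins come from the text.

```python
import math
from collections import Counter

PROTON = 1.007276  # proton mass in Da (standard value)

def precursor_mz(neutral_mass, charges=(2, 3, 4), lo=350.0, hi=1500.0):
    """m/z values of one peptide for the considered charge states and m/z range."""
    out = []
    for z in charges:
        mz = (neutral_mass + z * PROTON) / z
        if lo <= mz <= hi:
            out.append(mz)
    return out

def unique_fraction(mz_values, ppm):
    """Fraction of m/z values that are the sole occupant of their bin.

    Assumption: bins have constant relative width (ppm), so they are
    equally spaced in log(m/z).
    """
    width = math.log1p(ppm * 1e-6)
    bins = [math.floor(math.log(mz) / width) for mz in mz_values]
    counts = Counter(bins)
    alone = sum(1 for b in bins if counts[b] == 1)
    return alone / len(bins)

# Toy input (hypothetical): two isobaric peptides plus one well-separated peptide.
toy = [500.0, 500.0, 800.0]
print(unique_fraction(toy, ppm=10))
```

Applied to a full in-silico digest, the same function evaluated at several ppm values would give the kind of sole-occupant percentages quoted above.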
The problem becomes more daunting when the database increases in complexity to reflect the complexity found in biological samples: the database should contain missed cleavages, non-tryptic cleavages, and at least the dozen most common of the several hundred potential post-translational or chemical modifications, as all of these are realistic possibilities for any given peptide. Searches performed against a database of this size and complexity would require massive computational resources. Search algorithms fight an uphill battle against combinatorial explosion as they seek to balance runtime considerations against erroneous exclusion of relevant peptides. Unfortunately, searching against an incomplete database can lead to false positive identifications, simply because the true assignments are not contained within the search space, and therefore the next best match will automatically be reported. Distinguishing between high-scoring false positives associated with the 'next best match' and true positives is critical, especially given the ultimate goal of utilizing these peptide and PTM assignments to inform biological experimental design [5].

2.1.1 Statistical Approaches to Assessing Quality

Currently, the most common approaches to assessing the validity of a given set of peptide assignments are based on statistical analyses of the likelihood of incorrect assignments, calculated as either a false-positive rate or a false-discovery rate (FDR). In this approach, the MS/MS spectra are searched against a forward database and also against a decoy (reversed or scrambled) protein database [6]. The score threshold that separates accepted from rejected peptide assignments is then adjusted to achieve a pre-determined false discovery rate, defined as the ratio of the number of matches in the randomized decoy database to the number of matches in the target database.
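As a minimal sketch of the target-decoy calculation just described, the following Python functions compute the FDR at a score threshold and search for the lowest threshold that meets a chosen FDR. The function names and the example scores are hypothetical; the FDR definition (decoy matches divided by target matches above the threshold) follows the text.

```python
def fdr_at(threshold, target_scores, decoy_scores):
    """FDR estimate: decoy matches / target matches at or above the threshold."""
    targets = sum(s >= threshold for s in target_scores)
    decoys = sum(s >= threshold for s in decoy_scores)
    return decoys / targets if targets else 0.0

def lowest_threshold(target_scores, decoy_scores, max_fdr=0.01):
    """Smallest observed target score whose FDR does not exceed max_fdr."""
    for t in sorted(set(target_scores)):
        if fdr_at(t, target_scores, decoy_scores) <= max_fdr:
            return t
    return None  # no threshold satisfies the requested FDR

# Hypothetical search scores for illustration only.
targets = [10, 20, 30, 40, 50]
decoys = [12, 22]
print(fdr_at(20, targets, decoys))          # 1 decoy / 4 targets
print(lowest_threshold(targets, decoys))    # first score with no decoy at or above it
```

Note that this sketch treats every spectrum as an independent test, which is a simplification.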
While this approach can be used to rapidly assess the quality of the overall set of peptide assignments, there are several factors that need to be considered to accurately calculate the FDR. For instance, construction of an appropriate decoy database is crucial, as the distribution of peptide lengths, amino acid composition, and motif prevalence must be tuned to match that of the species database and the specific enrichment experiment performed. In addition, replicate identifications can artificially deflate the FDR if they are considered as independent tests. Several dozen MS/MS spectra might be generated for an abundant species eluting from the chromatography column (these replicate spectra are the basis for the label-free spectral counting quantitative approach); it is expected that each of these spectra will match to the same peptide sequence in the forward database. Since these MS/MS spectra are effectively replicates of the same MS/MS spectrum, if one of the spectra does not match to the decoy database, then it is likely that none of the spectra will match to the decoy database. Instead of counting these as several dozen independent hits with no decoy hits, which corresponds to a low FDR, this set of data should be counted as one hit with no decoy hit. Considering these factors should significantly improve the accuracy of the FDR-based statistical estimate of global data quality.

It is worth noting that the FDR-based statistical approach only provides a global quality metric and fails to identify which assignments are true positives and which are false positives, leaving doubt about the validity of any given peptide assignment in the dataset. In fact, it is only on further manual inspection that the quality of each peptide assignment becomes evident, even for MS/MS spectra with similar scores for given assignments. For instance, two peptide identifications with similar MASCOT scores are presented in Figure 2-2.
The confidence in the accuracy of the assignment is much higher for the spectrum in Figure 2-2a, as almost all fragment ions match to the expected theoretical fragment ions from the assigned peptide. By comparison, in Figure 2-2b there are multiple intense ions that do not correspond to any of the typical theoretical fragment ions for the given peptide assignment, and therefore this MS/MS spectrum is likely to represent either an incorrect assignment or potentially a 'contaminated' spectrum resulting from the simultaneous isolation and fragmentation of multiple ions. In most cases, the database score reflects the number of matched fragment ions, but does not consider the number of unmatched abundant ions, as can be seen by the varied percentage of unmatched ions for similar database scores in Figure 2-3. Based on this analysis, it appears that any global score threshold will automatically include low-confidence identifications, defined as spectra with a fair number of unassigned abundant fragment ions.

Figure 2-2: MASCOT score only provides a rough estimate of the quality of the spectral assignment. Two scans with comparable MASCOT scores were selected: (a) MASCOT score 27.12 and (b) MASCOT score 26.7. Note the significant disparity in the number of unassigned peaks in the two spectra.

Figure 2-3: Variation in percentage of unassigned peaks for a given MASCOT score can also be visualized at the dataset level. For this analysis, all peptide matches with MASCOT score above 25 from a representative phosphotyrosine-enriched analysis were included. In a given spectrum, all of the peaks above ten percent of maximum intensity, or above 2.5 percent in a sparsely populated region within a +/- 25 m/z window, were included for consideration.

An alternative to the FDR approach is to use machine learning-based techniques to automate the validation process.
A decision tree validation scheme has been shown to reduce the FDR, yet still relies on searches against a general decoy database [7]. More recently, a hybrid Support Vector Machine (SVM)/Dynamic Bayes Network (DBN) approach was used to classify MS/MS data, and was shown to increase positive identifications in 1% FDR search results [8]. To circumvent the need for large amounts of training data for classification methods, another approach is to create a rule-based framework where prominent fragments are predicted based on expert criteria [9], although codifying experiential human knowledge still limits results to those peptides that match prescribed criteria, therefore hindering generalization to peptides with different PTMs. The "expect" score in MASCOT provides another, peptide sequence specific, alternative to the FDR. This score reflects the probability that a peptide assignment with a given MASCOT score would occur by chance, taking into account the length of the peptide along with the sequences of other peptides in the database to judge the likelihood of proper assignment.

Figure 2-4: Examples of spectra that were ultimately accepted into the data set. a) Exemplar green identification where the user and the computer picked the same assignment. b) Exemplar blue identification where the user rescued an identification initially excluded by the computer for having a Mascot score below 25.

Figure 2-5: Exemplar yellow identification where the user changed the identification made by the computer (a) to (b). The decision was made based on prior knowledge of the sample preparation conditions, which included a pY immunoprecipitation.

Figure 2-6: Exemplar red identification where the user rejected an identification initially included by the computer.

A number of algorithmic approaches have been proposed to automate the proper localization of PTMs, one of the key factors in the quality of an assignment.
Among the most widely used of these algorithms, ASCORE defines the PTM site(s) by assignment likelihood based on key fragment ion peaks that differentiate multiple putative localizations [10]. PhosphoRS uses a similar scheme based on more of the ions in the spectrum and has been applied to data sets from a more varied selection of instruments [11]. Mascot Delta Score (MDScore) leverages the probabilistic calculations included in the Mascot score [2] and bases localization confidence on the difference between the Mascot scores for the leading putative assignments. MDScore has been shown to achieve results comparable to ASCORE for localizing tyrosine phosphorylation [12]. Each of these algorithms is based on the principle that the PTM site assignment that matches the most peaks is correct; however, localization to one out of several closely spaced residues is often difficult, regardless of the algorithm. In these cases, the user can leverage prior knowledge of a site's biological relevance, a protein's sequence homology, and the purification procedures employed during sample preparation to assist in defining the most likely assignment of a given site (Figure 2-5).

From a larger perspective, all of these automated approaches face the difficult task of limiting false positives while also limiting false negatives, thereby yielding the largest dataset with the fewest incorrect assignments. The balance between false positives and false negatives must be carefully considered, as the potential cost of false positives can be very significant: false positives may distract research efforts and mislead experimental design, potentially costing years of wasted effort [5].

2.2 Manual Validation of MS/MS Data

Despite the dilemma presented by the specter of false positives and the need to rapidly validate large numbers of peptide assignments, the tools available to more rigorously analyze a dataset have been limited.
The gold standard for verifying MS/MS peptide identifications is manual validation, where precursor and fragment mass observations are manually evaluated against a theoretical fragment ion spectrum from the search algorithm assignment [13]. Further confirmation can be achieved with synthesis of the putative peptide and chromatography co-elution experiments, although these additional steps incur significantly more cost and effort and are therefore typically reserved for the most interesting peptide matches [14]. Through manual validation, the intensity of particular fragment ions, coupled with the presence, absence, and missed assignment of other fragment ions, can be evaluated by mass spectrometry 'experts' to assess the strength of the assignment. While there may be some variation in the particular implementation of the manual validation process, efforts have been made to codify the process to ensure consistency. Key objectives for the validation process (which peaks must be attributable to the sequence, expected relative intensity of peaks for key fragments, PTM localization, and criteria for exclusion) have been nicely summarized in previous works [13]. Correct peptide sequences typically have all fragment ions over 10% of the base peak intensity assigned to a specific fragmentation site in the peptide. As seen in Figure 2-2b, unassigned fragment ions in the MS/MS spectrum suggest either an incorrect sequence or a mixed spectrum, both of which result in decreased confidence in the accuracy of the assignment.

Manual validation can be tedious and time-consuming, as tens of significant peaks from each MS/MS spectrum must be individually aligned and matched to the mass of a known fragment derived from the proposed peptide.
When applied to a large-scale dataset, this approach can take weeks, if not months, to individually assess each of the thousands of MS/MS spectra, with most of the time spent meticulously comparing lists of numbers and relatively little time required for the final decision as to whether one should include the assignment in the dataset. There is a clear disconnect between the speed with which millions of spectra can be acquired and the time required for spectral validation.

How, then, to increase the speed of the manual validation process without sacrificing the level of rigor and confidence in each peptide identification? As it turns out, many of the tasks associated with manual validation can be assisted by computer automation, leaving the task of approval of a particular peptide assignment to a human decision while expediting many of the tedious tasks of collecting relevant MS/MS spectra from a particular analysis and calculating the predicted fragment ions of a particular peptide. Here, we describe a Computer-Aided Manual Validation (CAMV) package that mitigates the time-consuming portions of the validation task and presents the relevant information in a streamlined format, allowing the user to rapidly judge the accuracy and quality of database identifications. Application of CAMV to a given LC-MS/MS analysis leads to a high-confidence set of peptide identifications and a concise summary of any quantitative information from iTRAQ or SILAC, all of which can be accomplished in hours instead of days or weeks.

2.3 Computer Aided Manual Validation Package

We have produced a computer-aided validation pipeline that expedites the validation process without removing human judgment, helping to address the disconnect between manual validation and high-throughput data generation while still maintaining data quality. CAMV loads the MS/MS scans along with the putative assignment from the search engine.
Fragment ions are automatically labeled based on the sequence assignment and according to a scoring rubric, which has been developed and tested to assign the most likely labels to each fragment. To assist the user in assessing the quality of the peptide assignment, additional information is provided in the output of CAMV, including color-coded peak labeling, a magnified view of the mass-to-charge range of the MS scan around the precursor ion, and a magnified view of the MS/MS scan around the iTRAQ marker ion mass range (Figure 2-7). Color-coded peak labels allow the user to rapidly identify unlabeled or mislabeled peaks, both of which are particularly valuable when confirming peptide sequence assignment or comparing multiple PTM localizations within a given peptide sequence. Through this overall design, the various aspects of manual validation are apportioned to the most qualified entity: CAMV performs the most tedious and time-consuming tasks associated with peak labeling, and the user is presented with the most relevant information to quickly make the correct decision. Once an analysis has been user-verified, publication-ready figures and spreadsheets containing the appropriate quantitative data can be generated from accepted assignments.

Figure 2-7: The Graphical User Interface (GUI) for the MATLAB implementation of CAMV. The left tree contains a searchable list of protein hits in order of decreasing MASCOT score, each peptide assignment made to that protein, and all possible combinations of PTMs for each scan. The middle panel contains the MS2 scan data with pre-labeled peaks. On the right is a survey of the precursor window of the MS1 scan and a view of any quantification information associated with the assignment. The peptide ladder at the top summarizes the sequencing information: red for an identified fragment, black for missing.
A peak color-coding scheme allows for rapid surveillance of the quality of the match: within tolerance (green), within 1.5x tolerance (magenta), unmatched peak (red), and isotopic peak (yellow). Peptides which are pre-emptively excluded by the software appear in red on the left-hand tree. For these peptides, the reason for pre-emptive exclusion is displayed at the top left of the MS2 panel in place of the sequence ladder, along with an option to proceed with processing.

2.3.1.1 Data Preprocessing

The typical workflow for CAMV analysis is represented schematically in Figure 2-8. Here we describe the current configuration, although in most cases the specific software tool embedded in the package for a given task could be replaced with an alternate option. Initially, the raw mass spectrometry data file is converted into Mascot Generic Format (MGF) through DTA Supercharge, which de-isotopes the MS data files, converting precursor masses to the mono-isotopic masses. The MGF file is then searched with MASCOT against the appropriate database, with the associated post-translational modification options. The results of the MASCOT search are harvested as an XML file with the input query data included, enabling one to match the MGF file query number to the raw file scan number. To access the scan information in the MS raw data file, an mzXML file is produced using 'msconvert' from the Proteowizard package [15].

Figure 2-8: Peptide Sequence Identification and Validation Workflow. Raw data from the instrument is converted with DTA Supercharge into an MGF file that is searchable in MASCOT [2]. In parallel, the Proteowizard [15] package is used to convert the RAW file into a format where the scan information is readily accessible.
These two paths converge on the Validation Software, which matches the search results with the relevant scan data in preparation for user verification. Once complete, figures of validated scans are exported as PDFs and quantification information as a spreadsheet.

2.3.1.2 PTM Identification

Search algorithm (e.g. MASCOT) results are retrieved as an XML file containing global information about fixed and variable modifications used in the search. Fixed modifications are applied to all instances of a given residue, while variable modifications may or may not be present on any instance of a given residue. Scan-specific modifications to be applied to a particular MASCOT identification are included in the XML file and are taken into account when generating fragment masses; terminal fixed modifications are applied to every peptide.

2.3.1.3 Modified Sequence Generation

Most of the current search algorithms perform well at assigning the base peptide sequence and the appropriate number of modifications, but perform poorly at determining the correct site-specific localization of each modification. This poor performance is due to the number and similarity of the various theoretical fragmentation patterns: for each amino acid sequence and set of variable modifications, several permutations may exist, and any two of them are typically distinguished by differences in as few as two fragment ions. To facilitate this PTM localization task, CAMV generates all permissible combinations with a recursive search.
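A recursive enumeration of this kind can be sketched in a few lines of Python. This is an illustrative sketch only, not the MATLAB code in CAMV; the function names, the single-modification simplification, and the lowercase rendering of modified residues are assumptions made here.

```python
def place_mods(sequence, residues, n_mods, start=0):
    """Recursively list every placement of n_mods modifications.

    sequence : peptide sequence, one letter per residue
    residues : letters eligible for the modification (e.g. "STY")
    Returns a list of tuples of 0-based positions.
    """
    if n_mods == 0:
        return [()]
    results = []
    for i in range(start, len(sequence)):
        if sequence[i] in residues:
            # Fix a modification at position i, then place the rest to its right.
            for rest in place_mods(sequence, residues, n_mods - 1, i + 1):
                results.append((i,) + rest)
    return results

def render(sequence, positions):
    """Show modified residues in lowercase (an illustrative convention)."""
    return "".join(c.lower() if i in positions else c
                   for i, c in enumerate(sequence))

# Hypothetical peptide with one phosphorylation on S, T or Y.
for combo in place_mods("ASYTK", "STY", 1):
    print(render("ASYTK", combo))
```

Each returned placement corresponds to one candidate localization for which theoretical fragment ions would then be generated. Handling several modification types at once would require nesting this search per modification, which is omitted here.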
Although the system is easily modifiable to include additional modifications, in the current configuration, CAMV handles the following modifications:

Fixed:
• iTRAQ labeling
• Cysteine Carbamidomethylation

Variable:
• Serine, Threonine, and Tyrosine Phosphorylation
• Lysine Acetylation (concurrently with iTRAQ)
• Methionine Oxidation
• Light, Medium, and Heavy SILAC Labeling of Arginine and Lysine (not concurrently with Lysine Acetylation or iTRAQ)

2.3.1.4 Theoretical Fragment Ion Generation

For each candidate permutation of variable modifications, the full set of theoretical fragment ion masses is generated. To calculate these masses, the N-terminal mass is determined based on the type of iTRAQ (e.g. none, 4-plex, 8-plex) applied to the sample. Each residue is added to the N-terminus in the proper order, with all resulting b-ion masses recorded. With the full sequence, the precursor ion m/z values of charge states 1-5 are calculated and stored. Next, all permissible combinations of neutral losses from each b-ion are calculated and stored. Permissible losses include: H2O, NH3, H3PO4 (from pSer or pThr), HPO3 and HPO3+H2O (from pTyr), and SOCH3 (from carbamidomethyl C). This list can be expanded as needed, but it is important not to include unlikely losses, which can lead to over-fitting of the data (i.e. when unlikely fragment ions are assigned to the peaks, it becomes more difficult to differentiate false positives from true positives). Each b-ion and loss is calculated for charge states up to and including the charge state of the precursor. Each b-ion and loss also produces an a-ion fragment with the loss of CO. The process is repeated from the C-terminus of the peptide to produce y-ions and the corresponding losses. Additional frequently observed fragment ions are included: 216.04 (for pTyr) and bn+86 (for 8-plex iTRAQ).
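The mass bookkeeping described in this section can be sketched as follows. This is a simplified Python illustration rather than CAMV's implementation: the names are hypothetical, only a handful of residues are tabulated, the masses are standard monoisotopic values rather than values taken from this work, and neutral losses and label-specific ions are left out. The `n_term` argument stands in for the mass of any N-terminal label.

```python
# Standard monoisotopic residue masses in Da (small subset, for illustration).
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
        "T": 101.04768, "K": 128.09496, "Y": 163.06333}
PROTON, WATER, CO = 1.007276, 18.010565, 27.994915
PHOSPHO = 79.966331  # HPO3 added by phosphorylation

def residue_masses(sequence, mods=None):
    """Residue masses with optional {position: delta_mass} modifications."""
    mods = mods or {}
    return [MONO[aa] + mods.get(i, 0.0) for i, aa in enumerate(sequence)]

def series(masses, offset, charge):
    """Cumulative fragment m/z values for one ion series at one charge state."""
    out, running = [], offset
    for m in masses:
        running += m
        out.append((running + charge * PROTON) / charge)
    return out

def fragments(sequence, mods=None, n_term=0.0, charge=1):
    """b-, y- and a-ion m/z values (a-ions are b-ions minus CO)."""
    masses = residue_masses(sequence, mods)
    b = series(masses[:-1], n_term, charge)      # built from the N-terminus
    y = series(masses[:0:-1], WATER, charge)     # built from the C-terminus
    a = [mz - CO / charge for mz in b]
    return {"b": b, "y": y, "a": a}

def precursor_mz(sequence, mods=None, n_term=0.0, max_charge=5):
    """Precursor m/z for charge states 1..max_charge."""
    total = sum(residue_masses(sequence, mods)) + n_term + WATER
    return {z: (total + z * PROTON) / z for z in range(1, max_charge + 1)}

# Hypothetical tripeptide with a phosphotyrosine at position 2 (0-based).
print(fragments("GAY", mods={2: PHOSPHO}))
print(precursor_mz("GAY", mods={2: PHOSPHO}))
```

Neutral-loss variants would be derived from each of these values by subtracting the loss mass divided by the charge, for each permissible combination of losses.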
2.3.1.5 Fragment Alignment

CAMV attempts to assign a label to all peaks in the MS/MS spectra whose intensity is greater than 10% of the base peak intensity. To probe further into sparse regions of the MS/MS spectrum, label assignments are attempted on all peaks whose intensity exceeds 2.5% of the base peak intensity if they are a local maximum in a sparsely populated region (e.g. +/- 25 m/z). The accuracy of the fragment ion matching thresholds can be adjusted for high- vs. low-resolution fragment ion spectra. The current configuration is set for validation of low-resolution, linear ion trap CID spectra. With this setting, predicted fragment masses that are within 1000 ppm (0.1% of the m/z) of the observed peak are candidate matches that are denoted with a label and a green asterisk. Predicted fragment ion masses that are within 1500 ppm (0.15% of the m/z) of the observed peak may also be labeled and accompanied by a magenta asterisk to indicate the lower confidence assignment. Isotope peaks of an identified peak are labeled with a yellow asterisk. It is worth noting that it is straightforward to change the thresholds for these matches to accommodate high mass accuracy, high resolution MS/MS data from a time-of-flight or orbitrap mass analyzer.

Peaks in the MS/MS spectrum may match to multiple different theoretical fragment ions. To highlight the most likely match, we have developed a scoring rubric; the optimal label for a given peak is the candidate with the highest score under the following rubric:
• +12 for precursor fragments
• +10 for b- or y-series ions
• -1 for each neutral loss
• +0.5 for closest match to the observed mass

With this rubric, we have tried to capture the most commonly occurring fragment ions associated with correct peptide identifications. Depending on the collision energy used to drive fragmentation, the most abundant ions in the MS/MS spectrum are typically those associated with fragmentation of the peptide backbone (e.g. y- and b-type ions) and neutral losses from the precursor ion. These candidates are therefore given the highest scores in the above rubric. Each additional neutral loss from these main fragmentation events is less likely, so fewer points are awarded to these candidate fragment ion labels. Effectively, if the peptide assignment is correct, then an abundant ion in the MS/MS spectrum is more likely to be a b- or y-type ion than to be a b- or y-type ion that has undergone 4 or 5 neutral losses. In the current configuration of CAMV, the total points associated with a spectrum are not considered in the final decision as to the quality of the peptide assignment. However, it is conceivable that the number of high-scoring vs. low-scoring fragment ion assignments could be reported as another metric to assist the user in their evaluation of each spectrum. Further color-coding to separate high- and low-scoring peaks is another option that could be considered in future versions of CAMV.

From an extensive amount of manual and computational analysis of true vs. false positives, the number of unlabeled abundant fragment ions is often critical in identifying false positives [16]. Therefore unlabeled abundant fragment ions, defined as observed peaks that do not have any predicted fragments within 1500 ppm, are labeled with a red circle.

2.3.1.6 Label Renaming

Since the scoring rubric does not consider all factors in assigning a label to a given peak in the MS/MS spectrum, the user may change the assigned label. Clicking on an assigned label will display a list of predicted fragment ions that match within tolerance, allowing another label to be chosen. It is also possible to select a label outside of the tolerance window, or outside of the user-defined fragmentation options listed above. This user-defined label will be applied to the peak, but no mass will be calculated and it will be shown with a black asterisk.
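The peak selection, tolerance classes, and scoring rubric of Section 2.3.1.5 can be summarized in a short Python sketch. It is a reading of the text, not the CAMV source: the names are hypothetical, "sparsely populated region" is interpreted here as "no more intense peak within +/- 25 m/z", ppm error is taken relative to the predicted m/z, and isotope-peak labeling is omitted.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    label: str         # e.g. "y3" or "b4-H2O"
    mz: float          # predicted m/z
    kind: str          # "precursor", "by", or anything else
    n_losses: int = 0  # number of neutral losses in this candidate

def select_peaks(peaks, main=0.10, sparse=0.025, window=25.0):
    """Keep peaks above 10% of base peak, or above 2.5% if locally dominant."""
    base = max(inten for _, inten in peaks)
    chosen = []
    for mz, inten in peaks:
        if inten > main * base:
            chosen.append((mz, inten))
        elif inten > sparse * base:
            nearby = [i for m, i in peaks if abs(m - mz) <= window]
            if inten >= max(nearby):  # local maximum within the window
                chosen.append((mz, inten))
    return chosen

def ppm_error(observed, predicted):
    return abs(observed - predicted) / predicted * 1e6

def rubric_score(cand, is_closest):
    """+12 precursor, +10 b/y, -1 per neutral loss, +0.5 for closest match."""
    score = {"precursor": 12.0, "by": 10.0}.get(cand.kind, 0.0)
    score -= cand.n_losses
    return score + (0.5 if is_closest else 0.0)

def label_peak(observed_mz, candidates, tol=1000.0, wide=1500.0):
    """Return (label, colour): green within tol, magenta within wide, red if unmatched."""
    near = [c for c in candidates if ppm_error(observed_mz, c.mz) <= wide]
    if not near:
        return None, "red"
    closest = min(near, key=lambda c: ppm_error(observed_mz, c.mz))
    best = max(near, key=lambda c: rubric_score(c, c is closest))
    colour = "green" if ppm_error(observed_mz, best.mz) <= tol else "magenta"
    return best.label, colour

# Hypothetical candidates competing for one observed peak.
cands = [Candidate("y3", 400.20, "by"), Candidate("b4-H2O", 400.05, "by", 1)]
print(label_peak(400.10, cands))
```

In this example the unmodified y-ion wins even though the neutral-loss candidate lies closer in mass, because the 0.5-point closeness bonus does not outweigh the 1-point penalty for a neutral loss.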
2.3.1.7 Quantitative Accuracy and MS Scan Windows

Importantly, CAMV also allows the user to assess the quantitative accuracy of isotopic labeling strategies. For instance, by providing an image of the MS spectrum in the m/z region of the precursor ion isolation window (right side, Figure 2-7), users can determine whether SILAC peaks might have overlapping contaminant peaks, as can occur in complex mixtures, and then select whether the quantification of this pair should be included in the final dataset. On a similar note, for iTRAQ quantification, the user can determine whether another ion above a given intensity threshold was present in the isolation window and may therefore have altered the iTRAQ marker quantification values. The size of the isolation window highlighted by the gray box in this image can be adjusted by the user according to the settings used for data collection. The image of the MS spectrum in the m/z region of the precursor ion isolation window also allows the user to evaluate whether the charge state of the precursor ion was assigned correctly, an important consideration for low-level precursor ions in complex mixtures.

2.3.1.8 Validating Spectra

The tree on the left-hand side of the GUI contains proteins rank-ordered by their respective MASCOT scores and peptides for each protein listed alphabetically (first) and numerically (second) in order of the MS/MS spectrum match (Figure 2-7). For each peptide matched to an MS/MS spectrum, nested under each entry is an assignment with the appropriate number of modifications, initially marked by a filled gray circle. After evaluating the peptide assignment based on all of the above criteria, the user may select from one of three options located on the bottom right of the MS2 window: “Accept”, “Maybe”, and “Reject”.
The choice is recorded, so that the peptide assignments can be sorted into lists for future reference, and displayed graphically by changing the color of the filled circle in front of the assignment, allowing the user to keep track of their progress. A search feature is included to locate peptide identifications based on scan number or protein name.

2.3.1.9 Exporting Validated Spectra

Two buttons on the bottom right of the GUI export PDF figures. “Print Accept List” will print all accepted assignments to the “output\run_name\accept\” folder. Similarly, “Print Maybe List” does the same for all assignments marked “Maybe”.

2.3.1.10 Exporting Quantification

iTRAQ or SILAC quantification for all peptides marked “Accept” will be added to an Excel spreadsheet in the “output\run_name\” folder when the “Print Accept List” button is pressed.

2.3.1.11 Preemptive Exclusion

To facilitate manual validation, many spectra are pre-emptively removed from peak assignment to reduce the time spent on enumerating and preprocessing large numbers of sequences that will never be accepted. The criteria for pre-emptive exclusion include:
• Excessively long peptide sequence
• Low database score
• Excessive number of PTM permutations
• Excessive number of peaks to identify
• Incomplete SILAC labeling
• Poor MS1 data
• Contaminated precursor window (user defined)

The heuristics for most of these criteria are user-definable in CAMV, and can be altered depending on the experimental conditions. For instance, longer peptides are produced by proteolytic digestion with some enzymes, and therefore the ‘excessively long peptide sequence’ and ‘excessive number of peaks to identify’ settings might need to be adjusted. An important feature of CAMV is that these pre-emptively removed spectra are still listed in the output (see Figure 2-7), along with the reason(s) why the spectra were not processed.
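A pre-emptive exclusion step of this kind, which records the reasons instead of silently dropping a spectrum, could look like the following Python sketch. The function name, the dictionary fields, and all threshold values other than the Mascot score cutoff of 25 are hypothetical placeholders; only four of the listed criteria are shown.

```python
def exclusion_reasons(hit, max_length=40, min_score=25.0,
                      max_permutations=50, max_peaks=60):
    """Return the list of reasons a search hit is pre-emptively excluded.

    An empty list means the hit proceeds to peak labeling. Thresholds are
    keyword arguments so they can be tuned per experiment.
    """
    reasons = []
    if len(hit["sequence"]) > max_length:
        reasons.append("excessively long peptide sequence")
    if hit["score"] < min_score:
        reasons.append("low database score")
    if hit["n_permutations"] > max_permutations:
        reasons.append("excessive number of PTM permutations")
    if hit["n_peaks"] > max_peaks:
        reasons.append("excessive number of peaks to identify")
    return reasons

# Hypothetical search hit for illustration.
hit = {"sequence": "AAYK", "score": 20.0, "n_permutations": 3, "n_peaks": 10}
print(exclusion_reasons(hit))
```

Keeping the reasons alongside each excluded hit is what makes it possible to show them in the interface and let the user override the decision.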
For each of these spectra, the user can click the link, evaluate the reason for removal, and request processing of the spectra. The spectra and peak labels are then displayed for manual validation.

2.3.2 Exemplary Application to Tyrosine Phosphorylation Analysis

To evaluate the application of CAMV to facilitate manual validation of MS/MS spectral assignments, we selected an example data set that had already been manually curated, thereby providing a benchmark in terms of MS/MS spectral assignment and speed of analysis. The particular mass spectrometry dataset chosen was a quantitative analysis of tyrosine phosphorylation in glioblastoma patient tumor xenografts, where accuracy in terms of phosphorylation site identification and quantification was critical for determining the signaling pathways in these tumors [17]. Since the sample preparation involves a 2-stage enrichment for tyrosine phosphorylated peptides from a small amount of starting material, the sample complexity is fairly low, and the total number of MS/MS spectral assignments above a given threshold is fewer than one thousand.

As expected, CAMV radically accelerated the process while minimally impacting the quality of the final dataset. In fact, the main benefit to the user is a drastic reduction in the time necessary to validate an analysis. Previously, the manual validation process would require days to weeks depending on the complexity of the sample. With CAMV, the preprocessing step (not including database search) takes less than an hour; often, it is only a matter of minutes for phosphotyrosine-enriched iTRAQ-labeled or SILAC-encoded samples. The user-friendly interactive interface with color-coded peak labels makes decision-making on the various spectra fairly straightforward, and the total time required for the analysis was reduced from days or weeks to hours.
Although hours of user time are still needed to validate the data, the result is a high confidence dataset with minimal false positives, and virtually all of the user interaction time is focused on decision-making rather than tedious table comparisons and arithmetic calculation.

Figure 2-9: All peptide matches following user validation of the dataset. Color code: agreement between algorithm and user, 201/317 (green); alternate identification chosen by user, 4/317 (yellow); additional assignment included by user, 79/317 (blue); and assignment rejected by user, 33/317 (red).

Figure 2-10: Distribution of user and CAMV decisions versus MASCOT score. Note that in many cases the user rescued the spectra that were pre-emptively excluded by the software, while in other cases the user rejected the spectral assignment based on the poor quality of the match.

The statistics from the final data set resulting from user validation assisted by CAMV are highlighted in Figure 2-9. The final set contained 284 scans, 201 of which passed all of the internal criteria and were also selected as high confidence assignments by the user (Figure 2-9 green, also see Figure 2-4a). Of the 284 spectra in the final data set, the algorithm pre-emptively excluded 79 scans based on one or more of the criteria listed above (Figure 2-10 blue, also see Figure 2-4b). For many of these 79 spectra, the individual scan Mascot score was below 25, or there were too many peaks to be identified. The heuristics underlying the pre-emptive exclusion decision are user-definable and can be altered in the future to tailor the number of exclusions based on experimental conditions and dataset quality. Because the GUI lists both accepted and pre-emptively excluded assignments, it was fairly straightforward to rapidly check the quality of the excluded spectra and rescue those that were high-confidence matches.
These false negatives were reintroduced to the dataset following user validation. Four scans (Figure 2-9 yellow, also see Figure 2-5) were assigned to an alternate PTM state by the user; although technically not false positives, the site of modification was incorrectly assigned by our scoring metric. Additionally, there were 33 false positive identifications (Figure 2-9 red, also see Figure 2-6) where the user did not find a sufficient match to the sequence identified by MASCOT and therefore removed the spectra from the final dataset. The breakdown of accepted and rejected MS/MS spectra versus MASCOT score is shown in Figure 2-10 for both the user and the CAMV software decisions. Note that in each bin some spectra that were initially rejected were rescued by the user, and some spectra that would have been accepted based on a score threshold were rejected by the user during the manual validation process. At the end of the analysis, despite a fairly rigorous initial threshold, validation with CAMV required less than a day and removed approximately 10% of the assignments as false positives, based on low confidence in the MASCOT-identified spectral assignments.

2.4 Conclusion

CAMV is a software package to aid the manual validation of MS/MS peptide identification data. This software package has drastically improved a tedious and time-consuming task that vastly exceeded the sample analysis time, creating a backlog in the workflow of many projects. By partially automating this process, we have shifted the human focus to the decision-making portion of the task, allowing user judgment to be rapidly applied. We hope that CAMV will provide researchers with a streamlined way to perform their manual validation without having to rely solely upon false discovery rates when analyzing and reporting their data.
2.5 Appendices

2.5.1 Package Availability

The CAMV Software Package is freely available at http://web.mit.edu/icbp/data/

2.5.2 Version Information

All analyses were performed with the following software versions:

Proteowizard: Release 2.0.1749 (used for included data), 3.0.4323 (update included with CAMV distribution)
Thermo XCalibur: 2.1.0 SP1.1162
DTA Supercharge: 2.0b1 (part of MSQUANT 2.0b1)
MATLAB: 7.13.0.564 (R2011b)
MATLAB MCR: 7.16
Mascot: Release 2.1.03, additional support for 2.4.1

2.5.3 Mascot Search Parameters

Database: Human-2009 (37743 sequences; 17175626 residues)
Enzyme: Trypsin
Fixed modifications: iTRAQ8plex (K), iTRAQ8plex (N-term), Carbamidomethyl (C)
Variable modifications: Oxidation (M), Phospho (ST), Phospho (Y)
Mass values: Monoisotopic
Protein Mass: Unrestricted
Peptide Mass Tolerance: ± 10 ppm
Fragment Mass Tolerance: ± 0.8 Da
Max Missed Cleavages: 2
Instrument type: Josh-a-ion-LTQ
Number of queries: 14406

2.6 Bibliography

1. Yates, J. R., Eng, J. K., McCormack, A. L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal. Chem. 67, 1426–1436 (1995).
2. Perkins, D. N., Pappin, D. J., Creasy, D. M. & Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).
3. Craig, R., Cortens, J. C., Fenyo, D. & Beavis, R. C. Using annotated peptide mass spectrum libraries for protein identification. J. Proteome Res. 5, 1843–1849 (2006).
4. Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011).
5. White, F. M. The potential cost of high-throughput proteomics. Sci Signal 4, pe8 (2011).
6. Käll, L., Storey, J. D., MacCoss, M. J. & Noble, W. S. Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. J. Proteome Res. 7, 29–34 (2008).
7. Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Meth 4, 207–214 (2007).
8. Klammer, A. A., Reynolds, S. M., Bilmes, J. A., MacCoss, M. J. & Noble, W. S. Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification. Bioinformatics 24, i348–56 (2008).
9. Martin, D. M. A. et al. Prophossi: automating expert validation of phosphopeptide-spectrum matches from tandem mass spectrometry. Bioinformatics 26, 2153–2159 (2010).
10. Beausoleil, S. A., Villén, J., Gerber, S. A., Rush, J. & Gygi, S. P. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat. Biotechnol. 24, 1285–1292 (2006).
11. Taus, T. et al. Universal and confident phosphorylation site localization using phosphoRS. J. Proteome Res. 10, 5354–5362 (2011).
12. Savitski, M. M. et al. Confident phosphorylation site localization using the Mascot Delta Score. Molecular & Cellular Proteomics 10, M110.003830 (2011).
13. Nichols, A. M. & White, F. M. in Methods In Molecular Biology 492, 143–160 (Humana Press, 2009).
14. Zhang, J. et al. MS/MS/MS reveals false positive identification of histone serine methylation. J. Proteome Res. 9, 585–594 (2010).
15. Kessner, D., Chambers, M., Burke, R., Agus, D. & Mallick, P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 24, 2534–2536 (2008).
16. Lahesmaa-Korpinen, A.-M., Carlson, S. M., White, F. M. & Hautaniemi, S. Integrated data management and validation platform for phosphorylated tandem mass spectrometry data. Proteomics 10, 3515–3524 (2010).
17. Johnson, H. et al. Molecular characterization of EGFR and EGFRvIII signaling networks in human glioblastoma tumor xenografts. Molecular & Cellular Proteomics 11, 1724–1740 (2012).
3 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)

3.1 Introduction

Innovations over the past decade in methodology and instrumentation have greatly increased the capability of mass spectrometry to identify a significant fraction of the proteome, including thousands of post-translational modifications (PTMs)1. Through the use of quantitative strategies, including label-free proteomics, metabolic stable isotope labeling, and chemical stable isotope labeling, it is possible to analyze two or more biological samples and generate relative quantification data detailing differences in protein expression or modification across different cellular conditions2,3. Multiplex labeling has effectively improved the throughput of the approach, enabling the comparison of multiple different biological samples in a single analysis, with quantification typically determined relative to a basal condition or pooled standard.

Multiplexed relative quantification has been useful for elucidating temporal dynamics of phosphorylation signaling following growth factor stimulation, with changes in protein PTM levels in stimulated vs. non-stimulated conditions highlighting pathways involved in signal processing and cellular response4. When combined with bioinformatic algorithms, such as clustering, relative quantification of PTM level has enabled researchers to predict the function of poorly characterized phosphorylation sites5. Through quantification of downstream biological response and statistical analysis, key sites have been identified as regulators of observed cellular behavior, providing further functional and phenotypic association for novel phosphorylation sites6.

Despite the ability to generate biological insight with a combination of relative quantification of phosphorylation dynamics and statistical modeling or bioinformatics, additional information is encoded in the absolute levels of site-specific phosphorylation.
For instance, relative fold change of a signal carries limited information about the network’s basal state or response to stimuli, as a two-fold change from a high basal state (e.g. 30% to 60% activation) can lead to a very different biological response from a two-fold change from a very low basal state (e.g. 1% to 2% activation). Absolute quantification also enables comparison between multiple phosphorylation sites on a given protein under different conditions. Since each phosphorylation site might recruit adaptor proteins associated with particular signaling pathways, absolute quantification of phosphorylation can provide critical data relating receptor stimulation and pathway activation.

Absolute quantification data is not typically available from standard mass spectrometry, western-blot, or reverse-phase protein array experiments, as the calibration curve is unique to each phosphorylated peptide due to different response profiles (e.g. ionization potential or antibody binding affinity). Therefore multiple methods for estimating absolute quantification have been established. Among these, one of the most common MS-based methods is isotope dilution, or AQUA, in which a synthetic isotope-encoded peptide is added to the sample pre-analysis, and quantification is based on the relative peak heights or area under the chromatographic elution curve for the endogenous and synthetic peptide standard7. Although this technique is fairly straightforward, it relies on single point calibration and may provide erroneous estimates if the endogenous peptides span a large dynamic range across multiple conditions, or if the concentrations of these peptides fall outside of the linear response range of the instrument. Alternate methods for absolute quantification include the generation of standard curves through separate analyses of standard peptides at known concentration7.
Since the complex background of the biological context can significantly alter the signal intensity of the endogenous peptide due to competition for charge in the ionization process in MS, comparison to a standard curve generated in a neat background can lead to significant estimation errors. One solution would be to add different concentrations of a standard peptide to the sample so that the standard and endogenous peptides experience the same local sample context, but then one is faced with the issue of differentiating the standards from the endogenous peptides. Ideally, one would like to combine the multiplexed capabilities of chemical labeling with standard curves internal to the sample, thus allowing the accurate, absolute quantification of given peptides across multiple biological conditions within a single analysis.

With these goals in mind, we have developed Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS), an MS-based technique to measure the absolute amount of modification on many post-translational sites of interest simultaneously from multiple samples within a single analysis. The method utilizes multiple doses of internal standards to bracket a broad abundance range, allowing simultaneous analysis of treatment conditions expected to produce widely varied endogenous signal magnitudes. We applied MARQUIS to quantify the absolute amount of phosphorylation on several sites in the epidermal growth factor receptor (EGFR) network at different time points following stimulation of EGFR with different ligands. Absolute quantification of phosphorylation dynamics in this system highlights novel wiring within the network, insight that was not previously available from relative quantification data. MARQUIS enables the acquisition of accurate absolute quantification data, is applicable across multiple instrument platforms, and is equally applicable to protein expression profiling and PTM quantification.
3.2 Methods

Sample Preparation: MCF10A cells were serum-starved 24 hours prior to stimulation with 20nM EGF (Peprotech) for 0.5, 1, 2, 3, 5, 10 or 30 minutes. Media was aspirated and cells were lysed with 2ml of cold 8M urea with 1mM activated sodium orthovanadate. Lysate was frozen at -80 °C until further processing. Protein yield was quantified by BCA assay (Pierce) in order to ensure equal loading. Standard peptides were added to samples as in Supplementary Table 1. Samples were reduced with 40µl 10mM DTT in ammonium acetate pH 8.9 for 1 hour at 56 °C. Samples were alkylated with 55mM iodoacetamide in ammonium acetate pH 8.9 for 1 hour at room temperature. 8mL ammonium acetate and 40µg of sequencing grade trypsin were added prior to 16 hours incubation at room temperature. Lysates were acidified with 1 mL glacial acetic acid and protein was purified with Waters Sep-Pak columns. Samples were lyophilized and subsequently labeled with iTRAQ 8plex (AbSciex) per manufacturer’s directions.

Standard Peptide Preparation and Quality Assurance: Standard peptides were synthesized at the Koch Institute Biopolymers facility and amino acid analysis was performed to measure concentration by AAA Service Laboratories (Damascus, OR). Synthetic peptides were analyzed by LC MS/MS to confirm sequence and purity.

3.2.1 Multiple Reaction Monitoring (MRM):

Immunoprecipitation: 70µl protein G agarose beads (Calbiochem IP08) were rinsed in 400µl IP Buffer (100mM Tris, 0.3% NP-40, pH 7.4) prior to an 8 hour incubation with a cocktail of three phosphotyrosine-specific antibodies (12µg 4G10 (Millipore), 12µg PT66 (Sigma), and 12µg PY100 (CST)) in 200ml IP Buffer. Antibody cocktail was removed and beads were rinsed with 400µl of IP Buffer. Labeled samples were resuspended in 150µl iTRAQ IP Buffer (100mM Tris, 1% NP-40, pH 7.4) + 300µl milliQ water and pH was adjusted to 7.4 (with 0.5M Tris HCl pH 8.5) prior to addition to prepared beads and 16 hour incubation.
Supernatant was removed and beads were rinsed 3 times with 400µl Rinse Buffer (100mM Tris HCl, pH 7.4). Peptides were eluted in 70µl of Elution Buffer (100mM glycine, pH 2) for 30 minutes at room temperature.

Immobilized Metal Affinity Chromatography (IMAC) Purification: A fused silica capillary (FSC) column (200µm ID x 10cm length) was packed with POROS 20MC beads (Applied Biosystems cat.no. 1-5429-06). The column was rinsed with 100 mM EDTA pH 8.9 in ultrapure water to remove residual iron and then rinsed with ultrapure water to remove EDTA before fresh iron was deposited by rinsing with 100 mM FeCl3 in ultrapure water. Excess iron was removed with 0.1% acetic acid prior to loading the IP elution. The column was washed with 25% MeCN, 1% HOAc, 100 mM NaCl in ultrapure water to remove non-specifically bound peptides and then rinsed with 0.1% HOAc. Peptides were eluted with 250mM NaH2PO4 in ultrapure water and collected on a pre-column (100µm ID x 10cm packed with 10µm C18 beads (YMC gel, ODS-A, 12nm, S-10 µm, AA12S11)). This pre-column was rinsed with 0.2M acetic acid prior to LC-MS analysis.

Liquid Chromatography Mass Spectrometry: The pre-column was connected in-line with an analytical column (50µm ID x 10cm packed with 5µm beads (YMC gel, ODS-AQ, 12nm, S-5 µm, AQ12S05)) with integrated electrospray emitter tip [Martin 2000]. A 140 minute gradient from aqueous (0.2M acetic acid) to organic phase (0.2M HOAc, 70% MeCN) was used to separate peptides.

Absolute Quantification: Analyses were performed on a Thermo TSQ Quantum in MRM mode. Endogenous and heavy-labeled versions of each peptide of interest were tracked throughout the experiment with two sequence specific transitions (precursor/prominent b or y ion) and eight iTRAQ transitions (precursor/label peak). Elution time was determined based on coelution of iTRAQ transitions with sequence specific transitions for both standard and endogenous peptides of the MARQUIS pair.
Area Under the Curve (AUC) was collected for each transition. Data were corrected for iTRAQ isotope purity and for the synthetic peptide loading control. A calibration curve was created for each peptide to relate the AUC of the iTRAQ transition to the amount of standard included. The regression was performed with the MATLAB function lscov, with the signal intensity also serving to scale the variance appropriately. This calibration curve was then used to calculate the amount of peptide present in each of the endogenous conditions based on the corresponding iTRAQ transition AUC.

3.2.2 Parallel Reaction Monitoring (PRM):

Immunoprecipitation: 20 µl protein G agarose beads (Calbiochem, IP04-1.5ML) were pre-incubated with 20µl of PT66 and 20µl of pY100, prior to adding sample (reconstituted in 70µl HEPES buffer adjusted with 5M NaOH). After overnight incubation, beads were rinsed and peptides were eluted with 20µL of 0.1% TFA, 10% MeCN, 250ng/µL HeLa Digest (phospho depleted).

NTA Enrichment: The elution was incubated with Fe(III)-NTA magnetic beads for 15 minutes, rinsed, and eluted with 10µL of NH4OH (1:30), 5mM EDTA, 5% MeCN, 1ng/µl HeLa digest. The elution was transferred to an autosampler vial and pH was adjusted with 5µl of 10% TFA.

Liquid Chromatography Mass Spectrometry: LC-MS was performed using a 25cm, 75µm ID, EASY-Spray column (ThermoFisher Scientific, ES802) with a 120 minute gradient and a flow rate of 300µl/min.

Absolute Quantification: Analyses were performed on a Thermo Q-Exactive. Endogenous and heavy-labeled versions of each peptide were targeted for MS2 analysis during their expected elution times. In the MS1 scan, the total amount of endogenous peptide was calculated based on the fraction of MS1 integrated intensity when compared to the standard peptide. This amount was then apportioned across the contributing samples based on relative iTRAQ intensity.
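The PRM calculation just described can be written out as a short sketch. This is an illustrative Python reading of the text, not the code used for the analysis: the function and argument names are hypothetical, and it assumes that the reference amount for the standard is the total amount of standard peptide summed over all iTRAQ channels.

```python
def apportion_prm(ms1_endogenous, ms1_standard, standard_total_fmol, itraq_intensities):
    """Estimate the endogenous amount per iTRAQ channel for one peptide.

    ms1_endogenous, ms1_standard: integrated MS1 intensities of the
        endogenous and heavy-labeled standard precursors.
    standard_total_fmol: total standard added across all channels
        (assumption: this is the reference for the MS1 comparison).
    itraq_intensities: dict of channel -> reporter ion intensity from the
        MS2 scan of the endogenous peptide.
    """
    if ms1_standard <= 0:
        raise ValueError("standard MS1 intensity must be positive")
    reporter_sum = sum(itraq_intensities.values())
    if reporter_sum <= 0:
        raise ValueError("iTRAQ reporter intensities must sum to a positive value")

    # Step 1: total endogenous amount from the MS1 intensity ratio.
    endogenous_total = standard_total_fmol * ms1_endogenous / ms1_standard

    # Step 2: split that total across channels by relative reporter intensity.
    return {
        channel: endogenous_total * intensity / reporter_sum
        for channel, intensity in itraq_intensities.items()
    }
```

The per-channel amounts always sum to the MS1-derived total, so any error in the MS1 ratio propagates equally to every channel.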
The MS2 scan used to determine the iTRAQ intensity ratios was a single point measurement, chosen at maximal MS1 intensity to ensure maximal signal-to-noise ratio.

3.3 Results

3.3.1 Assessing the requirement for standard peptides for absolute quantification of tyrosine phosphorylation

Theoretically, absolute quantification of protein expression or phosphorylation could be calculated directly from MS data without internal standards by quantifying the signal amplitude for the peptide measured in the full scan mass spectrum or from fragment ion transitions. To assess the need for internal standards, we selected several tyrosine phosphorylation sites in the EGFR signaling network, generated synthetic phosphorylated peptides encompassing these sites, and added these peptides at defined concentrations to cell lysate of MCF10A cells. Quantification of the signal intensity for each peptide across multiple different analyses demonstrated a significant amount of run-to-run variation (three representative examples are presented in Figure 3-1), likely associated with small flow rate fluctuations combined with differences in temporal sampling across the chromatographic elution profile.

[Figure 3-1, panels a-c: Signal vs. Amount (fmol), logarithmic axes, for EGFR pY1173, Src pY418, and Shc pY317]

Figure 3-1: Variability of Signal Intensity from technical replicates. The signal intensity vs. amount of peptide is plotted for three representative peptides across five technical replicate analyses.

To quantify this data across multiple concentrations of each peptide, we calculated the response ratio, the ratio between signal amplitude and peptide concentration, for 13 peptides in the EGFR signaling network (Figure 3-2).
As the data demonstrate, for a given concentration, the response ratio can vary by two to three orders of magnitude. This variation is likely associated with differences in ionization potential of the peptides and their respective co-eluting species, due to competition for charge in electrospray ionization. Together, the broad range in peptide-specific response ratio and the technical replicate variation make it difficult to directly relate signal intensity and abundance, thereby highlighting the need for an improved method to generate accurate absolute quantification data.

Figure 3-2: Peptide Specific Signal Intensity. The average integrated intensity vs. amount of all heavy-labeled standard peptides. The large range is due to sequence specific effects compiled throughout sample processing, liquid chromatography, and ionization.

3.3.2 Multiplex Absolute Regressed Quantification of Internal Standards (MARQUIS)

To quantify the absolute levels of endogenous peptides across a broad dynamic range, we have developed the MARQUIS (Multiplex Absolute Regressed Quantification of Internal Standards) method. Absolute quantification in MARQUIS is based on comparison of the endogenous peptides to paired heavy-isotope labeled internal standards added at specific concentrations into multiple biological samples (Figure 3-3). Briefly, synthetic peptides containing a heavy-isotope labeled amino acid were generated for each phosphorylated peptide of interest and stock concentrations were established by amino-acid analysis (AAA). A standard peptide cocktail was then added to cell lysates, one concentration per lysate. These lysates were purified and digested prior to stable isotope labeling (iTRAQ) and combination. The mix of endogenous and synthetic peptides was subjected to targeted tandem mass spectrometry for iTRAQ quantification and peptide sequence confirmation.
Fragmentation of the iTRAQ-labeled synthetic, heavy-labeled phosphopeptide produced an internal standard curve with each calibration point corresponding to the unique amount of synthetic peptide added to the corresponding biological sample. Similarly, fragmentation of the iTRAQ-labeled endogenous peptide also produced iTRAQ marker ions, with the intensity of each ion corresponding to the amount of endogenous peptide in each biological sample. Calibration of the iTRAQ marker ion intensities from the endogenous peptide against the standard curve provided absolute quantification for each biological condition. The internal standard curve establishes the linear dynamic range around the endogenous peptide concentration, providing an accurate intrinsic quality assessment of the peptide quantification. Since synthetic peptides are added early in the processing workflow, losses are identical for the endogenous and synthetic peptides as both experience identical environments through all stages of preparation. Moreover, since the endogenous and heavy-labeled peptides co-elute, the peptides experience identical chromatographic and ionization contexts, removing an additional potential source of quantification error.

[Figure 3-3 schematic, stage labels: pY IP, LC-MS, MS, MS/MS]

Figure 3-3: MARQUIS Method for Absolute Quantification. Synthetic isotope-labeled phosphorylated peptides are added during cell lysis. Following sample processing and digestion, all peptides, synthetic and endogenous, are iTRAQ labeled. For this particular application of the method, tyrosine phosphorylated peptides are enriched by peptide immunoprecipitation followed by immobilized metal affinity chromatography (IMAC) and analyzed by LC-MS/MS. Heavy-labeled standards provide an internal calibration curve, enabling absolute quantification of endogenous peptides.
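The regression step at the core of MARQUIS can be sketched as a weighted straight-line fit followed by inversion of the fitted line. The Python below is a minimal illustration and not the analysis code, which used MATLAB's lscov. The function names are hypothetical, and the weighting (weight = 1 / signal, i.e. variance assumed proportional to intensity) is one plausible reading of "intensity serving to scale the variance", stated here as an assumption.

```python
def fit_weighted_line(amounts, signals, weights):
    """Weighted least-squares fit of signal = intercept + slope * amount."""
    if not (len(amounts) == len(signals) == len(weights)) or len(amounts) < 2:
        raise ValueError("need at least two matched calibration points")
    sw = sum(weights)
    sx = sum(w * x for w, x in zip(weights, amounts))
    sy = sum(w * y for w, y in zip(weights, signals))
    sxx = sum(w * x * x for w, x in zip(weights, amounts))
    sxy = sum(w * x * y for w, x, y in zip(weights, amounts, signals))
    denom = sw * sxx - sx * sx
    if denom == 0:
        raise ValueError("calibration amounts must not all be equal")
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return intercept, slope


def quantify(endogenous_signals, standard_amounts, standard_signals):
    """Convert endogenous iTRAQ signals to amounts via the internal standard curve.

    standard_amounts[i] is the dose of heavy standard spiked into sample i and
    standard_signals[i] is the matching iTRAQ channel signal of the standard.
    """
    if any(s <= 0 for s in standard_signals):
        raise ValueError("standard signals must be positive to derive weights")
    # Assumption: variance scales with signal, so each point is weighted by 1/signal.
    weights = [1.0 / s for s in standard_signals]
    intercept, slope = fit_weighted_line(standard_amounts, standard_signals, weights)
    if slope == 0:
        raise ValueError("flat calibration curve cannot be inverted")
    return [(y - intercept) / slope for y in endogenous_signals]
```

Endogenous signals that map outside the range of spiked doses are extrapolations, which is why the internal curve also serves as a check on whether a given measurement lies within the linear dynamic range.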
3.3.3 Performance of the method

To assess the performance of the MARQUIS method, heavy-labeled synthetic phosphorylated peptides were generated for selected tyrosine phosphorylation sites in the EGFR signaling network. Synthetic peptides were added to EGF-stimulated MCF10A cell lysate, thereby providing a complex, biologically-relevant background for testing the response profile, linear dynamic range, and sensitivity of this method. MARQUIS was initially evaluated on a triple quadrupole instrument operating in multiple reaction monitoring (MRM) mode, with 10 precursor-fragment ion transitions per peptide, including 2 sequence specific and 8 iTRAQ marker ion transitions (Figure 3-4 and Appendix Table 3-1). It is important to note that the number of phosphorylation sites able to be monitored with this approach is inherently limited by the number of transitions. With 10 transitions per peptide and 2 peptides (synthetic and endogenous) per phosphorylation site, the total number of transitions increases rapidly as more sites in the network are monitored. Increasing the number of transitions translates to fewer measurements per peptide across the chromatographic elution profile, leading to suboptimal quantitative data.

Figure 3-4: MRM Quantification of endogenous tyrosine phosphorylated peptides. Raw iTRAQ intensities for endogenous peptides (columns E-L) and corresponding heavy-labeled standard peptides (columns M-T) across technical replicate analyses of EGF stimulated MCF10A cells. Calculated amounts of endogenous peptides based on regression to the standard curve are contained in columns U-AB.

To address this issue, we also evaluated MARQUIS on an Orbitrap-based instrument (Q-Exactive) using targeted tandem mass spectrometry, also known as parallel reaction monitoring (PRM).
In this approach, a full scan, high-resolution MS spectrum is acquired followed by full scan, high-resolution MS/MS spectra containing both iTRAQ marker ions and sequence specific peptide fragmentation for each peptide (Figure 3-5). Since the endogenous and heavy-labeled peptides are both detected in the MS scan, quantitative comparison of the precursor ion intensities provides an estimate of the total amount of endogenous peptide present in the sample (Appendix Table 3-2). The relative iTRAQ marker ion intensities from the MS/MS spectrum for the endogenous peptide then apportion this total amount across the biological samples. iTRAQ marker ion intensities from the MS/MS of the standard peptide serve as an important control by providing a peptide-specific estimate of the accuracy and linear dynamic range. For both the MRM and PRM methods, we selected 13 phosphorylation sites to assess the performance of MARQUIS quantification at different peptide concentrations (Appendix Table 3-3).

Figure 3-5: Details of PRM Data Collection Method. a) Precursor masses for the endogenous (red) and standard (blue) Shc pY317 peptide can be individually isolated for subsequent fragmentation. b) The full scan MS2 result of individually isolated and fragmented peptides; endogenous (red), standard (blue). The iTRAQ reporter ion region has been magnified to illustrate the relative quantification data available from the full MS2 scan.

Figure 3-6a-b show representative data extracted for two standard peptides using MRM and PRM. At the high end of the curve, both methods perform identically. For EGFR pY1173 (Figure 3-6a), the linear range for both MRM and PRM extends from 3000 fmol to 100 fmol; between 100 fmol and 3 fmol the response remains linear, but deviates from the theoretical response due to additional noise in the experimental measurement. For Shc pY317 (Figure 3-6b) the PRM method performs well with a dynamic range of approximately 1000-fold (3000 fmol to 3 fmol).
The MRM method for this site did not perform as well, with suspected contamination on the iTRAQ-118 (10 fmol) channel boosting the recorded intensity significantly. This contamination does not appear to affect the iTRAQ-119 (3 fmol) or iTRAQ-121 (1 fmol) channels, as both show measurably lower values than iTRAQ-117 (30 fmol). Contamination may appear in the MRM method and not the PRM method because the high resolution PRM method can distinguish contaminant ions from iTRAQ marker ions, whereas the MRM measurement detects the number of ions passing a set bandwidth in the third quadrupole. It is worth noting that altering the dwell time per transition for MRM, or the ion target value and maximum fill time for the full scan PRM method, can affect the sensitivity of the analysis and thereby improve the linear dynamic range, but increasing these values comes at the expense of cycle time.

To evaluate the reproducibility of quantification using the MARQUIS method, we performed technical replicates of each experiment on the MRM and PRM platforms and plotted the concentration vs. coefficient of variation (CV) for each measurement in the data set (Figure 3-6c). Although both methods provide high quality data (e.g. most measurements have CV <0.15) across the entire range, the most reproducible measurements are obtained above 100 fmol, with CV values <0.1. At the low end of the concentration range, some peptides display CV >0.5. For MRM, the early time points of ERK1 and ERK2 signals exhibited the highest CV, likely due to the higher standard doses included in the corresponding channels. The highest CVs for the PRM technique were those of the dual phosphorylated EGFR peptide pY1045/pY1068. This peptide has a lower response ratio (Figure 3-2) and elutes very late in the chromatographic program, making it more susceptible to contamination from background chemical noise.
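For reference, the reproducibility metric used above can be computed as follows. This is a minimal Python sketch with hypothetical function names. It assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and it counts a measurement as meeting a threshold when its CV is less than or equal to that threshold.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV = standard deviation / mean for one measurement's technical replicates."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates")
    m = mean(replicates)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    # Assumption: sample standard deviation (n - 1 denominator).
    return stdev(replicates) / m


def fraction_meeting_threshold(cvs, threshold):
    """Fraction of measurements whose CV is at or below the given threshold."""
    if not cvs:
        raise ValueError("no CV values supplied")
    return sum(1 for cv in cvs if cv <= threshold) / len(cvs)
```

Evaluating `fraction_meeting_threshold` over a range of thresholds for each platform gives a cumulative curve that allows the two acquisition methods to be compared at any chosen CV cutoff.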
Overall, the two techniques were comparable in the fraction of measurements that met a CV threshold of 0.20 (Figure 3-6d).

[Figure 3-6, panels a-d. a, b) Percentage of Total Signal vs. Amount (fmol) for EGFR pY1173 and Shc pY317, series: Expected, MRM, PRM. c) CV vs. Amount (fmol), series: MRM, PRM. d) Fraction below CV Threshold vs. CV, series: MRM, PRM]

Figure 3-6: Quality Comparisons. Observed vs. Expected total iTRAQ signal contribution for MRM and full-scan methods for two sites: EGFR pY1173 (a) and Shc pY317 (b). c) Coefficient of Variation vs. Amount for MRM and Full Scan QE methods. d) Fraction of measurements that meet Coefficient of Variation (CV) thresholds for both techniques.

3.3.4 Isolation Width Effects

In the MARQUIS method, both the endogenous and the heavy-labeled synthetic peptides are iTRAQ labeled, allowing for direct comparison of iTRAQ marker ion intensities resulting from fragmentation of the corresponding peptides. This approach has the potential for error associated with contamination of the isolation window from other endogenous or synthetic peptides, or potentially from co-isolation of the endogenous and heavy-labeled peptides. As the charge state of the peptide increases, the separation, in m/z, between the endogenous and heavy-labeled synthetic peptide decreases, thereby increasing the likelihood of co-isolation. To assess the effect of the isolation width on MARQUIS quantification accuracy for the MRM platform, we narrowed the isolation width on the first quadrupole from 0.7 m/z to 0.1 m/z. Narrowing the MS1 isolation window can decrease the level of contamination in the ions selected for fragmentation, but also significantly decreases the transmission efficiency.
To determine if this trade-off was beneficial, we compared the measured signal intensities from all peptides from analyses run with both isolation widths (Figure 3-7). Minimal effects of altered isolation width were seen for the highest peptide levels, whereas only the 0.1 m/z isolation window was able to clearly distinguish the median of the 1 fmol standard from the median of the 10 fmol standard. At the lowest peptide levels, the narrower isolation width was better able to isolate precursor ions, ultimately providing a larger dynamic range for calibration. However, for peptides with poor response ratios, the losses associated with the narrower isolation width ultimately limit the sensitivity and therefore the dynamic range.

[Figure 3-7 plot: Fraction of Signal Intensity for the 1 fmol, 10 fmol, 100 fmol, and 1000 fmol standards at 0.7 m/z and 0.1 m/z isolation widths.]

Figure 3-7: Q1 Isolation Width Comparison. 0.7 m/z (red) and 0.1 m/z (blue).

3.3.5 Absolute Quantification of Tyrosine Phosphorylation in the EGFR signaling network

Previous studies have described the temporal dynamics of individual tyrosine phosphorylation sites following EGFR activation8. Although these studies have highlighted many novel phosphorylation sites in the network and have led to hypotheses regarding the function of selected sites, they have been inherently limited by quantification relative to a selected time point. As such, it has been difficult to quantify differences in phosphorylation levels across different sites on the receptor, or to determine whether different ligands led to different phosphorylation stoichiometries. To address this deficiency and gain insight into the temporal phosphorylation profiles of multiple sites in the network, we used MARQUIS to quantify absolute levels of tyrosine phosphorylation on EGFR and proximal adaptor proteins, at multiple time points following EGF stimulation.
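The idea of reading an absolute amount off an internal standard curve can be sketched as follows. This is a minimal illustration, assuming a least-squares fit of log10(intensity) against log10(amount); the regression form actually used by MARQUIS is not stated in this section. The standard amounts follow the channel layout listed in Appendix Table 3-1 (3 pmol down to 1 fmol), while the intensities are synthetic values constructed to be perfectly linear. The function names are hypothetical.

```python
import math


def fit_log_log(amounts_fmol, intensities):
    """Least-squares fit of log10(intensity) = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in amounts_fmol]
    ys = [math.log10(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_intensity(intensity, slope, intercept):
    """Invert the fitted curve to estimate an amount in fmol."""
    return 10 ** ((math.log10(intensity) - intercept) / slope)


# Standard amounts per iTRAQ channel, in fmol (3 pmol down to 1 fmol).
standard_amounts = [3000.0, 1000.0, 300.0, 100.0, 30.0, 10.0, 3.0, 1.0]
# Synthetic marker-ion intensities: exactly 1000 counts per fmol (illustrative).
standard_intensities = [1000.0 * a for a in standard_amounts]

slope, intercept = fit_log_log(standard_amounts, standard_intensities)

# Hypothetical endogenous marker-ion intensity from one time point.
endogenous_amount = amount_from_intensity(5.0e4, slope, intercept)
print(slope, intercept, endogenous_amount)
```

Because the standards and the endogenous peptide are measured in the same analysis, the curve is specific to that peptide and run, which is what allows amounts to be compared across different sites.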
Although EGFR pY1173 is often cited as a dominant autophosphorylation site for the receptor9, and is often used as a measure of EGFR activation for immunoblotting studies10,11, MARQUIS data unequivocally demonstrate that phosphorylation of Y1148 occurs at levels approximately 4- to 5-fold greater than pY1173, pY1068, or pY1045, all of which occur at similar levels (Figure 3-8a), a comparison that would be missed by typical MS-based relative quantification6 or immunoblotting techniques. Further downstream in the network, it is also possible to detect higher levels of phosphorylation on ERK2 pY187 as compared to ERK1 pY204 (Figure 3-8b), an observation that aligns with higher ERK2 expression in other systems12.

[Figure 3-8, panels a-d: amount (fmol) vs. Time (min.). a) EGFR pY1045, pY1068, pY1148, pY1173. b) ERK1 pY204 and ERK2 pY187. c) EGFR S1142/Y1148 isoforms (SpY, pSpY). d) EGFR Y1045/Y1068 isoforms (pYY, YpY, pYpY).]

Figure 3-8: Stoichiometric comparison of site-specific phosphorylation. Data for a collection of EGFR and network phosphorylation sites were gathered using MARQUIS with Multiple Reaction Monitoring (MRM) on a triple quadrupole instrument. a) EGFR. b) ERK1/2. c) EGFR pY1148, pS1142/pY1148. d) EGFR pY1045, pY1068, and pY1045/pY1068.

Absolute quantification enables the tracking of individual phosphorylation sites, even when multiple sites are contained within the same tryptic peptide. For instance, the peptide containing Y1148 also contains S1142, a serine residue that has previously been found to be phosphorylated in these cells13.
To identify whether pS1142 co-occurs with pY1148 on the same EGFR protein, we used MARQUIS for absolute quantification of the singly phosphorylated pY1148 peptide and the doubly phosphorylated pS1142/pY1148 peptide. By quantifying the singly and doubly phosphorylated isoforms of the peptide, we could more accurately account for the total amount of phosphorylation on Y1148 (Figure 3-8c). This analysis led to the intriguing finding that pS1142/pY1148 is present in unstimulated cells at approximately 30 fmol and, rather than increasing upon stimulation, gradually decreases over time. The singly phosphorylated pY1148 is also present in unstimulated cells, but increases strongly following EGF stimulation. With absolute quantification, it is clear that the increase in the singly phosphorylated pY1148 peptide is much greater than the decrease in the doubly phosphorylated pS1142/pY1148 peptide. Therefore, the increase in singly phosphorylated pY1148 is likely due to phosphorylation of EGFR in the absence of pS1142, rather than dephosphorylation of the pS1142/pY1148 species. It remains to be determined whether pS1142 inhibits phosphorylation of Y1148, but our data would be consistent with this hypothesis.

Additional insights into phosphorylation dynamics are available when multiple tyrosine phosphorylation events occur on a given tryptic peptide. For instance, Y1045 and Y1068 fall within the same tryptic fragment, and therefore yield four possible phosphorylation isoforms for the peptide (Y1045/Y1068, pY1045/Y1068, Y1045/pY1068, and pY1045/pY1068), three of which were monitored by MARQUIS-based absolute quantification (Figure 3-8d). Although all three isoforms are present at similar, low levels in unstimulated MCF10A cells, following stimulation Y1045/pY1068 and doubly phosphorylated pY1045/pY1068 both rise rapidly in the first minute.
The pY1045/Y1068 peptide shows a much more gradual increase in the first minute, potentially indicating that this isoform is being converted to the pY1045/pY1068 form through rapid phosphorylation of Y1068. Interestingly, over the next several time points the singly phosphorylated Y1045/pY1068 isoform begins to decrease, while the pY1045/Y1068 and pY1045/pY1068 isoforms both increase strongly, indicating phosphorylation of Y1045 that is converting Y1045/pY1068 into the doubly phosphorylated form, which is the most abundant of the three isoforms in this early time period. Finally, by five minutes all three of the phospho-isoforms begin to decrease at similar rates, potentially indicating the activity of a common phosphatase. The quantitative interplay between these isoforms has not been previously documented due to a lack of absolute quantification of phosphorylation level on each of these sites under different stimulation conditions.

MARQUIS-based absolute quantification was also used to quantify EGFR phosphorylation dynamics in response to different EGFR ligands. It has previously been documented that stimulation with Epidermal Growth Factor (EGF), Transforming Growth Factor α (TGFα), or Amphiregulin (AREG) leads to different downstream biological effects14-16, potentially due to qualitatively or quantitatively different signaling networks. To uncover the mechanism by which these different ligands activate different phenotypes, we quantified the temporal dynamics of phosphorylation levels on multiple sites on the receptor in response to different concentrations of each ligand (Appendix Table 3-4). As can be seen in Figure 3-9a-c, each ligand stimulates a rapid increase in receptor phosphorylation, albeit to different levels, with the Y1148 site consistently phosphorylated to a greater extent compared to the other three phosphorylation sites on the C-terminal tail of the receptor.
Given that the ligands have different binding affinities, the difference in the level of phosphorylation on the receptor may not be surprising. Intriguingly, despite the quantitative difference in the level of phosphorylation due to each ligand (Figure 3-9d), the pattern of phosphorylation on the receptor is effectively invariant with regard to ligand identity, with pY1148>pY1068>pY1173>pY1045; this similarity in receptor pattern can best be visualized by normalizing all sites for a given ligand to the total amount of phosphorylation on the receptor (Figure 3-9e). The consistent pattern of receptor phosphorylation raises the intriguing possibility that biological response to different EGFR ligands may be encoded in the absolute level of phosphorylation on each site, rather than through the phosphorylation of different sites on the receptor (e.g. quantitative vs. qualitative control of RTK phosphorylation signaling networks). It remains to be determined how quantitative control might affect downstream signaling networks and therefore regulate cell biological response to ligand stimulation.

[Figure 3-9, panels a-e. Panels a-c: Copies/cell (x10^3) vs. Time (min.) for EGFR pY1045, pY1068, pY1148, and pY1173 under EGF 20 nM, TGFα 20 nM, and AREG 20 nM. Panel d: Copies/cell (x10^3) per site for TGFα 20 nM, AREG 20 nM, and EGF 20 nM. Panel e: Median Normalized Copies/cell per site for the same ligands.]

Figure 3-9: Ligand Comparison. Absolute quantification for EGFR phosphorylation sites treated with EGF (a), TGFα (b), and AREG (c).
Comparison of site-specific phosphorylation between ligands following five minutes of ligand stimulation for two biological replicates; raw (d) and normalized (e).

3.4 Conclusion

MARQUIS is a novel multi-point internal standard calibration method that enables absolute quantification of site-specific PTM levels or protein expression. The method provides accurate quantification over a broad range of concentrations, allowing simultaneous analysis of stimulation conditions that lead to large changes in expression or modification levels. With MARQUIS, absolute quantification data can be collected for many PTM sites of interest in the network in a single analysis, using either MRM or PRM-based targeted mass spectrometry. These data can then be used for intra- and inter-sample site-specific comparisons.

MARQUIS was designed for MRM-based analysis on a triple quadrupole, but the method is equally applicable to full-scan targeted tandem mass spectrometry (PRM). Overall, the two methods performed very well in terms of technical reproducibility, with a CV threshold of 0.15 able to capture the majority of measurements with both techniques. Both instrument platforms offer selective advantages. For instance, the full-scan method offers the ability to determine contamination by way of the complete MS/MS data at high resolution, which is not available in an MRM experiment. On the other hand, when using MRM, the precursor (Q1) isolation window can be tightened to a greater extent compared to the PRM method, thereby improving signal-to-noise by reducing contributions from spurious co-eluting ions. Although a Q1 isolation width of 0.7 m/z was typically used for the data acquired in this study, sufficient signal for quantification can be attained for many peptides by using a window as small as 0.1 m/z. Such a narrow window would be deleterious to the PRM method, as the drop in ion flux may eliminate signal for most low and medium intensity fragments.
The MRM approach is inherently more flexible, as the fragmentation energy and time spent per MRM transition can be altered for selected transitions to improve the sensitivity. However, increasing the dwell time is only feasible for a small number of transitions; otherwise the increase in cycle time becomes too great.

Recent structural insights into the cooperativity of ligand binding and EGFR dimerization suggest that differential ligand affinities for the extracellular portion of the receptor influence receptor occupancy and conformation, and potentially signaling and trafficking17. A previous mass spectrometry-based study of receptor phosphorylation observed site-specific quantitative differences in receptor phosphorylation in response to varied doses of TGFα and EGF; stoichiometry was estimated by measuring the ratio of signal intensities between phosphorylated and non-phosphorylated versions of tyrosine-containing tryptic fragments18. Such comparisons are limited due to the differential ionization potential between these species. Here we have shown large differences not only between various ligand stimulation conditions for a single site, but also between different EGFR sites not present on the same tryptic peptide. The MARQUIS method enables quantification of EGFR phosphorylation stoichiometry, information necessary to decipher how extracellular ligand concentration information is transmitted across the cellular membrane. Unique sets of SH2- and PTB-containing proteins bind to each receptor phosphorylation site19,20, suggesting the ability to activate distinct pathways by phosphorylating different sites. Alternately, differential pathway activation could be controlled by altering the level of phosphorylation.
While this option might suggest that downstream pathways will be linearly proportional to receptor phosphorylation levels, different pathways feature different amplification gains, and thus one pathway might be saturated by a relatively small signal on the receptor, as has been detected for the ERK MAPK pathway, while other pathways may be more analog in nature. Here, through absolute quantification of specific phosphorylation sites on the receptor, we determined that the EGFR phosphorylation response to stimulation with different EGFR ligands is quantitatively regulated. Although phosphorylation levels are ligand specific, the ratio between different receptor sites is constant across different EGFR ligands. Thus, different biological responses may be regulated by the analog vs. digital nature of the different downstream pathways, in a manner analogous to the interleukin receptor system we have recently investigated21. Additional studies are needed to confirm the effects of quantitative regulation of receptor phosphorylation on downstream pathways, as well as to determine the cellular conditions that might lead to altered phosphorylation patterns on the receptor: is this a mechanism by which the cell differentiates normal vs. pathological signaling?

Absolute quantification of protein expression and PTM levels will be required to provide insight into regulation of many biological systems. The MARQUIS method enables facile acquisition of absolute quantification data across multiple biological conditions simultaneously, with large dynamic range, and with internal standard curves to improve the accuracy of the measurements. This method is broadly applicable to multiple mass spectrometry platforms and to different biological systems.

3.5 Acknowledgments

We thank Andreas FR Huhmer for his support of the project by providing access to the Thermo Scientific Q Exactive instrument and for the critical review of the manuscript.
The authors would like to thank members of the White lab for helpful discussion. This work was supported in part by a grant from the US-Israel Binational Science Foundation and by NIH grants U54 CA112967, R01 CA118705, and R01 CA096504.

3.6 Appendix Tables

Protein EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 Shc Shc Shc Shc Shc Shp2 Shp2 Shp2 Shp2 Shp2 Src Src Src Src Src 2.01E+04 5.65E+04 4.29E+04 7.40E+04 4.14E+04 1.14E+04 6.95E+03 1.97E+04 1.41E+04 1.25E+07 8.23E+06 3.63E+06 4.99E+06 3.01E+06 9.37E+05 1.70E+06 5.34E+06 3.42E+06 6.46E+05 1.98E+06 4.91E+05 3.10E+06 2.77E+06 1.63E+07 1.12E+07 4.63E+06 3.22E+06 1.43E+06 1.80E+06 1.15E+06 3.83E+05 6.45E+05 1.96E+06 1.22E+06 2.25E+05 7.95E+05 5.05E+05 3.28E+06 2.36E+06 1.35E+05 2.02E+05 5.06E+05 1.59E+05 6.06E+05 1.34E+05 5.50E+05 6.84E+05 3.86E+06 2.42E+06 9.92E+05 7.31E+05 3.31E+05 5.15E+05 3.34E+05 1.15E+05 1.71E+05 5.52E+05 3.56E+05 3.53E+05 2.25E+05 2.35E+05 9.40E+04 3.33E+05 2.54E+05 1.42E+06 1.22E+06 6.31E+04 9.66E+04 2.24E+05 8.27E+04 2.98E+05 6.69E+04 2.32E+05 2.39E+05 1.31E+06 8.78E+05 4.85E+05 3.16E+05 1.41E+05 2.79E+05 1.56E+05 5.72E+04 9.49E+04 2.89E+05 1.86E+05 2.94E+06 2.27E+06 7.01E+05 2.93E+05 3.52E+05 1.69E+05 6.01E+05 5.41E+05 2.49E+06 2.06E+06 2.88E+04 4.35E+04 1.09E+05 2.69E+04 1.21E+05 2.47E+04 3.64E+05 3.15E+05 1.81E+06 1.21E+06 2.45E+05 1.66E+05 7.00E+04 1.37E+05 7.95E+04 2.14E+04 3.91E+04 1.14E+05 6.67E+04 2.48E+06 1.87E+06 6.00E+05 1.69E+05 1.90E+05 1.31E+05 5.12E+05 3.77E+05 1.98E+06 1.57E+06 5.27E+04 8.65E+04 2.01E+05 4.53E+04 1.72E+05 3.96E+04 1.16E+05 9.06E+04 5.74E+05 3.68E+05 4.28E+05 2.97E+05 1.40E+05 1.19E+05 7.90E+04 2.41E+04 4.67E+04 1.10E+05 6.61E+04 9.36E+05 6.07E+05 2.31E+05 1.21E+05 1.28E+05 4.70E+04 1.55E+05 1.41E+05 6.89E+05 5.11E+05 1.57E+04 2.73E+04 8.01E+04
1.84E+04 6.28E+04 1.45E+04 3.97E+04 4.17E+04 2.20E+05 1.42E+05 1.35E+05 8.70E+04 3.51E+04 1.19E+05 6.92E+04 2.16E+04 4.44E+04 1.40E+05 7.35E+04 4.22E+06 3.21E+06 9.68E+05 3.31E+05 3.45E+05 2.35E+05 8.17E+05 6.11E+05 3.45E+06 2.47E+06 2.92E+04 5.23E+04 1.23E+05 1.45E+04 7.08E+04 1.56E+04 1.36E+05 1.31E+05 7.68E+05 4.59E+05 2.67E+05 1.64E+05 6.31E+04 9.01E+04 4.43E+04 1.92E+04 2.39E+04 7.90E+04 4.48E+04 2.39E+01 5.10E+01 2.80E+01 1.11E+01 4.70E+01 6.00E+01 3.64E+01 4.35E+01 4.49E+01 2.00E+01 2.58E+01 2.82E+01 2.32E+01 2.58E+01 2.16E+01 1.05E+02 1.01E+02 1.09E+02 8.71E+00 7.61E+00 6.80E+01 6.26E+01 6.61E+01 4.98E+01 4.79E+01 9.73E+01 9.57E+01 3.39E+01 3.18E+01 2.45E+01 3.73E+01 1.34E+01 1.38E+01 1.32E+01 1.28E+01 4.58E+01 4.18E+01 6.40E+01 2.23E+01 3.74E+01 5.38E+01 1.82E+01 1.25E+01 1.57E+01 3.14E+01 2.57E+01 5.29E+01 1.14E+01 1.00E+01 9.20E+02 9.48E+02 9.89E+02 1.07E+03 9.24E+02 2.33E+01 2.41E+01 2.37E+01 2.47E+01 2.16E+01 3.51E+01 9.19E+01 8.39E+01 8.76E+01 4.38E+00 5.91E+00 5.53E+01 4.21E+01 4.96E+01 4.15E+01 4.03E+01 4.95E+01 4.39E+01 2.14E+01 1.46E+02 1.49E+02 1.59E+02 8.25E+01 7.53E+01 8.01E+01 7.89E+01 3.26E+02 3.05E+02 2.76E+02 1.35E+02 1.31E+02 1.37E+02 6.02E+01 5.41E+01 4.68E+01 2.91E+01 2.90E+01 2.23E+01 1.99E+01 1.24E+01 7.75E+02 8.58E+02 8.91E+02 9.33E+02 7.66E+02 1.81E+01 2.20E+01 1.80E+01 1.87E+01 2.29E+01 3.15E+01 2.18E+02 2.04E+02 2.00E+02 7.33E+00 4.22E+00 9.93E+01 8.84E+01 7.22E+01 8.36E+01 8.82E+01 2.23E+01 2.39E+01 1.32E+01 1.22E+02 1.90E+02 1.55E+02 7.49E+01 6.49E+01 7.72E+01 7.29E+01 3.45E+02 3.23E+02 2.61E+02 1.10E+02 1.10E+02 1.31E+02 6.22E+01 6.41E+01 6.81E+01 3.95E+01 5.23E+01 4.02E+01 2.49E+01 2.16E+01 1.40E+03 1.49E+03 1.80E+03 1.53E+03 1.09E+03 2.54E+01 2.44E+01 2.72E+01 2.83E+01 2.17E+01 3.70E+01 5.89E+02 5.24E+02 5.10E+02 1.25E+01 7.83E+00 2.14E+02 2.08E+02 2.52E+02 2.39E+02 2.18E+02 4.74E+01 2.93E+01 3.38E+01 2.16E+02 2.85E+02 2.70E+02 1.06E+02 9.86E+01 9.90E+01 9.08E+01 5.05E+02 4.28E+02 4.38E+02 1.52E+02 1.53E+02 1.71E+02 1.66E+02 
1.32E+02 1.05E+02 3.94E+01 3.44E+01 8.25E+01 9.73E+01 7.73E+01 9.30E+02 1.17E+03 1.13E+03 1.19E+03 8.26E+02 1.67E+01 1.81E+01 1.31E+01 1.87E+01 1.77E+01 3.33E+01 1.11E+03 1.04E+03 1.05E+03 1.78E+01 1.54E+01 3.22E+02 4.13E+02 3.68E+02 4.50E+02 4.25E+02 1.66E+01 2.14E+01 1.67E+01 1.48E+02 1.82E+02 1.75E+02 1.08E+02 9.64E+01 1.06E+02 1.01E+02 4.30E+02 4.22E+02 3.94E+02 9.30E+01 1.09E+02 1.29E+02 9.61E+01 9.64E+01 9.33E+01 3.48E+01 2.91E+01 1.68E+02 1.73E+02 1.45E+02 1.21E+03 1.23E+03 1.23E+03 1.30E+03 1.01E+03 2.14E+01 2.10E+01 1.95E+01 2.40E+01 2.27E+01 3.73E+01 1.67E+03 1.58E+03 1.61E+03 3.07E+01 2.32E+01 6.01E+02 6.15E+02 5.55E+02 6.82E+02 6.20E+02 3.10E+01 2.73E+01 2.14E+01 9.63E+01 1.43E+02 1.44E+02 1.10E+02 9.74E+01 1.12E+02 1.06E+02 4.46E+02 4.12E+02 4.11E+02 1.58E+02 1.60E+02 1.52E+02 6.66E+01 6.06E+01 5.89E+01 1.69E+01 1.77E+01 1.05E+02 1.29E+02 1.23E+02 9.63E+02 8.78E+02 8.74E+02 9.52E+02 7.73E+02 3.07E+01 2.50E+01 2.70E+01 3.05E+01 3.15E+01 1.95E+01 1.44E+03 1.36E+03 1.41E+03 2.20E+01 2.25E+01 5.54E+02 5.20E+02 5.41E+02 5.73E+02 5.24E+02 6.96E+00 1.62E+01 1.10E+01 7.82E+01 1.04E+02 7.16E+01 7.09E+01 7.07E+01 6.53E+01 6.27E+01 3.78E+02 3.84E+02 3.47E+02 1.62E+02 1.04E+02 1.33E+02 5.72E+01 5.01E+01 5.66E+01 1.86E+01 1.78E+01 1.41E+02 1.17E+02 1.06E+02 4.05E+02 3.75E+02 4.59E+02 4.76E+02 4.15E+02 2.24E+01 1.74E+01 2.27E+01 2.55E+01 2.16E+01 1.72E+01 1.10E+03 1.01E+03 1.00E+03 1.74E+01 1.51E+01 4.16E+02 4.02E+02 5.57E+02 4.73E+02 4.18E+02 4.87E+00 4.19E+00 3.98E+00 2.09E+01 2.58E+01 1.76E+01 1.82E+01 1.97E+01 1.83E+01 1.81E+01 1.72E+02 1.42E+02 1.49E+02 7.16E+01 6.46E+01 5.48E+01 1.64E+01 1.40E+01 1.53E+01 %30%min%(121) 2.09E+04 4.65E+04 9.10E+06 9.47E+06 5.70E+07 3.57E+07 5.32E+05 9.11E+05 2.27E+06 4.74E+05 5.77E+05 1.49E+06 1.17E+06 4.90E+04 3.58E+05 3.59E+05 6.11E+04 1.10E+05 2.50E+05 4.02E+05 2.06E+05 5.73E+04 7.56E+04 1.07E+05 2.61E+05 1.14E+05 3.76E+04 2.32E+01 %10%min%(119) 2.58E+04 6.85E+04 4.40E+04 1.34E+05 5.49E+04 1.75E+05 9.72E+04 1.76E+06 5.24E+06 
1.33E+06 8.27E+05 2.62E+06 2.03E+06 1.06E+07 8.41E+06 9.04E+05 4.06E+04 %5%min%(118) 2.39E+04 8.18E+04 4.57E+04 4.54E+04 2.69E+05 1.71E+05 1.50E+06 2.29E+06 6.13E+06 1.44E+06 1.79E+06 3.72E+06 2.80E+06 2.80E+04 1.41E+05 1.27E+05 4.38E+04 9.07E+04 2.01E+05 3.40E+05 2.03E+05 1.07E+06 9.97E+05 2.60E+06 4.12E+06 1.81E+06 6.46E+03 7.80E+04 %3%min%(117) 4.84E+04 1.46E+05 9.42E+04 1.31E+05 8.36E+04 3.06E+05 2.16E+05 6.13E+03 2.42E+04 4.25E+03 2.31E+06 7.49E+06 5.70E+06 3.25E+07 2.46E+07 3.07E+06 9.54E+04 %2%min%(116) 2.80E+04 9.69E+04 1.35E+05 1.35E+05 7.67E+05 4.91E+05 8.19E+02 9.70E+02 2.60E+03 4.79E+06 5.48E+06 1.23E+07 9.95E+06 3.96E+04 2.12E+05 2.18E+05 2.84E+05 4.56E+05 1.01E+06 1.67E+06 9.45E+05 3.57E+05 3.12E+05 8.36E+05 1.43E+06 6.21E+05 3.28E+04 1.73E+05 %1%min%(115) 3.81E+04 1.03E+05 5.74E+04 1.03E+05 7.67E+04 3.61E+05 2.29E+05 1.90E+04 7.77E+04 1.43E+04 3.16E+05 1.05E+06 9.90E+05 5.07E+06 3.47E+06 8.61E+06 7.46E+04 %30%sec%(114) 1.96E+04 1.28E+05 2.09E+05 2.06E+05 1.31E+06 8.20E+05 8.85E+02 3.12E+03 5.76E+03 2.85E+04 2.90E+04 3.74E+07 2.90E+07 6.77E+04 4.04E+05 3.88E+05 1.71E+05 2.56E+05 5.45E+05 9.60E+05 5.36E+05 2.06E+05 1.78E+05 4.70E+05 8.13E+05 3.40E+05 1.01E+05 1.78E+05 Amount 0%min%(113) 2.15E+04 6.32E+04 4.33E+04 8.68E+04 5.85E+04 4.69E+05 3.17E+05 2.59E+04 1.07E+05 2.84E+04 3.10E+05 1.03E+06 7.97E+05 4.91E+06 3.59E+06 3.06E+06 2.62E+05 %1%fmol%(121) 1.89E+04 7.56E+04 2.82E+05 2.65E+05 1.67E+06 1.06E+06 3.94E+03 5.78E+03 1.12E+04 2.87E+04 3.57E+04 1.38E+07 1.00E+07 4.50E+04 2.86E+05 2.52E+05 1.91E+05 3.25E+05 6.53E+05 1.03E+06 6.15E+05 3.88E+05 3.76E+05 9.60E+05 1.52E+06 7.63E+05 7.52E+04 5.14E+05 %3%fmol%(119) 2.68E+04 7.18E+04 4.06E+04 7.92E+04 4.89E+04 2.82E+05 1.75E+05 5.17E+04 1.84E+05 4.66E+04 3.41E+05 1.22E+06 9.04E+05 5.84E+06 4.19E+06 3.25E+06 1.64E+05 %10%fmol%(118) 2.71E+04 5.63E+04 1.47E+05 1.24E+05 8.03E+05 5.14E+05 2.91E+03 5.90E+03 1.18E+04 4.01E+04 3.63E+04 1.44E+07 1.11E+07 1.23E+05 6.61E+05 7.17E+05 3.78E+05 6.47E+05 1.38E+06 2.19E+06 
1.20E+06 1.10E+06 9.88E+05 2.62E+06 4.07E+06 1.98E+06 2.12E+05 3.08E+05 %30%fmol%(117) 8.10E+03 1.76E+04 1.41E+04 1.32E+05 7.94E+04 2.44E+05 1.53E+05 3.44E+04 1.48E+05 3.90E+04 2.90E+05 1.13E+06 7.79E+05 5.18E+06 3.88E+06 3.72E+06 5.50E+05 %100%fmol%(116) 1.05E+04 8.19E+04 1.31E+05 1.23E+05 7.93E+05 4.78E+05 4.41E+03 3.69E+03 1.22E+04 3.12E+04 3.25E+04 1.68E+07 1.27E+07 4.28E+05 2.11E+06 2.30E+06 1.69E+06 2.75E+06 5.67E+06 9.31E+06 5.45E+06 4.30E+06 4.17E+06 1.03E+07 1.50E+07 7.17E+06 8.22E+05 1.08E+06 %300%fmol%(115) 2.29E+04 2.33E+04 3.11E+05 1.97E+05 2.93E+04 1.24E+05 2.59E+04 8.48E+04 3.01E+05 2.44E+05 1.41E+06 1.08E+06 3.34E+06 2.20E+06 %1%pmol%(114) 1.87E+04 1.97E+05 1.83E+05 1.11E+06 7.07E+05 2.59E+03 4.53E+03 6.01E+03 1.13E+04 8.99E+03 1.49E+07 1.13E+07 1.25E+06 6.72E+06 6.95E+06 4.35E+06 7.25E+06 1.50E+07 2.57E+07 1.59E+07 1.13E+07 1.08E+07 2.70E+07 4.13E+07 1.96E+07 2.74E+06 4.59E+06 STD%iTRAQ 3%pmol%(113) 4.62E+04 2.79E+04 4.53E+04 1.32E+05 3.63E+04 5.45E+04 1.60E+05 1.05E+05 6.25E+05 5.07E+05 8.63E+05 7.40E+06 %30%min%(121) 3.17E+04 3.36E+04 1.93E+05 1.18E+05 7.92E+03 1.07E+04 1.32E+04 8.36E+03 5.61E+03 4.06E+06 3.09E+06 4.54E+04 2.15E+05 2.06E+05 2.31E+05 3.78E+05 8.54E+05 1.61E+06 8.79E+05 4.83E+04 3.81E+04 1.11E+05 2.03E+05 8.31E+04 1.25E+04 1.43E+07 %10%min%(119) 9.83E+03 2.29E+04 8.80E+03 3.67E+04 1.05E+05 9.28E+04 4.18E+05 3.16E+05 4.23E+05 2.80E+04 1.88E+05 1.97E+05 4.04E+05 6.69E+05 1.35E+06 2.56E+06 1.35E+06 4.87E+04 4.14E+04 1.10E+05 1.94E+05 9.98E+04 1.04E+04 3.33E+04 %5%min%(118) 1.54E+04 2.34E+04 2.21E+04 6.74E+03 1.07E+04 1.90E+06 1.40E+06 4.95E+04 2.52E+05 2.29E+05 5.18E+05 9.38E+05 2.10E+06 3.51E+06 1.73E+06 3.46E+04 3.48E+04 8.75E+04 1.52E+05 7.07E+04 2.02E+04 7.31E+04 %3%min%(117) 5.26E+04 1.54E+05 1.24E+05 5.31E+05 3.88E+05 2.54E+05 3.16E+04 1.91E+05 1.64E+05 6.31E+05 1.23E+06 2.51E+06 4.33E+06 1.92E+06 4.28E+04 4.14E+04 7.63E+04 1.59E+05 7.47E+04 2.88E+04 2.74E+04 %2%min%(116) 1.42E+04 1.43E+04 1.08E+06 7.84E+05 7.03E+03 2.51E+04 
2.49E+04 4.18E+05 8.29E+05 1.82E+06 2.85E+06 1.38E+06 2.86E+04 2.96E+04 7.25E+04 1.24E+05 4.96E+04 1.40E+04 5.47E+04 %1%min%(115) 3.12E+05 5.87E+03 2.53E+04 1.66E+04 3.20E+05 5.99E+05 1.36E+06 2.20E+06 1.12E+06 2.83E+04 3.33E+04 7.25E+04 1.04E+05 6.08E+04 1.66E+04 4.97E+04 %30%sec%(114) 1.31E+06 9.78E+05 1.79E+04 1.97E+04 1.83E+04 4.59E+05 9.07E+05 1.94E+06 3.41E+06 1.85E+06 4.39E+04 5.02E+04 1.22E+05 1.84E+05 7.84E+04 2.23E+04 1.11E+05 ENDO%iTRAQ 0%min%(113) 1.73E+04 5.10E+04 2.09E+04 2.73E+04 5.68E+04 7.12E+04 1.46E+05 9.26E+04 4.40E+04 5.31E+04 1.46E+05 1.84E+05 9.66E+04 1.60E+04 7.63E+04 128 1.70E+05 5.30E+04 9.26E+04 4.44E+04 7.92E+04 5.04E+04 1.17E+05 4.70E+04 8.90E+04 Run R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 R5 R4 R3 R2 R1 Sequence ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR ySSDPTGALTEDSIDDTFLPVPEyINQSVPK ySSDPTGALTEDSIDDTFLPVPEyINQSVPK ySSDPTGALTEDSIDDTFLPVPEyINQSVPK ySSDPTGALTEDSIDDTFLPVPEyINQSVPK ySSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQIsLDNPDyQQDFFPK GSHQIsLDNPDyQQDFFPK GSHQIsLDNPDyQQDFFPK GSHQIsLDNPDyQQDFFPK GSHQIsLDNPDyQQDFFPK IADPEHDHTGFLTEyVATR IADPEHDHTGFLTEyVATR IADPEHDHTGFLTEyVATR IADPEHDHTGFLTEyVATR IADPEHDHTGFLTEyVATR IADPEHDHTGFLtEyVATR IADPEHDHTGFLtEyVATR IADPEHDHTGFLtEyVATR IADPEHDHTGFLtEyVATR IADPEHDHTGFLtEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLtEyVATR VADPDHDHTGFLtEyVATR VADPDHDHTGFLtEyVATR 
VADPDHDHTGFLtEyVATR VADPDHDHTGFLtEyVATR ELFDDPSyVNVQNLDK ELFDDPSyVNVQNLDK ELFDDPSyVNVQNLDK ELFDDPSyVNVQNLDK ELFDDPSyVNVQNLDK IQNTGDyYDLYGGEK IQNTGDyYDLYGGEK IQNTGDyYDLYGGEK IQNTGDyYDLYGGEK IQNTGDyYDLYGGEK LIEDNEyTAR LIEDNEyTAR LIEDNEyTAR LIEDNEyTAR LIEDNEyTAR Site 1045 1045 1045 1045 1045 1068 1068 1068 1068 1068 1148 1148 1148 1148 1148 1173 1173 1173 1173 1173 1045,%1068 1045,%1068 1045,%1068 1045,%1068 1045,%1068 1142,%1148 1142,%1148 1142,%1148 1142,%1148 1142,%1148 204 204 204 204 204 202,%204 202,%204 202,%204 202,%204 202,%204 187 187 187 187 187 185,%187 185,%187 185,%187 185,%187 185,%187 317 317 317 317 317 62 62 62 62 62 418 418 418 418 418 Table 3-­‐1: MRM Quantification of endogenous tyrosine phosphorylated peptides. Raw iTRAQ intensities for endogenous peptides (columns E-­‐L) and corresponding heavy-­‐ labeled standard peptides (columns M-­‐T) across technical replicate analysis of EGF stimulated MCF10A cells. Calculated amounts of endogenous peptides based on regression to the standard curve are contained in columns U-­‐AB. 
Protein EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 Shc Shc Shc Shp2 Shp2 Shp2 Src Src Src Site 1045 1045 1045 1068 1068 1068 1148 1148 1148 1173 1173 1173 1045,41068 1045,41068 1045,41068 1142,41148 1142,41148 1142,41148 204 204 204 202,4204 202,4204 202,4204 187 187 187 185,4187 185,4187 185,4187 317 317 317 62 62 62 418 418 418 Full$Apex Full$Average SIM$Apex SIM$Average Endo4iTRAQ Sequence Run ENDO STD Ratio ENDO STD Ratio ENDO STD Ratio ENDO STD Ratio Average4Ratio Multiple Total4ENDO 113 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R1 1.73E+06 2.39E+07 7.24E$02 7.21E+05 9.87E+06 7.30E$02 3.13E+06 3.81E+07 8.22E$02 1.18E+06 1.46E+07 8.08E$02 7.71E$02 1.29E+00 4.44E+02 5.13E+03 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R2 7.22E+05 9.96E+06 7.25E$02 4.64E+05 6.37E+06 7.29E$02 8.02E+05 1.51E+07 5.30E$02 9.32E+05 1.40E+07 6.64E$02 6.62E$02 1.29E+00 3.81E+02 4.37E+03 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R3 7.59E+06 1.06E+08 7.15E$02 5.38E+06 8.16E+07 6.60E$02 7.80E+06 9.26E+07 8.42E$02 7.42E+06 8.46E+07 8.78E$02 7.73E$02 1.29E+00 4.45E+02 2.11E+04 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R1 1.04E+06 2.04E+07 5.10E$02 8.00E+05 1.81E+07 4.42E$02 1.99E+06 3.49E+07 5.70E$02 8.52E+05 1.67E+07 5.10E$02 5.08E$02 1.65E+00 3.73E+02 4.76E+03 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R2 5.35E+05 1.16E+07 4.62E$02 2.28E+05 4.91E+06 4.64E$02 1.26E+06 2.45E+07 5.16E$02 8.24E+05 1.63E+07 5.05E$02 4.87E$02 1.65E+00 3.57E+02 2.95E+03 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R3 5.09E+06 9.64E+07 5.28E$02 4.27E+06 8.68E+07 4.92E$02 6.02E+06 9.96E+07 6.05E$02 5.59E+06 9.25E+07 6.04E$02 5.57E$02 1.65E+00 4.09E+02 1.29E+04 GSHQISLDNPDyQQDFFPK R1 2.18E+06 2.19E+07 9.95E$02 1.61E+06 1.62E+07 9.94E$02 2.53E+06 2.44E+07 1.04E$01 1.51E+06 1.49E+07 1.01E$01 1.01E$01 4.22E+00 1.89E+03 8.14E+03 GSHQISLDNPDyQQDFFPK R2 9.55E+05 9.75E+06 9.79E$02 6.61E+05 6.54E+06 1.01E$01 1.91E+06 1.77E+07 1.08E$01 1.14E+06 1.06E+07 1.08E$01 1.04E$01 4.22E+00 
1.94E+03 9.35E+03 GSHQISLDNPDyQQDFFPK R3 1.25E+07 1.31E+08 9.54E$02 6.20E+06 6.93E+07 8.95E$02 1.25E+07 1.37E+08 9.12E$02 9.14E+06 9.36E+07 9.76E$02 9.34E$02 4.22E+00 1.75E+03 2.57E+04 GSTAENAEyLR R1 7.53E+06 1.18E+08 6.38E$02 4.94E+06 8.06E+07 6.13E$02 5.54E+06 7.05E+07 7.86E$02 4.19E+06 5.44E+07 7.70E$02 7.02E$02 1.29E+00 4.01E+02 4.41E+04 GSTAENAEyLR R2 4.07E+06 6.30E+07 6.46E$02 3.12E+06 5.01E+07 6.23E$02 6.34E$02 1.29E+00 3.62E+02 3.65E+04 GSTAENAEyLR R3 3.80E+07 6.17E+08 6.16E$02 2.69E+07 4.37E+08 6.16E$02 6.16E$02 1.29E+00 3.52E+02 1.60E+05 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R1 7.90E+05 1.19E+07 6.64E$02 3.25E+05 4.71E+06 6.90E$02 1.07E+06 1.32E+07 8.11E$02 7.61E+05 1.15E+07 6.62E$02 7.07E$02 1.17E+00 3.66E+02 1.20E+04 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R2 3.99E+05 5.28E+06 7.56E$02 2.39E+05 3.30E+06 7.24E$02 9.99E+05 1.49E+07 6.69E$02 3.68E+05 5.52E+06 6.67E$02 7.04E$02 1.17E+00 3.65E+02 7.83E+03 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R3 4.06E+06 5.14E+07 7.91E$02 2.78E+06 3.74E+07 7.42E$02 5.85E+06 8.05E+07 7.26E$02 3.47E+06 4.74E+07 7.33E$02 7.48E$02 1.17E+00 3.88E+02 2.21E+04 GSHQIsLDNPDyQQDFFPK R1 7.31E+04 1.63E+07 4.48E$03 4.60E+04 1.17E+07 3.93E$03 3.10E+04 2.64E+07 1.17E$03 3.12E+04 2.16E+07 1.44E$03 2.76E$03 3.20E+00 3.92E+01 7.11E+03 GSHQIsLDNPDyQQDFFPK R2 3.74E+04 6.96E+06 5.38E$03 2.70E+04 6.66E+06 4.06E$03 2.40E+04 1.86E+07 1.29E$03 1.24E+04 1.11E+07 1.12E$03 2.96E$03 3.20E+00 4.21E+01 3.55E+04 GSHQIsLDNPDyQQDFFPK R3 4.09E+05 1.18E+08 3.47E$03 3.03E+05 1.10E+08 2.75E$03 4.92E+05 1.83E+08 2.69E$03 3.01E+05 1.18E+08 2.55E$03 2.87E$03 3.20E+00 4.08E+01 4.33E+04 IADPEHDHTGFLTEyVATR R1 1.12E+07 3.13E+07 3.58E$01 9.26E+06 2.59E+07 3.58E$01 1.40E+07 3.56E+07 3.93E$01 7.89E+06 2.08E+07 3.79E$01 3.72E$01 1.02E+00 1.68E+03 1.28E+04 IADPEHDHTGFLTEyVATR R2 1.17E+07 3.04E+07 3.85E$01 5.69E+06 1.51E+07 3.78E$01 1.49E+07 3.83E+07 3.88E$01 9.44E+06 2.31E+07 4.09E$01 3.90E$01 1.02E+00 1.76E+03 1.90E+04 IADPEHDHTGFLTEyVATR R3 4.97E+07 1.41E+08 3.52E$01 3.43E+07 9.77E+07 
3.51E$01 3.95E+07 1.01E+08 3.91E$01 2.92E+07 7.97E+07 3.66E$01 3.65E$01 1.02E+00 1.65E+03 4.86E+04 IADPEHDHTGFLtEyVATR R1 6.07E+05 3.88E+07 1.56E$02 6.58E+05 6.96E+07 9.45E$03 1.25E$02 1.24E+00 6.92E+01 3.24E+03 IADPEHDHTGFLtEyVATR R2 4.04E+05 3.32E+07 1.21E$02 3.66E+05 3.00E+07 1.22E$02 1.22E$02 1.24E+00 6.71E+01 4.78E+03 IADPEHDHTGFLtEyVATR R3 2.13E+06 1.53E+08 1.39E$02 1.73E+06 1.33E+08 1.30E$02 1.35E$02 1.24E+00 7.42E+01 8.45E+03 VADPDHDHTGFLTEyVATR R1 5.69E+07 6.27E+07 9.07E$01 5.28E+07 5.56E+07 9.50E$01 4.86E+07 4.73E+07 1.03E+00 2.53E+07 2.93E+07 8.63E$01 9.37E$01 1.03E+00 4.28E+03 4.24E+04 VADPDHDHTGFLTEyVATR R2 7.63E+07 7.47E+07 1.02E+00 4.45E+07 4.38E+07 1.02E+00 7.96E+07 7.16E+07 1.11E+00 3.78E+07 3.77E+07 1.00E+00 1.04E+00 1.03E+00 4.74E+03 7.04E+04 VADPDHDHTGFLTEyVATR R3 2.24E+08 2.18E+08 1.03E+00 1.17E+08 1.14E+08 1.03E+00 1.54E+08 1.45E+08 1.06E+00 1.10E+08 1.04E+08 1.06E+00 1.04E+00 1.03E+00 4.77E+03 1.21E+05 VADPDHDHTGFLtEyVATR R1 5.66E+06 8.23E+07 6.88E$02 2.36E+06 3.59E+07 6.57E$02 5.02E+06 6.68E+07 7.51E$02 4.27E+06 6.26E+07 6.82E$02 6.95E$02 9.84E$01 3.04E+02 8.96E+03 VADPDHDHTGFLtEyVATR R2 2.19E+06 3.43E+07 6.37E$02 1.31E+06 2.10E+07 6.23E$02 3.54E+06 5.31E+07 6.67E$02 2.60E+06 3.68E+07 7.06E$02 6.58E$02 9.84E$01 2.88E+02 1.05E+04 VADPDHDHTGFLtEyVATR R3 1.11E+07 1.65E+08 6.73E$02 7.54E+06 1.14E+08 6.61E$02 1.12E+07 1.67E+08 6.71E$02 9.93E+06 1.47E+08 6.76E$02 6.70E$02 9.84E$01 2.93E+02 2.10E+04 ELFDDPSyVNVQNLDK R1 1.67E+07 4.00E+07 4.18E$01 1.34E+07 3.10E+07 4.32E$01 3.48E+07 8.02E+07 4.34E$01 1.50E+07 3.54E+07 4.24E$01 4.27E$01 2.60E+00 4.93E+03 1.51E+04 ELFDDPSyVNVQNLDK R2 1.02E+07 2.28E+07 4.45E$01 7.25E+06 1.72E+07 4.21E$01 1.67E+07 4.15E+07 4.01E$01 1.38E+07 3.30E+07 4.19E$01 4.22E$01 2.60E+00 4.87E+03 1.52E+04 ELFDDPSyVNVQNLDK R3 1.22E+08 2.74E+08 4.45E$01 8.57E+07 1.96E+08 4.37E$01 1.16E+08 3.60E+08 3.22E$01 9.42E+07 2.33E+08 4.04E$01 4.02E$01 2.60E+00 4.64E+03 5.97E+04 IQNTGDyYDLYGGEK R1 1.88E+05 1.45E+07 1.30E$02 1.34E+05 1.14E+07 
1.18E$02 2.59E+05 1.85E+07 1.40E$02 1.88E+05 1.28E+07 1.47E$02 1.34E$02 1.78E+00 1.05E+02 5.39E+03 IQNTGDyYDLYGGEK R2 1.44E+05 1.18E+07 1.22E$02 8.04E+04 8.00E+06 1.00E$02 1.11E$02 1.78E+00 8.79E+01 6.13E+03 IQNTGDyYDLYGGEK R3 6.28E+06 4.82E+08 1.30E$02 3.88E+06 2.91E+08 1.33E$02 3.72E+06 2.58E+08 1.44E$02 2.84E+06 1.72E+08 1.65E$02 1.43E$02 1.78E+00 1.13E+02 9.43E+04 LIEDNEyTAR R1 2.52E+05 8.94E+06 2.82E$02 2.08E+05 7.84E+06 2.65E$02 2.74E$02 1.24E+00 1.51E+02 4.37E+05 LIEDNEyTAR R2 1.53E+05 4.44E+06 3.44E$02 1.25E+05 3.59E+06 3.49E$02 1.23E+05 7.28E+06 1.70E$02 1.27E+05 7.08E+06 1.79E$02 2.60E$02 1.24E+00 1.44E+02 2.89E+05 LIEDNEyTAR R3 2.48E+06 1.03E+08 2.41E$02 1.83E+06 7.31E+07 2.50E$02 2.46E$02 1.24E+00 1.36E+02 1.12E+06 114 4.59E+04 2.63E+04 1.45E+05 2.92E+04 1.18E+04 8.16E+04 3.93E+04 4.01E+04 1.45E+05 2.05E+04 2.02E+05 9.22E+05 1.98E+04 1.17E+04 7.14E+04 4.86E+03 2.50E+04 2.48E+04 1.30E+04 2.39E+04 4.86E+04 2.05E+03 3.66E+03 4.99E+03 4.69E+04 6.89E+04 1.48E+05 5.62E+03 7.40E+03 1.86E+04 5.81E+05 4.15E+05 1.81E+06 5.11E+03 5.02E+03 9.22E+04 1.70E+05 1.11E+05 5.22E+05 115 3.41E+04 1.74E+04 8.77E+04 2.22E+04 1.21E+04 6.22E+04 2.57E+04 2.79E+04 9.97E+04 1.39E+05 1.47E+05 7.05E+05 1.13E+04 9.36E+03 4.68E+04 2.83E+03 1.56E+04 1.15E+04 1.12E+04 2.10E+04 4.43E+04 1.59E+03 1.73E+03 2.42E+03 4.01E+04 6.08E+04 1.17E+05 4.90E+03 4.16E+03 9.10E+03 4.12E+05 2.74E+05 1.37E+06 4.00E+03 3.80E+03 6.33E+04 3.24E+04 2.38E+04 9.44E+04 116 4.02E+04 1.94E+04 9.61E+04 3.76E+04 1.83E+04 9.73E+04 3.16E+04 3.44E+04 1.25E+05 1.40E+05 1.51E+05 7.19E+05 1.10E+04 9.21E+03 4.74E+04 3.98E+03 1.62E+04 1.16E+04 4.09E+04 6.13E+04 1.38E+05 2.32E+03 2.77E+03 4.66E+03 1.65E+05 2.40E+05 4.25E+05 1.10E+04 9.18E+03 1.82E+04 5.32E+05 3.58E+05 1.76E+06 4.57E+03 3.77E+03 6.77E+04 1.29E+04 8.83E+03 3.65E+04 117 4.80E+04 2.63E+04 1.25E+05 4.90E+04 2.76E+04 1.42E+05 4.93E+04 5.56E+04 1.92E+05 3.06E+05 3.13E+05 1.49E+06 1.38E+04 1.35E+04 6.13E+04 5.62E+03 1.87E+04 2.14E+04 2.81E+05 4.21E+05 8.67E+05 
2.66E+04 2.33E+04 4.45E+04 1.24E+06 1.75E+06 2.99E+06 1.30E+05 1.10E+05 2.09E+05 7.18E+05 4.97E+05 2.39E+06 4.60E+03 5.16E+03 8.32E+04 5.93E+03 4.38E+03 1.68E+04 118 4.88E+04 2.48E+04 1.34E+05 2.24E+04 1.21E+04 6.98E+04 3.52E+04 3.80E+04 1.43E+05 2.24E+05 2.35E+05 1.10E+06 1.49E+04 1.16E+04 6.23E+04 3.35E+03 1.53E+04 1.39E+04 2.71E+05 4.13E+05 8.61E+05 2.77E+04 2.47E+04 4.86E+04 1.25E+06 1.81E+06 3.05E+06 1.52E+05 1.29E+05 2.60E+05 4.93E+05 3.39E+05 1.61E+06 3.39E+03 3.42E+03 6.78E+04 1.62E+03 1.42E+03 4.83E+03 119 4.81E+04 2.59E+04 1.19E+05 1.74E+04 8.49E+03 4.62E+04 3.40E+04 3.45E+04 1.26E+05 1.46E+05 1.48E+05 6.91E+05 1.41E+04 9.98E+03 6.00E+04 2.87E+03 1.60E+04 6.95E+03 2.63E+05 3.93E+05 8.08E+05 2.51E+04 2.29E+04 4.42E+04 1.18E+06 1.69E+06 2.96E+06 1.39E+05 1.17E+05 2.15E+05 4.16E+05 2.84E+05 1.38E+06 4.75E+03 4.14E+03 8.09E+04 1.85E+03 6.52E+02 2.53E+03
Table 3-2: PRM Quantification of endogenous tyrosine phosphorylated peptides. Raw MS1 intensities were collected for endogenous and heavy-labeled standard peptides in either full scan (columns E-J) or SIM mode (columns K-P), taken either from a single scan at the apex (E-G, K-M) or averaged across the chromatographic elution profile (H-J, N-P). The resulting ratio estimates were averaged to produce the final ratio (column Q). Amino acid analysis was performed for each standard peptide to provide a correction factor for the concentration (column R). The total amount of standard after correction (column S) was multiplied by the ratio in column Q to produce the total amount of endogenous peptide (column T). iTRAQ intensities for endogenous peptides (columns U-AB) were used to calculate the fractional iTRAQ values (columns AC-AJ), which were then multiplied by the total endogenous peptide amount (column T) to produce the calculated amount of endogenous peptide in each sample (columns AK-AR). Table continued on next page.
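The arithmetic described in the caption of Table 3-2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used to produce the table: the function names are hypothetical, and the numbers are the rounded entries tabulated for the first replicate (R1) of the EGFR pY1045 peptide, so the results agree with the table only to within rounding. The corrected standard amount is left as a function argument because its value is not used here.

```python
from statistics import mean


def total_endogenous_fmol(ratios, corrected_standard_fmol):
    """Average the ENDO/STD ratio estimates and scale by the
    AAA-corrected standard amount (hypothetical helper)."""
    return mean(ratios) * corrected_standard_fmol


def per_channel_amounts(total_endo_fmol, itraq_intensities):
    """Split the total endogenous amount across iTRAQ channels in
    proportion to each channel's share of the summed reporter intensity."""
    total_intensity = sum(itraq_intensities.values())
    return {
        channel: total_endo_fmol * intensity / total_intensity
        for channel, intensity in itraq_intensities.items()
    }


# Rounded values as tabulated for EGFR pY1045, replicate R1.
ratios = [7.24e-2, 7.30e-2, 8.22e-2, 8.08e-2]  # full apex, full average, SIM apex, SIM average
itraq = {113: 5.13e3, 114: 4.59e4, 115: 3.41e4, 116: 4.02e4,
         117: 4.80e4, 118: 4.88e4, 119: 4.81e4, 121: 2.49e4}

average_ratio = mean(ratios)                 # final ratio (column Q)
amounts = per_channel_amounts(4.44e2, itraq)  # tabulated total endogenous amount, fmol
```

With these inputs the averaged ratio reproduces the tabulated 7.71E-02, and the 113 and 114 channel amounts come out close to the tabulated 7.71 fmol and 69.0 fmol.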
129 121 2.49E+04 1.23E+04 6.33E+04 5.06E+03 2.74E+03 1.58E+04 1.61E+04 1.67E+04 6.27E+04 4.21E+04 4.80E+04 2.14E+05 8.97E+03 5.17E+03 3.11E+04 2.51E+03 1.74E+04 3.18E+03 2.00E+05 3.38E+05 6.64E+05 2.16E+04 2.03E+04 3.94E+04 8.73E+05 1.36E+06 2.30E+06 1.21E+05 1.05E+05 2.12E+05 2.07E+05 1.65E+05 6.59E+05 4.75E+03 4.74E+03 8.61E+04 3.77E+02 4.44E+02 1.65E+03 Protein EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK1 ERK1 ERK1 ERK1 ERK2 ERK2 ERK2 ERK2 ERK2 ERK2 Shc Shc Shc Shp2 Shp2 Shp2 Src Src Src Site 1045 1045 1045 1068 1068 1068 1148 1148 1148 1173 1173 1173 1045,&1068 1045,&1068 1045,&1068 1142,&1148 1142,&1148 1142,&1148 204 204 204 202,&204 202,&204 202,&204 187 187 187 185,&187 185,&187 185,&187 317 317 317 62 62 62 418 418 418 iTRAQ&Fraction Amount&(fmol) Sequence Run 113/Total 114/Total 115/Total 116/Total 117/Total 118/Total 119/Total 121/Total 113 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R1 1.74EN02 1.56EN01 1.16EN01 1.36EN01 1.63EN01 1.65EN01 1.63EN01 8.44EN02 7.71E+00 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R2 2.79EN02 1.68EN01 1.11EN01 1.24EN01 1.68EN01 1.58EN01 1.65EN01 7.86EN02 1.06E+01 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK R3 2.68EN02 1.83EN01 1.11EN01 1.22EN01 1.58EN01 1.69EN01 1.50EN01 8.02EN02 1.19E+01 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R1 2.54EN02 1.56EN01 1.18EN01 2.00EN01 2.61EN01 1.19EN01 9.27EN02 2.70EN02 9.45E+00 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R2 3.08EN02 1.23EN01 1.26EN01 1.90EN01 2.87EN01 1.26EN01 8.84EN02 2.85EN02 1.10E+01 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK R3 2.45EN02 1.55EN01 1.18EN01 1.84EN01 2.69EN01 1.32EN01 8.74EN02 3.00EN02 1.00E+01 GSHQISLDNPDyQQDFFPK R1 3.40EN02 1.64EN01 1.07EN01 1.32EN01 2.06EN01 1.47EN01 1.42EN01 6.73EN02 6.44E+01 GSHQISLDNPDyQQDFFPK R2 3.65EN02 1.56EN01 1.09EN01 1.34EN01 2.17EN01 1.48EN01 1.34EN01 6.51EN02 7.09E+01 GSHQISLDNPDyQQDFFPK R3 2.80EN02 1.58EN01 1.08EN01 1.36EN01 2.09EN01 1.56EN01 1.37EN01 6.82EN02 4.90E+01 GSTAENAEyLR R1 4.15EN02 1.93EN02 1.31EN01 1.32EN01 2.88EN01 
2.11EN01 1.38EN01 3.97EN02 1.66E+01 GSTAENAEyLR R2 2.85EN02 1.58EN01 1.15EN01 1.18EN01 2.44EN01 1.83EN01 1.16EN01 3.75EN02 1.03E+01 GSTAENAEyLR R3 2.68EN02 1.54EN01 1.17EN01 1.20EN01 2.48EN01 1.83EN01 1.15EN01 3.57EN02 9.41E+00 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R1 1.13EN01 1.87EN01 1.07EN01 1.04EN01 1.30EN01 1.41EN01 1.33EN01 8.47EN02 4.15E+01 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R2 1.00EN01 1.50EN01 1.19EN01 1.18EN01 1.72EN01 1.48EN01 1.27EN01 6.60EN02 3.65E+01 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK R3 5.50EN02 1.77EN01 1.16EN01 1.18EN01 1.52EN01 1.55EN01 1.49EN01 7.72EN02 2.13E+01 GSHQIsLDNPDyQQDFFPK R1 2.15EN01 1.47EN01 8.54EN02 1.20EN01 1.70EN01 1.01EN01 8.66EN02 7.58EN02 8.42E+00 GSHQIsLDNPDyQQDFFPK R2 2.22EN01 1.57EN01 9.77EN02 1.01EN01 1.17EN01 9.58EN02 1.00EN01 1.09EN01 9.35E+00 GSHQIsLDNPDyQQDFFPK R3 3.17EN01 1.82EN01 8.44EN02 8.46EN02 1.57EN01 1.02EN01 5.08EN02 2.32EN02 1.29E+01 IADPEHDHTGFLTEyVATR R1 1.17EN02 1.19EN02 1.02EN02 3.74EN02 2.57EN01 2.48EN01 2.41EN01 1.83EN01 1.97E+01 IADPEHDHTGFLTEyVATR R2 1.12EN02 1.41EN02 1.24EN02 3.63EN02 2.49EN01 2.45EN01 2.32EN01 2.00EN01 1.97E+01 IADPEHDHTGFLTEyVATR R3 1.40EN02 1.40EN02 1.27EN02 3.96EN02 2.49EN01 2.47EN01 2.32EN01 1.91EN01 2.30E+01 IADPEHDHTGFLtEyVATR R1 2.94EN02 1.86EN02 1.44EN02 2.11EN02 2.41EN01 2.51EN01 2.28EN01 1.96EN01 2.03E+00 IADPEHDHTGFLtEyVATR R2 4.60EN02 3.51EN02 1.67EN02 2.66EN02 2.24EN01 2.37EN01 2.20EN01 1.95EN01 3.08E+00 IADPEHDHTGFLtEyVATR R3 4.29EN02 2.53EN02 1.23EN02 2.36EN02 2.26EN01 2.46EN01 2.24EN01 2.00EN01 3.18E+00 VADPDHDHTGFLTEyVATR R1 8.77EN03 9.70EN03 8.29EN03 3.41EN02 2.56EN01 2.58EN01 2.44EN01 1.80EN01 3.75E+01 VADPDHDHTGFLTEyVATR R2 9.98EN03 9.76EN03 8.62EN03 3.41EN02 2.48EN01 2.57EN01 2.40EN01 1.93EN01 4.74E+01 VADPDHDHTGFLTEyVATR R3 9.98EN03 1.23EN02 9.64EN03 3.51EN02 2.47EN01 2.52EN01 2.45EN01 1.90EN01 4.76E+01 VADPDHDHTGFLtEyVATR R1 1.57EN02 9.82EN03 8.56EN03 1.92EN02 2.27EN01 2.66EN01 2.43EN01 2.11EN01 4.75E+00 VADPDHDHTGFLtEyVATR R2 2.14EN02 1.50EN02 8.45EN03 1.87EN02 2.23EN01 
2.62EN01 2.38EN01 2.14EN01 6.16E+00 VADPDHDHTGFLtEyVATR R3 2.18EN02 1.93EN02 9.45EN03 1.89EN02 2.17EN01 2.70EN01 2.23EN01 2.20EN01 6.38E+00 ELFDDPSyVNVQNLDK R1 4.48EN03 1.72EN01 1.22EN01 1.58EN01 2.13EN01 1.46EN01 1.23EN01 6.13EN02 2.21E+01 ELFDDPSyVNVQNLDK R2 6.46EN03 1.77EN01 1.17EN01 1.52EN01 2.12EN01 1.44EN01 1.21EN01 7.01EN02 3.15E+01 ELFDDPSyVNVQNLDK R3 5.41EN03 1.64EN01 1.24EN01 1.59EN01 2.17EN01 1.46EN01 1.25EN01 5.97EN02 2.51E+01 IQNTGDyYDLYGGEK R1 1.47EN01 1.40EN01 1.09EN01 1.25EN01 1.26EN01 9.27EN02 1.30EN01 1.30EN01 1.55E+01 IQNTGDyYDLYGGEK R2 1.69EN01 1.39EN01 1.05EN01 1.04EN01 1.43EN01 9.46EN02 1.14EN01 1.31EN01 1.49E+01 IQNTGDyYDLYGGEK R3 1.48EN01 1.45EN01 9.96EN02 1.07EN01 1.31EN01 1.07EN01 1.27EN01 1.35EN01 1.68E+01 LIEDNEyTAR R1 6.60EN01 2.57EN01 4.89EN02 1.95EN02 8.96EN03 2.45EN03 2.79EN03 5.69EN04 9.97E+01 LIEDNEyTAR R2 6.57EN01 2.53EN01 5.40EN02 2.01EN02 9.95EN03 3.21EN03 1.48EN03 1.01EN03 9.44E+01 LIEDNEyTAR R3 6.23EN01 2.90EN01 5.24EN02 2.03EN02 9.31EN03 2.68EN03 1.40EN03 9.13EN04 8.45E+01 114 6.90E+01 6.39E+01 8.14E+01 5.80E+01 4.38E+01 6.32E+01 3.11E+02 3.04E+02 2.76E+02 7.74E+00 5.73E+01 5.40E+01 6.85E+01 5.47E+01 6.88E+01 5.76E+00 6.59E+00 7.40E+00 2.00E+01 2.48E+01 2.31E+01 1.29E+00 2.36E+00 1.88E+00 4.15E+01 4.63E+01 5.85E+01 2.98E+00 4.32E+00 5.66E+00 8.49E+02 8.61E+02 7.62E+02 1.47E+01 1.22E+01 1.64E+01 3.88E+01 3.64E+01 3.93E+01 115 5.13E+01 4.22E+01 4.94E+01 4.41E+01 4.51E+01 4.81E+01 2.03E+02 2.12E+02 1.90E+02 5.25E+01 4.16E+01 4.13E+01 3.91E+01 4.36E+01 4.51E+01 3.35E+00 4.12E+00 3.44E+00 1.72E+01 2.19E+01 2.10E+01 9.98EN01 1.12E+00 9.11EN01 3.55E+01 4.09E+01 4.60E+01 2.60E+00 2.43E+00 2.77E+00 6.02E+02 5.69E+02 5.76E+02 1.15E+01 9.24E+00 1.13E+01 7.39E+00 7.76E+00 7.10E+00 116 6.04E+01 4.71E+01 5.42E+01 7.47E+01 6.80E+01 7.53E+01 2.50E+02 2.61E+02 2.38E+02 5.28E+01 4.26E+01 4.21E+01 3.80E+01 4.29E+01 4.56E+01 4.72E+00 4.27E+00 3.45E+00 6.28E+01 6.38E+01 6.52E+01 1.46E+00 1.79E+00 1.75E+00 1.46E+02 1.62E+02 1.67E+02 5.83E+00 
5.37E+00 5.53E+00 7.77E+02 7.42E+02 7.40E+02 1.32E+01 9.17E+00 1.20E+01 2.94E+00 2.88E+00 2.75E+00 117 7.21E+01 6.39E+01 7.03E+01 9.73E+01 1.03E+02 1.10E+02 3.90E+02 4.21E+02 3.66E+02 1.16E+02 8.85E+01 8.74E+01 4.77E+01 6.29E+01 5.90E+01 6.66E+00 4.94E+00 6.39E+00 4.32E+02 4.38E+02 4.11E+02 1.67E+01 1.50E+01 1.67E+01 1.10E+03 1.17E+03 1.18E+03 6.89E+01 6.42E+01 6.35E+01 1.05E+03 1.03E+03 1.01E+03 1.33E+01 1.25E+01 1.48E+01 1.35E+00 1.43E+00 1.26E+00 118 7.33E+01 6.02E+01 7.53E+01 4.45E+01 4.50E+01 5.40E+01 2.78E+02 2.88E+02 2.73E+02 8.46E+01 6.64E+01 6.43E+01 5.15E+01 5.38E+01 6.00E+01 3.97E+00 4.04E+00 4.14E+00 4.16E+02 4.30E+02 4.08E+02 1.74E+01 1.59E+01 1.83E+01 1.11E+03 1.22E+03 1.20E+03 8.06E+01 7.53E+01 7.91E+01 7.20E+02 7.03E+02 6.77E+02 9.77E+00 8.32E+00 1.21E+01 3.70EN01 4.62EN01 3.63EN01 119 7.23E+01 6.30E+01 6.68E+01 3.45E+01 3.16E+01 3.57E+01 2.69E+02 2.61E+02 2.40E+02 5.51E+01 4.19E+01 4.05E+01 4.88E+01 4.65E+01 5.78E+01 3.40E+00 4.22E+00 2.07E+00 4.04E+02 4.09E+02 3.83E+02 1.58E+01 1.48E+01 1.66E+01 1.04E+03 1.14E+03 1.17E+03 7.37E+01 6.84E+01 6.54E+01 6.08E+02 5.90E+02 5.81E+02 1.37E+01 1.01E+01 1.44E+01 4.22EN01 2.13EN01 1.90EN01 121 3.74E+01 3.00E+01 3.57E+01 1.00E+01 1.02E+01 1.23E+01 1.27E+02 1.27E+02 1.20E+02 1.59E+01 1.36E+01 1.25E+01 3.10E+01 2.41E+01 2.99E+01 2.97E+00 4.58E+00 9.47EN01 3.07E+02 3.52E+02 3.15E+02 1.36E+01 1.31E+01 1.48E+01 7.73E+02 9.18E+02 9.05E+02 6.42E+01 6.16E+01 6.44E+01 3.02E+02 3.41E+02 2.77E+02 1.37E+01 1.15E+01 1.53E+01 8.60EN02 1.45EN01 1.24EN01 130 1045 1068 1148 1173 1045/1068 1142/1148 204 202/204 187 185/187 317 62 418 Site 1045 1068 1148 1173 1045,)1068 1142,)1148 204 202,)204 187 185,)187 317 62 418 Protein EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK2 ERK2 Shc Shp2 Src Site Protein EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK2 ERK2 Shc Shp2 Src PRM MRM Amount)(fmol) StdDev)(fmol) Sequence 0)sec)(113) )30)sec)(114) )1)min)(115) )2)min)(116) )3)min)(117) )5)min)(117) )10)min)(119) )30)min)(121) 0)sec)(113) 
)30)sec)(114) )1)min)(115) )2)min)(116) )3)min)(117) )5)min)(117) )10)min)(119) )30)min)(121) ySSDPTGALTEDSIDDTFLPVPEYINQSVPK 1.01E+01 7.14E+01 4.76E+01 5.39E+01 6.88E+01 6.96E+01 6.74E+01 3.43E+01 2.15E+00 9.03E+00 4.80E+00 6.67E+00 4.31E+00 8.21E+00 4.67E+00 3.91E+00 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK 1.02E+01 5.50E+01 4.57E+01 7.26E+01 1.03E+02 4.78E+01 3.39E+01 1.08E+01 7.79ER01 1.00E+01 2.10E+00 4.06E+00 6.40E+00 5.37E+00 2.15E+00 1.25E+00 GSHQISLDNPDyQQDFFPK 6.14E+01 2.97E+02 2.02E+02 2.50E+02 3.92E+02 2.80E+02 2.57E+02 1.24E+02 1.12E+01 1.82E+01 1.09E+01 1.12E+01 2.78E+01 7.59E+00 1.49E+01 4.32E+00 GSTAENAEyLR 1.21E+01 3.97E+01 4.51E+01 4.59E+01 9.71E+01 7.18E+01 4.59E+01 1.40E+01 3.94E+00 2.77E+01 6.35E+00 6.04E+00 1.59E+01 1.11E+01 8.05E+00 1.71E+00 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK 3.31E+01 6.40E+01 4.26E+01 4.22E+01 5.65E+01 5.51E+01 5.10E+01 2.83E+01 1.05E+01 8.06E+00 3.13E+00 3.84E+00 7.88E+00 4.38E+00 6.00E+00 3.74E+00 GSHQIsLDNPDyQQDFFPK 1.02E+01 6.58E+00 3.64E+00 4.14E+00 6.00E+00 4.05E+00 3.23E+00 2.83E+00 2.37E+00 8.22ER01 4.18ER01 6.42ER01 9.24ER01 8.64ER02 1.09E+00 1.82E+00 IADPEHDHTGFLTEyVATR 2.08E+01 2.26E+01 2.00E+01 6.40E+01 4.27E+02 4.18E+02 3.99E+02 3.25E+02 1.93E+00 2.46E+00 2.49E+00 1.20E+00 1.43E+01 1.12E+01 1.39E+01 2.39E+01 IADPEHDHTGFLtEyVATR 2.77E+00 1.84E+00 1.01E+00 1.67E+00 1.61E+01 1.72E+01 1.57E+01 1.38E+01 6.36ER01 5.37ER01 1.04ER01 1.82ER01 9.90ER01 1.19E+00 9.25ER01 9.16ER01 VADPDHDHTGFLTEyVATR 4.42E+01 4.88E+01 4.08E+01 1.58E+02 1.15E+03 1.18E+03 1.12E+03 8.65E+02 5.74E+00 8.72E+00 5.24E+00 1.10E+01 4.48E+01 5.99E+01 6.39E+01 8.03E+01 VADPDHDHTGFLtEyVATR 5.77E+00 4.32E+00 2.60E+00 5.58E+00 6.55E+01 7.83E+01 6.92E+01 6.34E+01 8.85ER01 1.34E+00 1.67ER01 2.36ER01 2.96E+00 2.75E+00 4.20E+00 1.55E+00 ELFDDPSyVNVQNLDK 2.62E+01 8.24E+02 5.82E+02 7.53E+02 1.03E+03 7.00E+02 5.93E+02 3.07E+02 4.80E+00 5.42E+01 1.75E+01 2.11E+01 2.07E+01 2.19E+01 1.37E+01 3.22E+01 IQNTGDyYDLYGGEK 1.57E+01 1.44E+01 1.07E+01 1.15E+01 1.35E+01 1.01E+01 
1.27E+01 1.35E+01 9.61ER01 2.11E+00 1.25E+00 2.07E+00 1.16E+00 1.89E+00 2.32E+00 1.91E+00 LIEDNEyTAR 9.29E+01 3.81E+01 7.42E+00 2.86E+00 1.35E+00 3.98ER01 2.75ER01 1.18ER01 7.70E+00 1.56E+00 3.31ER01 1.01ER01 8.36ER02 5.51ER02 1.28ER01 2.98ER02 Amount)(fmol) StdDev)(fmol) Sequence 0)sec)(113) )30)sec)(114) )1)min)(115) )2)min)(116) )3)min)(117) )5)min)(117) )10)min)(119) )30)min)(121) 0)sec)(113) )30)sec)(114) )1)min)(115) )2)min)(116) )3)min)(117) )5)min)(117) )10)min)(119) )30)min)(121) ySSDPTGALTEDSIDDTFLPVPEYINQSVPK 1.55E+01 5.37E+01 6.48E+01 1.34E+02 9.53E+01 6.20E+01 5.46E+01 1.52E+01 2.85E+00 6.73E+00 2.99E+00 3.06E+01 1.75E+00 4.03E+00 3.95E+00 1.24E+00 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK 3.78E+01 1.35E+02 1.17E+02 1.59E+02 1.10E+02 1.57E+02 1.33E+02 6.37E+01 1.57E+01 3.16E+00 1.23E+01 1.05E+01 1.82E+01 4.45E+00 2.92E+01 8.39E+00 GSHQISLDNPDyQQDFFPK 5.05E+01 3.03E+02 3.10E+02 4.57E+02 4.15E+02 4.23E+02 3.69E+02 1.55E+02 1.19E+01 2.49E+01 4.36E+01 4.19E+01 1.90E+01 1.96E+01 1.99E+01 1.59E+01 GSTAENAEyLR 1.33E+01 7.92E+01 7.25E+01 9.85E+01 1.03E+02 1.06E+02 6.74E+01 1.86E+01 4.39ER01 3.00E+00 5.34E+00 6.11E+00 5.01E+00 6.33E+00 4.06E+00 7.66ER01 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK 3.12E+01 1.52E+02 1.55E+02 2.57E+02 1.68E+02 1.28E+02 8.45E+01 2.14E+01 6.42E+00 6.58E+00 3.41E+01 3.65E+01 1.82E+01 2.72E+01 1.69E+01 4.15E+00 GSHQIsLDNPDyQQDFFPK 7.56E+01 3.83E+01 1.98E+01 3.68E+01 1.83E+01 2.65E+01 1.14E+01 4.35E+00 3.61E+01 1.49E+01 5.78E+00 9.42E+00 2.73E+00 4.82E+00 4.65E+00 4.67ER01 IADPEHDHTGFLTEyVATR 5.73E+01 4.45E+01 8.39E+01 2.20E+02 3.85E+02 5.98E+02 5.28E+02 4.41E+02 9.11E+00 6.31E+00 9.52E+00 1.84E+01 4.95E+01 4.41E+01 2.13E+01 6.25E+01 IADPEHDHTGFLtEyVATR 8.30E+00 5.23E+00 5.87E+00 1.04E+01 1.68E+01 2.74E+01 2.26E+01 1.65E+01 7.85ER01 1.10E+00 2.24E+00 3.38E+00 1.71E+00 5.37E+00 3.82ER01 1.69E+00 VADPDHDHTGFLTEyVATR 1.03E+02 8.64E+01 2.04E+02 5.33E+02 1.05E+03 1.60E+03 1.38E+03 1.02E+03 3.70E+00 3.92E+00 9.24E+00 4.15E+01 3.78E+01 4.85E+01 4.20E+01 5.54E+01 
VADPDHDHTGFLtEyVATR 1.58E+01 8.65E+00 1.30E+01 1.87E+01 7.04E+01 1.28E+02 1.02E+02 9.02E+01 9.67E+00 8.16E-01 4.24E+00 1.91E+00 1.15E+01 1.55E+01 3.14E+00 6.32E+00
ELFDDPSyVNVQNLDK 4.64E+01 9.71E+02 8.45E+02 1.46E+03 1.05E+03 1.20E+03 8.88E+02 4.26E+02 8.62E+00 6.40E+01 7.26E+01 2.56E+02 1.64E+02 1.11E+02 7.63E+01 4.11E+01
IQNTGDyYDLYGGEK 2.46E+01 2.35E+01 2.00E+01 2.54E+01 1.68E+01 2.17E+01 2.89E+01 2.19E+01 3.12E+00 1.15E+00 2.31E+00 2.59E+00 2.24E+00 1.70E+00 2.81E+00 2.93E+00
LIEDNEyTAR 2.29E+01 3.07E+01 2.99E+01 4.29E+01 3.57E+01 3.37E+01 1.80E+01 1.79E+01 1.17E+00 4.74E+00 1.43E+00 8.19E+00 3.25E+00 4.18E+00 1.32E+00 6.69E-01
Table 3-3: Comparison of MRM and PRM quantification for 13 tyrosine phosphorylated peptides. Absolute quantification values from Appendix Table 3-1 and Appendix Table 3-2 are combined here for easy comparison of the two mass spectrometric platforms. Columns D-K contain the mean amount of the peptide as measured by MRM or PRM; columns L-S contain calculated standard deviations from the technical replicate data in the respective tables. Finally, columns T-AA contain the corresponding coefficient of variation for each peptide, as calculated by standard deviation divided by the mean. Table continued on next page.
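As a small worked illustration of the summary statistics in Table 3-3, the sketch below computes the mean, standard deviation, and coefficient of variation from three technical replicates. The function name is hypothetical, and the use of the sample (n-1) standard deviation is an assumption; it is chosen because it reproduces the tabulated values for this peptide. The inputs are the rounded PRM amounts tabulated in Table 3-2 for EGFR pY1045 in the 113 channel.

```python
from statistics import mean, stdev


def summarize_replicates(amounts_fmol):
    """Return (mean, standard deviation, coefficient of variation)
    for a list of technical replicate amounts."""
    m = mean(amounts_fmol)
    sd = stdev(amounts_fmol)  # sample SD (n-1); an assumption, see lead-in
    return m, sd, sd / m


# Rounded PRM amounts (fmol) for EGFR pY1045, channel 113, runs R1-R3.
replicates = [7.71, 10.6, 11.9]
mean_fmol, sd_fmol, cv = summarize_replicates(replicates)
```

For these inputs the mean is about 10.1 fmol, the standard deviation about 2.14 fmol, and the CV about 0.21, in line with the tabulated 1.01E+01, 2.15E+00, and 2.14E-01 once rounding of the inputs is taken into account.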
131 Site 1045 1068 1148 1173 1045/1068 1142/1148 204 202/204 187 185/187 317 62 418 CV 01sec1(113) 1301sec1(114) 111min1(115) 121min1(116) 131min1(117) 151min1(117) 1101min1(119) 1301min1(121) 2.14EM01 1.26EM01 1.01EM01 1.24EM01 6.27EM02 1.18EM01 6.94EM02 1.14EM01 7.67EM02 1.82EM01 4.60EM02 5.59EM02 6.20EM02 1.12EM01 6.34EM02 1.15EM01 1.83EM01 6.13EM02 5.40EM02 4.49EM02 7.07EM02 2.71EM02 5.80EM02 3.47EM02 3.25EM01 6.98EM01 1.41EM01 1.32EM01 1.64EM01 1.55EM01 1.76EM01 1.22EM01 3.17EM01 1.26EM01 7.34EM02 9.10EM02 1.39EM01 7.95EM02 1.18EM01 1.32EM01 2.32EM01 1.25EM01 1.15EM01 1.55EM01 1.54EM01 2.13EM02 3.36EM01 6.42EM01 9.25EM02 1.09EM01 1.24EM01 1.88EM02 3.34EM02 2.68EM02 3.49EM02 7.35EM02 2.30EM01 2.92EM01 1.03EM01 1.09EM01 6.13EM02 6.95EM02 5.89EM02 6.63EM02 1.30EM01 1.79EM01 1.29EM01 6.97EM02 3.89EM02 5.10EM02 5.72EM02 9.28EM02 1.54EM01 3.10EM01 6.43EM02 4.23EM02 4.52EM02 3.51EM02 6.07EM02 2.45EM02 1.83EM01 6.58EM02 3.01EM02 2.80EM02 2.01EM02 3.13EM02 2.31EM02 1.05EM01 6.10EM02 1.46EM01 1.17EM01 1.80EM01 8.54EM02 1.88EM01 1.83EM01 1.41EM01 8.29EM02 4.08EM02 4.47EM02 3.53EM02 6.20EM02 1.39EM01 4.65EM01 2.52EM01 CV 01sec1(113) 1301sec1(114) 111min1(115) 121min1(116) 131min1(117) 151min1(117) 1101min1(119) 1301min1(121) 1.84EM01 1.25EM01 4.61EM02 2.28EM01 1.84EM02 6.50EM02 7.24EM02 8.14EM02 4.17EM01 2.35EM02 1.05EM01 6.59EM02 1.65EM01 2.83EM02 2.20EM01 1.32EM01 2.35EM01 8.22EM02 1.41EM01 9.16EM02 4.58EM02 4.64EM02 5.37EM02 1.03EM01 3.30EM02 3.79EM02 7.37EM02 6.20EM02 4.87EM02 5.96EM02 6.02EM02 4.13EM02 2.06EM01 4.34EM02 2.19EM01 1.42EM01 1.08EM01 2.13EM01 2.01EM01 1.94EM01 4.78EM01 3.88EM01 2.92EM01 2.56EM01 1.50EM01 1.82EM01 4.08EM01 1.07EM01 1.59EM01 1.42EM01 1.13EM01 8.34EM02 1.29EM01 7.39EM02 4.03EM02 1.42EM01 9.46EM02 2.09EM01 3.81EM01 3.27EM01 1.02EM01 1.96EM01 1.69EM02 1.02EM01 3.58EM02 4.53EM02 4.53EM02 7.80EM02 3.60EM02 3.04EM02 3.04EM02 5.41EM02 6.13EM01 9.44EM02 3.26EM01 1.02EM01 1.63EM01 1.21EM01 3.09EM02 7.01EM02 1.86EM01 6.59EM02 8.59EM02 1.75EM01 
1.56EM01 9.24EM02 8.60EM02 9.64EM02 1.27EM01 4.88EM02 1.16EM01 1.02EM01 1.33EM01 7.84EM02 9.73EM02 1.34EM01 5.12EM02 1.54EM01 4.77EM02 1.91EM01 9.10EM02 1.24EM01 7.34EM02 3.74EM02 MRM Protein EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK2 ERK2 Shc Shp2 Src PRM Site Sequence 1045 ySSDPTGALTEDSIDDTFLPVPEYINQSVPK 1068 YSSDPTGALTEDSIDDTFLPVPEyINQSVPK 1148 GSHQISLDNPDyQQDFFPK 1173 GSTAENAEyLR 1045,11068 ySSDPTGALTEDSIDDTFLPVPEyINQSVPK 1142,11148 GSHQIsLDNPDyQQDFFPK 204 IADPEHDHTGFLTEyVATR 202,1204 IADPEHDHTGFLtEyVATR 187 VADPDHDHTGFLTEyVATR 185,1187 VADPDHDHTGFLtEyVATR 317 ELFDDPSyVNVQNLDK 62 IQNTGDyYDLYGGEK 418 LIEDNEyTAR Sequence ySSDPTGALTEDSIDDTFLPVPEYINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQISLDNPDyQQDFFPK GSTAENAEyLR ySSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQIsLDNPDyQQDFFPK IADPEHDHTGFLTEyVATR IADPEHDHTGFLtEyVATR VADPDHDHTGFLTEyVATR VADPDHDHTGFLtEyVATR ELFDDPSyVNVQNLDK IQNTGDyYDLYGGEK LIEDNEyTAR Protein EGFR EGFR EGFR EGFR EGFR EGFR ERK1 ERK1 ERK2 ERK2 Shc Shp2 Src 132 Protein EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR Site 1045 1045 1045 1045 1045 1045 1045 1045 1045 1068 1068 1068 1068 1068 1068 1068 1068 1068 1148 1148 1148 1148 1148 1148 1148 1148 1148 1173 1173 1173 1173 1173 1173 1173 1173 1173 Sequence ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK 
GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR Run TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1Q4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1Q4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1Q4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1Q4 1.65 1.65 1.65 1.65 4.2185 4.2185 4.2185 1.65 1650 1.65 1650 4218.5 4.2185 4218.5 1.2946 1294.6 1.2946 1294.6 1650 1.65 1650 1.65 1285 1.285 1285 1.285 22464330.4 405339.84 31953061.4 60758.5568 1152088.04 60782802.9 915936.343 3105999.74 31988.7518 2892876.27 7252.3719 283240.689 26490254.3 330465.468 1.2946 1.2946 1.2946 1.2946 1.65 1.65 1.65 1.65 1.285 1.285 1.285 1586230.04 540101.454 2355083.42 75803.2042 306171.327 3781062.7 224775.961 240101.17 70002.2192 257964.225 25370.9232 592955.125 1782047.6 616660.437 2078197.98 16623.823 2109259.21 8780.8869 36013.4135 2461297.79 24373.5242 1.285 361623.11 3201285.31 599237.386 483084.345 2028782.52 711848.114 1545129.73 75063.4294 368897.128 84694.6675 115117.267 3798736.53 504810.12 3718268.33 181517.614 56211.4739 193887.003 19768.6516 75616.4282 191106.393 52151.4385 24935.485 73783.9356 404404.545 71140.2585 31109240.4 144575.651 4169049.66 27822309.9 278003.421 22833581.5 17801.0882 2813968.15 28893.7979 895075.422 34552290.8 117485.42 33839545.9 57705.3387 254331.422 76422.9384 80957.9877 422733.041 66758.1158 334295.231 6589826.3 387698.472 936008.924 859785.58 5411.0798 463156.302 1088050.9 256045.716 88255.557 137703.777 70433.5006 124446.646 386200.326 1438820.99 1544939.78 8439.8324 1120075.07 2012536.23 303642.879 85996.5019 164473.725 111574.901 3349.7328 778650.031 592986.114 518026.033 13585.1209 1943578.19 
16171.6744 748871.186 3247435.33 21750.9756 3159991.04 1611514.11 1542548.18 1850207.45 934742.224 42268843.3 2823966.17 430266.16 1799344.88 2013990.43 2113633.95 2602489.5 1310842.89 218722.477 655669.423 4027663.94 60320821.1 30135.7616 9210.3221 11046.2875 14032.5401 5213290.05 1009423.64 21848.139 220876.265 400774.058 1198508.86 1229227.1 15658.4164 971612.58 2023703.02 579138.706 330102.83 1228138.53 1270574.07 76005.778 1125089.38 2327172.92 316532.167 82849.3843 213448.013 197912.572 1187.6487 1353992.99 920842.677 591211.303 20942.1133 13890.3321 22478.9089 153.1698 127742.722 85541.2816 88019.0604 4.2185 4218.5 4.2185 4218.5 1285 1.285 1285 82101.9474 157396.218 135354.13 1032.4934 1377923.03 891269.564 988889.838 57761.4405 135905.874 128739.531 5152.1284 1506375.58 1008633.94 649898.567 20241.9845 41931.7371 40834.3373 Q61.9151 187296.029 137645.685 86457.6259 4.2185 4.2185 4.2185 4.2185 1.285 1.285 1.285 28759.5877 61808.5016 55470.8385 1376.4434 170006.982 111717.691 102929.821 19008.5293 72021.1728 59656.0461 2398.6719 165740.555 126192.911 76911.7599 128.5 12.85 128.5 12.85 42.185 421.85 42.185 421.85 128.5 12.85 128.5 16.5 165 16.5 165 421.85 42.185 421.85 12.946 129.46 12.946 129.46 165 16.5 165 16.5 Endogenous)iTRAQ)Integrated)Intensity Standard)iTRAQ)Integrated)Intensity Standard)Amount 114 115 116 117 114 115 116 117 AAA)Multiple 114 115 66580.6501 61353.6538 37742.9585 19082.5742 1629552.24 184678.584 48031.3687 9210.4672 1.2946 1294.6 129.46 70478.9727 79230.9432 38445.7845 27398.4954 20349.0189 65520.9974 176406.851 2087415.73 1.2946 1.2946 12.946 15402.5749 37933.0061 22469.8616 9686.7124 2120167.23 267327.631 50846.3534 13907.3872 1.2946 1294.6 129.46 1.2946 1.2946 12.946 12.85 128.5 12.85 128.5 421.85 42.185 421.85 42.185 12.85 128.5 12.85 165 16.5 165 16.5 42.185 421.85 42.185 129.46 12.946 129.46 12.946 16.5 165 16.5 165 116 12.946 129.46 12.946 129.46 1.285 1285 1.285 1285 4218.5 4.2185 4218.5 4.2185 1.285 1285 1.285 1650 1.65 1650 
1.65 4.2185 4218.5 4.2185 1294.6 1.2946 1294.6 1.2946 1.65 1650 1.65 1650 117 1.2946 1294.6 1.2946 1294.6 12.093 7.1347 12.593 0.22236 50.5747 58.1257 39.0445 45.5766 74.9692 59.1986 4.779 79.3376 103.1597 53.954 75.8023 110.3609 115.8088 4.5273 23.9989 27.2594 17.2672 28.9826 34.123 31.4931 6.1323 43.9087 97.2924 105.007 1.6944 137.96 160.1957 61.5765 75.5094 169.6454 208.0956 7.0614 58.0378 50.4209 20.4771 16.4034 67.5422 62.525 4.8273 58.8287 48.3863 13.1695 20.554 64.5413 144.8047 171.14 63.5925 58.2976 58.3036 21.3463 30.6125 61.9476 68.3057 7.3505 153.4864 175.4684 67.6889 56.311 50.7802 4.0249 32.07 78.3588 141.3112 165.5707 13.1011 50.3451 50.7007 39.056 43.5125 71.7432 71.8152 1.473 140.3982 155.051 102.9959 16.6072 31.7476 31.0757 1.9983 67.3075 75.9127 45.6588 11.6887 21.538 22.8761 -0.089907 74.1525 93.5309 38.3518 10.9765 36.9932 33.4203 3.4824 65.6184 85.7486 34.1174 117 12.1669 16.3904 4.5719 Calculated Endogenous Amount 114 115 116 42.4513 39.1186 24.0646 42.1623 47.3979 22.9992 7.2696 17.9033 10.6052
Table 3-4: Ligand-Specific EGFR Absolute Quantification. Integrated intensity for each iTRAQ reporter generated from endogenous transitions for key EGFR C-terminal phosphorylation sites (columns E-H). Integrated iTRAQ reporter ion intensity from standard peptide transitions (columns I-L). Standard peptide concentration was determined by amino acid analysis (AAA) and a correction factor was applied to the nominal doses included (column M). Actual standard dose included in each channel following AAA correction (columns N-Q). The standard curve built from the standard amount (columns N-Q) and the standard intensities (columns I-L) was used to calculate the endogenous amount of the corresponding peptide in each sample (columns R-U) based on the endogenous intensities (columns E-H).
Prior to MS analysis, the number of cells contained in each sample was estimated with a BCA assay; stock lysate solutions of known cell number were used as standards to calibrate the unknown samples (columns V-Y). The total amount of endogenous peptide (columns R-U) was converted to a number of molecules and divided by the corresponding cell number (columns V-Y) to produce molecules/cell (columns Z-AC). The mean and range of two biological replicates are reported (columns AG-AK) for each ligand. To mitigate contamination from standard peptide iTRAQ signals, several precautions were taken. First, standard doses and iTRAQ channels were varied between replicates. Second, four time-zero replicates were analyzed separately from other samples, because this condition was anticipated to have low signal. In addition, the biological replicate in the channel containing the 1 pmol standard dose was excluded, because significant residual signal from the standard peptide was visible. Table continued on next page.
Protein EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR EGFR
Site 1045 1045 1045 1045 1045 1045 1045 1045 1045 1068 1068 1068 1068 1068 1068 1068 1068 1068 1148 1148 1148 1148 1148 1148 1148 1148 1148 1173 1173 1173 1173 1173 1173 1173 1173 1173
Sequence ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK ySSDPTGALTEDSIDDTFLPVPEYINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK YSSDPTGALTEDSIDDTFLPVPEyINQSVPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK
GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSHQISLDNPDyQQDFFPK GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR GSTAENAEyLR Run TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1V4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1V4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1V4 TGFa20_2F TGFa20_2R AREG20_2F AREG20_2R AREG20_2R_reip AREG20_3R EGF20_2F EGF20_2R T0_R1V4 Cell$# 114 2430607.13 1941190.6 2768161.54 1913155.23 1913155.23 2047901.77 2242008.58 2377156.53 1751948.56 2430607.13 1941190.6 2768161.54 1913155.23 1913155.23 2047901.77 2242008.58 2377156.53 1751948.56 2430607.13 1941190.6 2768161.54 1913155.23 1913155.23 2047901.77 2242008.58 2377156.53 1751948.56 2430607.13 1941190.6 2768161.54 1913155.23 1913155.23 2047901.77 2242008.58 2377156.53 1751948.56 115 2558364.79 2351055.99 2817372.54 2275446.07 2275446.07 2210061.58 2539886.44 2695250.91 1815076.03 2558364.79 2351055.99 2817372.54 2275446.07 2275446.07 2210061.58 2539886.44 2695250.91 1815076.03 2558364.79 2351055.99 2817372.54 2275446.07 2275446.07 2210061.58 2539886.44 2695250.91 1815076.03 2558364.79 2351055.99 2817372.54 2275446.07 2275446.07 2210061.58 2539886.44 2695250.91 1815076.03 116 2832764.7 2698966.46 2487789.83 2658963.53 2658963.53 2435380.15 2602290.09 2644552.07 1706013.03 2832764.7 2698966.46 2487789.83 2658963.53 2658963.53 2435380.15 2602290.09 2644552.07 1706013.03 2832764.7 2698966.46 2487789.83 2658963.53 2658963.53 2435380.15 2602290.09 2644552.07 1706013.03 2832764.7 2698966.46 2487789.83 2658963.53 2658963.53 2435380.15 2602290.09 2644552.07 1706013.03 117 2768050.32 2604745.79 2380362.89 2348436.39 2348436.39 2059384.46 2623107.06 2578791.28 1751444.42 2768050.32 2604745.79 2380362.89 2348436.39 2348436.39 2059384.46 2623107.06 
2578791.28 1751444.42 2768050.32 2604745.79 2380362.89 2348436.39 2348436.39 2059384.46 2623107.06 2578791.28 1751444.42 2768050.32 2604745.79 2380362.89 2348436.39 2348436.39 2059384.46 2623107.06 2578791.28 1751444.42 21344.2006 33493.3645 36981.1808 4345.1966 11846.532 12982.1755 8345.26201 11852.396 17004.4635 16040.3435 488.544824 33036.6165 39701.6074 22007.573 4523.64518 7524.7676 6940.93871 662.769261 15837.8958 19437.8379 9756.11043 3535.03008 1637.4053 2939.74393 76.4287569 10999.0664 13433.8144 9874.45615 13322.968 17205.3437 13819.481 1642.62021 17254.4678 23841.9195 13645.1077 22158.5554 25327.6974 27034.719 1556.10681 5219.31906 6300.09994 4366.92003 3713.7955 7429.42211 18665.1184 39244.8679 47370.4234 2491.75284 12333.8008 11246.2983 4955.08665 10853.7624 22507.1083 23903.5619 597.902117 29318.3264 35731.3856 14900.3957 2889.32199 4982.48679 5207.46494 V31.72544 15758.3879 20861.9124 9280.43972 Molecules/cell 114 115 116 117 10514.1149 9204.86294 5114.04608 2646.07682 13075.3284 12136.4765 5129.93348 3788.09358 1580.94069 3825.47443 2566.26597 1156.24547 3226.64548 9933.01567 8463.48161 1196.61321 16252.0205 26592.2661 7419.60848 8998.83249 16633.5024 17297.9906 2525.75965 38014.7049 54416.0767 14720.4985 18972.5226 38881.3986 43340.1329 21851.4891 14438.8432 18081.0515 4642.24087 6467.59227 8484.55176 15796.049 13346.7471 15624.8546 7831.18857 12253.5274 11342.0536 14233.0531 7351.83431 4525.26927 1334.92468 1703.40704 2107.77148 Mean Range Mean Range Mean Range 2797.68542 1394.54489 2797.68542 1394.54489 2797.68542 1394.54489 909.689049 577.037692 909.689049 577.037692 909.689049 577.037692 Final$Quantification$(molecules/cell) 0 TGFα 235.824193 347.24735 AREG 235.824193 347.24735 235.824193 347.24735 16259.9473 1821.10413 4642.24087 912.675703 14024.7882 1771.26081 46215.3908 8200.68588 16846.5106 2126.01203 41110.7657 2229.36714 21422.1433 5170.12284 8209.22049 789.612005 16965.7465 332.244099 1 11794.7216 1280.60675 2403.79308 822.852398 
9198.24864 734.767031 12414.3537 567.821746 8414.90689 69.6448773 12344.4004 1002.34674 36369.112 3332.49543 21675.8868 331.686214 35237.2727 1743.90814 17637.8669 1799.97105 10804.2532 1048.14278 16522.4035 482.060041 5 10670.6697 1465.80678 4174.5598 349.085375 7232.85315 291.914447 11790.0495 543.751286 4334.44108 620.645574 14928.9539 695.900782 32524.856 3206.52963 16782.7571 1882.36135 43307.6456 4062.77774 18310.1502 2551.76223 10067.101 786.661327 23205.3351 698.226793 10 5121.98978 7.94370005 2727.79398 161.528007 5094.97587 112.489076 5759.7095 540.390439 5898.17107 1531.25104 7591.51144 239.67713 20548.1936 3293.72582 17901.8316 4256.72386 26181.2082 853.5108 12216.4404 1217.37395 11598.7121 1724.25592 15512.4124 1692.93131 30 3217.0852 571.008376 2345.63777 1189.39231 2288.57461 651.169313 TGFα AREG EGF TGFα AREG EGF TGFα AREG Mean Range Mean Range Mean Range Mean Range Mean Range Mean Range 1715.36773 386.423397 1715.36773 386.423397 1715.36773 386.423397 EGF Mean Range Mean Range Mean Range EGF 134 3.7 Bibliography 1. Mann, M., Kulak, N. A., Nagaraj, N. & Cox, J. The Coming Age of Complete, Accurate, and Ubiquitous Proteomes. Molecular Cell 49, 583–590 (2013). 2. Liang, S. et al. Quantitative proteomics for cancer biomarker discovery. Comb. Chem. High Throughput Screen. 15, 221–231 (2012). 3. Wasinger, V. C., Zeng, M. & Yau, Y. Current Status and Advances in Quantitative Proteomic Mass Spectrometry. International Journal of Proteomics 2013, 1–12 (2013). 4. Zhang, X., Gureasko, J., Shen, K., Cole, P. A. & Kuriyan, J. An Allosteric Mechanism for Activation of the Kinase Domain of Epidermal Growth Factor Receptor. Cell 125, 1137–1149 (2006). 5. Naegle, K. M., Welsch, R. E., Yaffe, M. B., White, F. M. & Lauffenburger, D. A. MCAM: Multiple Clustering Analysis Methodology for Deriving Hypotheses and Insights from High-Throughput Proteomic Datasets. PLoS Comput Biol 7, e1002119 (2011). 6. Wolf-Yadlin, A. et al. 
Effects of HER2 overexpression on cell signaling networks governing proliferation and migration. Mol Syst Biol 2, (2006).
7. Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. U.S.A. 100, 6940–6945 (2003).
8. Zhang, Y. & White, F. M. Quantitative Proteomic Analysis of Phosphotyrosine-Mediated Cellular Signaling Networks. Methods in Molecular Biology 1–10 (2007).
9. Voldborg, B. R., Damstrup, L., Spang-Thomsen, M. & Poulsen, H. S. Epidermal growth factor receptor (EGFR) and EGFR mutations, function and possible role in clinical trials. 1–10 (1997).
10. Lee, H. C. et al. Activation of Epidermal Growth Factor Receptor and Its Downstream Signaling Pathway by Nitric Oxide in Response to Ionizing Radiation. Mol. Cancer Res. 6, 996–1002 (2008).
11. Nitta, M. et al. Targeting EGFR Induced Oxidative Stress by PARP1 Inhibition in Glioblastoma Therapy. PLoS ONE 5, e10767 (2010).
12. Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation. Pharmacol. Res. 66, 105–143 (2012).
13. Oksvold, M. P. et al. Serine mutations that abrogate ligand-induced ubiquitination and internalization of the EGF receptor do not affect c-Cbl association with the receptor. Oncogene 22, 8509–8518 (2003).
14. Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting of the Receptor. Traffic 10, 1115–1127 (2009).
15. Chung, E., Graves-Deal, R., Franklin, J. L. & Coffey, R. J. Differential effects of amphiregulin and TGF-α on the morphology of MDCK cells. Experimental Cell Research 309, 149–160 (2005).
16. Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
17. Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579 (2010).
18. Guo, L. et al.
Studies of ligand-induced site-specific phosphorylation of epidermal growth factor receptor. J. Am. Soc. Mass Spectrom. 14, 1022–1031 (2003).
19. Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 439, 168–174 (2005).
20. Tinti, M. et al. The SH2 domain interaction landscape. Cell Rep 3, 1293–1305 (2013).
21. Arneja, A., Johnson, H., Gabrovsek, L., Lauffenburger, D. A. & White, F. M. Qualitatively Different T Cell Phenotypic Responses to IL-2 versus IL-15 Are Unified by Identical Dependences on Receptor Signal Strength and Duration. The Journal of Immunology 192, 123–135 (2013).

4 EGFR Signaling Pathway Downstream of Ligands

4.1 Introduction

EGFR is of particular interest to cancer researchers because it sits atop a regulatory network that ultimately controls numerous phenotypes often co-opted in cancer. With such influence over key cellular phenotypes, it is not surprising that many cancers (notably breast and lung) harbor overexpression or mutation of EGFR or one of its cognate family members (typically HER2)¹,². EGFR is the canonical member of the receptor tyrosine kinase (RTK) class, but the EGFR family has since grown to include three other family members (HER2-4), which collectively recognize a variety of small protein ligands³. Each of these ligands binds its cognate receptor in a similar fashion, yet each induces a different phenotypic response from cells⁴,⁵. In the case of HER-family members, ligand binding stabilizes an “open” configuration of the extracellular domain, facilitating dimerization mediated by interactions between the participating receptors⁶,⁷. Close proximity of two kinase domains permits transphosphorylation of key tyrosine residues⁶,⁷, which completes binding motifs for proteins containing SH2- or PTB-domains⁵,⁸. The first class of proteins known to bind phosphorylated EGFR-family members has no intrinsic enzymatic activity.
Instead, these proteins consist of various binding domains (SH2, PTB, WW, PH, etc.) as well as their own sites of phosphorylation, allowing them to serve as adaptors that provide indirect receptor binding⁹⁻¹³. The second class of proteins is intracellular kinases, which are also recruited through binding domains. Once recruited into the proximity of their target proteins, they catalyze their own set of phosphorylation events. The third class is intracellular phosphatases, which are responsible for reversing phosphorylation⁶. Together, the balance and abundance of these recruited proteins help to decode and propagate signals that will ultimately lead to phenotypic changes in the cell. The exact mechanism for this decoding is not yet known. One hypothesis is that ligands elicit structural changes that affect the phosphorylation efficiency of particular sites, leading to a bias in recruited proteins. Some evidence exists for ligand-specific structural differences in receptor dimers; Scheck et al. report alterations in the juxtamembrane portion of EGFR dimers between TGFα- and EGF-treated conditions¹⁴. In addition, Alvarado et al. show that ligand affinity biases the Drosophila EGFR cognate towards singly- or doubly-bound receptor dimers, altering its structure¹⁵, with the implicit assumption that these differences will contribute to altered downstream signaling. In that work, the authors also mention that human EGFR dimer asymmetries have not been observed¹⁵.

Site-specific phosphorylation is a natural modulation mechanism to explain the differences in information flow across the cellular membrane. Previous studies have shown ligand-specific changes in phosphorylation following stimulation, which were used to explain differential recruitment of SH2- and PTB-domains⁴. Roepstorff et al. argue that increased Y1045 phosphorylation drives increased c-Cbl recruitment following EGF stimulation, leading to elevated rates of receptor degradation.
At a molecular level, the parameters available for modulation are the rate of phosphorylation, the affinity for the completed motif, and protein abundance. At low motif abundance, high affinity/high abundance adaptor proteins are more capable of binding. It is not until the motif abundance increases that the low affinity/low abundance binding proteins will be able to dock in appreciable quantities. This dynamic will play out at each docking site on the receptor as well as at secondary sites on recruited proteins.

To better understand how these signals are conducted in the context of EGFR signaling, we have chosen to focus on ligands known to bind specifically to EGFR and not to other HER-family members. In this case, receptor phosphorylation will be limited to EGFR, allowing levels to be modulated solely by ligand dose and identity. This leads to two hypotheses about patterns of receptor phosphorylation. The first is a pattern modulation scheme, where the ligand identity and dose information is encoded in a specific pattern of phosphorylation levels at sites on the receptor. This “barcode” is then read by the appropriate subset of binding proteins, initiating the proper downstream programs. The second is an amplitude modulation scheme, where the relative level of specific phosphorylation sites is roughly constant, but the overall magnitude is scaled to convey information. Under this scheme a level of control is lost, as all sites move together.

In order to decide between these mechanisms and make quantitative predictions about downstream phenotypes, we have employed site-specific absolute quantification [cite: MARQUIS] to investigate phosphorylation levels at specific sites on EGFR, a collection of adaptor proteins, as well as a selection of downstream activation signals including the ERKs.

4.2 Methods

4.2.1 Migration Assay: Cells were serum starved for 24 hours prior to trypsinization and rinsing with SFM.
50µl of SFM containing the desired inhibitor concentration was added to the upper reservoir of the migration chamber. 1e5 cells in 50µl of SFM were added to the upper reservoir of the migration chambers. 100µl of SFM containing the desired dose of inhibitor was added to the bottom reservoir. A further 100µl of Assay Media (SFM + 2x Insulin to 0.4x serum)¹⁶ supplemented with the desired dose of ligand was added to the lower reservoir of each migration chamber prior to a 24-hour incubation. Lower reservoir media was removed and replaced with 95% ethanol for 10 minutes. Upper reservoir media was aspirated halfway through the ethanol fixation. Membranes were washed with PBS, stained with crystal violet for 15 minutes, and washed with MilliQ water. Residual cell material in the upper chamber was gently removed with lint-free paper and membranes were allowed to dry. The intensity of each membrane was quantified with ImageJ.

4.2.2 Proliferation Assay: 1e4 cells in 100µl were plated into each well of a 96-well plate. After 24 hours, they were serum-starved with 100µl of SFM for an additional 24 hours. Media was exchanged for SFM containing BRDU and the desired concentration of ligand, and cells were allowed to incubate for 24 hours. BRDU incorporation was quantified following manufacturer instructions (Roche, 11647229001).

4.2.3 Mass Spectrometry

Mass Spectrometry Sample Preparation: MCF10A cells were serum-starved for 24 hours prior to stimulation with the indicated dose of EGFR ligand (purchased from Peprotech) for 1, 5, 10, 30, 120 or 480 minutes. Media was aspirated and cells were lysed with 1ml of cold 8M urea with 1mM activated sodium orthovanadate. Lysate was frozen at -80 °C until further processing. Protein yield was quantified by BCA assay (Pierce) in order to ensure equal loading. Two technical replicates (250µl) from each of two biological replicates were prepared. Standard peptides were added to samples as in Supplementary Table XX.
Samples were reduced with 10µl 10mM DTT in 100 mM ammonium acetate pH 8.9 for one hour at 56 °C. Samples were alkylated with 75µl of 55mM iodoacetamide in 100 mM ammonium acetate pH 8.9 for 1 hour at room temperature. 1mL of 100 mM ammonium acetate and 10µg of sequencing grade trypsin were added prior to a 16-hour incubation at room temperature. Lysates were acidified with 125µl of trifluoroacetic acid (TFA) and protein was purified with C18 spin columns (ProteaBio, SP-150). Samples were lyophilized and subsequently labeled with iTRAQ 8plex (AbSciex) per manufacturer’s directions.

iTRAQ Channel   Sample                              Standards (nominal)
113             Condition 1, BioRep 1, TechRep 1    3 pmol
114             Condition 2, BioRep 1, TechRep 1    1 pmol
115             Condition 1, BioRep 2, TechRep 1    300 fmol
116             Condition 2, BioRep 2, TechRep 1    100 fmol
117             Condition 1, BioRep 1, TechRep 2    30 fmol
118             Condition 2, BioRep 1, TechRep 2    10 fmol
119             Condition 1, BioRep 2, TechRep 2    3 fmol

Table 4-1: iTRAQ labeling scheme. Two biological and two technical replicates of each sample were included in each analysis along with the indicated nominal doses of heavy-labeled standard peptide.

Standard Peptide Preparation and Quality Assurance: Standard peptides were synthesized at the Koch Institute Biopolymers facility and amino acid analysis (AAA) was performed to measure concentration by AAA Service Laboratories (Damascus, OR). Synthetic peptides were analyzed by LC MS/MS to confirm sequence and purity.

Immunoprecipitation: 70µl protein G agarose beads (Calbiochem IP08) were rinsed in 400µl IP Buffer (100mM Tris, 0.3% NP-40, pH 7.4) prior to an 8-hour incubation with a cocktail of three phosphotyrosine-specific antibodies (12µg 4G10 (Millipore), 12µg PT66 (Sigma), and 12µg PY100 (CST)) in 200ml IP Buffer. Antibody cocktail was removed and beads were rinsed with 400µl of IP Buffer.
Labeled samples were resuspended in 150µl iTRAQ IP Buffer (100mM Tris, 1% NP-40, pH 7.4) + 300µl MilliQ water and pH was adjusted to 7.4 (with 0.5M Tris HCl pH 8.5) prior to addition to prepared beads and a 16-hour incubation. Supernatant was removed and beads were rinsed 3 times with 400µl Rinse Buffer (100mM Tris HCl, pH 7.4). Peptides were eluted in 70µl of Elution Buffer (100mM glycine, pH 2) for 30 minutes at room temperature.

Immobilized Metal Affinity Chromatography (IMAC) Purification: A fused silica capillary (FSC) column (200µm ID x 10cm length) was packed with POROS 20MC beads (Applied Biosystems cat.no. 1-5429-06). The column was rinsed with 100 mM EDTA pH 8.9 in ultrapure water to remove residual iron and then rinsed with ultrapure water to remove EDTA before fresh iron was deposited by rinsing with 100 mM FeCl3 in ultrapure water. Excess iron was removed with 0.1% acetic acid prior to loading the IP elution. The column was washed with 25% MeCN, 1% HOAc, 100 mM NaCl in ultrapure water to remove non-specifically bound peptides and then rinsed with 0.1% HOAc. Peptides were eluted with 50µl 250mM NaH2PO4 in ultrapure water and collected in an autosampler vial. Eluent was acidified with 2µl of 10% TFA prior to loading.

Liquid Chromatography Mass Spectrometry: Acidified eluent was loaded onto an Acclaim PepMap 100 precolumn (Thermo Scientific) using an EASY-nLC 1000 (Thermo Scientific). Peptides were analyzed on a one-hour gradient from 100% A (0.1% Formic Acid) to 100% B (0.1% Formic Acid, 80% Acetonitrile), spraying through a 50cm analytical column (Manufacturer, PN) packed with 3µm beads (Manufacturer, PN).

Absolute Quantification (Full Scan): Analyses were performed on a Thermo QExactive. Endogenous and heavy-labeled versions of each peptide were targeted for MS2 analysis during their expected elution times.
In the MS1 scan, the total amount of endogenous peptide was calculated from the ratio of its integrated MS1 intensity to that of the standard peptide. This amount was then apportioned across the contributing samples based on relative iTRAQ intensity. A single-point measurement was used: the MS2 scan that determined the iTRAQ intensity ratios was chosen at maximal MS1 intensity to ensure the maximal signal-to-noise ratio.

4.2.3.1 Partial Least Squares Regression (PLSR)

PLSR was performed using the plsregress function provided in the MATLAB software package.

4.3 Results

4.3.1 MCF10A 24-hour Phenotypes

To understand the different proliferation-inducing capacities of three EGFR ligands (TGFα, AREG, and EGF), MCF10A cells were subjected to 24-hour BRDU incorporation (Figure 4-1a). A wide range of ligand doses was used to explore the full range of responses possible for the system, but saturation was observed for all ligands at 5nM. At doses below this threshold, EGF shows a slight increase over TGFα, but AREG lags significantly behind. Moving forward, we selected 1nM as a sub-saturating ligand dose and 20nM as a super-saturating ligand dose.

MCF10A cells were subjected to a transwell migration assay in the presence of the sub-saturating and super-saturating concentrations of each EGFR ligand (Figure 4-1b). Surprisingly, the responses were more varied than the proliferation data would suggest. First, all three ligands were able to meet or exceed the measured migration of the positive control (growth media), which was not the case for proliferation. TGFα showed identical migratory capacity at both the low and high dose, and AREG only induced significant migration at the high dose. Interestingly, EGF induced a greater migratory signal at the low dose than at the high dose, suggesting a saturation-dependent downregulation of signaling events not present in the responses to other ligands.
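The full-scan absolute quantification described in Section 4.2.3 reduces to a short chain of arithmetic, sketched below in Python as an illustration rather than as the analysis code used for this work. The function names, the channel labels, and every number in the usage lines are hypothetical. The sketch also assumes that the amount of heavy standard contributing to the MS1 peak is supplied as a single known quantity in moles; how that quantity is assembled from the per-channel doses is not addressed here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def total_endogenous_mol(standard_mol, ms1_area_endogenous, ms1_area_standard):
    """Scale the known standard amount by the endogenous/standard MS1 area ratio."""
    return standard_mol * ms1_area_endogenous / ms1_area_standard


def apportion_by_itraq(total_mol, itraq_intensities):
    """Split the pooled amount across channels in proportion to iTRAQ intensity."""
    intensity_sum = sum(itraq_intensities.values())
    return {
        channel: total_mol * intensity / intensity_sum
        for channel, intensity in itraq_intensities.items()
    }


def molecules_per_cell(amount_mol, cell_count):
    """Convert moles of peptide in one sample to molecules per cell."""
    return amount_mol * AVOGADRO / cell_count


def quantify(standard_mol, ms1_area_endogenous, ms1_area_standard,
             itraq_intensities, cell_counts):
    """Chain the three steps; returns molecules/cell for each iTRAQ channel."""
    total = total_endogenous_mol(standard_mol, ms1_area_endogenous, ms1_area_standard)
    per_channel = apportion_by_itraq(total, itraq_intensities)
    return {
        channel: molecules_per_cell(amount, cell_counts[channel])
        for channel, amount in per_channel.items()
    }


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the arithmetic.
    result = quantify(
        standard_mol=1e-12,
        ms1_area_endogenous=5.0e6,
        ms1_area_standard=1.0e7,
        itraq_intensities={"114": 1.0, "115": 1.0, "116": 2.0},
        cell_counts={"114": 1.0e6, "115": 1.0e6, "116": 1.0e6},
    )
    for channel, value in sorted(result.items()):
        print(channel, round(value))
```

Keeping the three steps as separate functions mirrors the order in which the quantities are derived (MS1 ratio, iTRAQ apportioning, cell-number normalization), so each intermediate can be inspected on its own.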
Figure 4-1: Ligand-Specific Phenotypic Response. a) 24-hour BRDU incorporation. b) 24-hour transwell migration. [Panel a: BRDU signal for TGFα, AREG, and EGF at ligand doses from 0.05 nM to 500 nM, alongside SFM and CM conditions. Panel b: transwell migration for TGFα, AREG, and EGF at 1nM and 20nM, alongside AM and CM conditions.]

4.3.2 LC-MS Absolute Quantification

EGFR-ligand treatment induces tyrosine phosphorylation of the receptor and many downstream proteins. Absolute site-specific phosphorylation was measured at sites on EGFR, key adaptor proteins such as Gab1 and Shc, as well as downstream network nodes such as the ERKs. These data were gathered using the MARQUIS method (introduced in Chapter 3) with heavy-labeled synthetic peptide standards produced for each phosphorylation site of interest (Figure 4-2).

Figure 4-2: Full-Scan MARQUIS method for absolute quantification. Eight samples representing unique cellular stimulation conditions are lysed and a heavy-labeled peptide library is included prior to any sample processing. Samples are reduced, alkylated, and trypsin digested prior to C18 peptide purification. Samples are iTRAQ labeled, subjected to phosphotyrosine immunoprecipitation, IMAC enrichment, and targeted LC/MSMS. [Workflow diagram: Samples 1-8; lyse cells and add heavy-labeled phosphopeptide library; reduce, alkylate, digest proteins and purify peptides; iTRAQ label, combine; phosphotyrosine immunoprecipitation and IMAC enrichment; targeted LC/MSMS analysis.]

4.3.3 EGFR Pattern of Phosphorylation

Previous observations indicated an interesting phenomenon regarding the C-terminal phosphorylation sites of EGFR: the relative phosphorylation levels between the sites did not appreciably change between ligands upon treatment with super-saturating concentrations.
These analyses were repeated with the full-scan MARQUIS method for EGFR phosphorylation sites Tyr-1045, Tyr-1068, Tyr-1148, Tyr-1173, as well as the additional Tyr-1045/Tyr-1068 dual phosphorylated peptide (Figure 4-3a). Confirming previous observations, the pattern present at five minutes post ligand treatment (Figure 4-3b) was maintained for all time points up to 30 minutes. In this study, time points at two hours and eight hours post-treatment were included to capture longer dynamics of the system. The ranked order of phosphorylation intensity remained constant throughout all time points measured (Appendix Figure 4-7). These data suggest that at saturating doses the identity of the ligand plays a smaller role in the pattern of phosphorylation. Instead, the affinity of the ligand for the receptor serves to scale the magnitude of the signal response.

Figure 4-3: Ligand-Specific EGFR Phosphorylation. a) 20nM ligand treatment for five minutes. b) Median normalized 20nM ligand treatment for five minutes. c) 1nM ligand treatment for five or 30 minutes. [Panels a and c: molecules/cell (x10^3); panel b: median normalized molecules/cell. Series: EGFR pY1045, pY1068, pY1148, pY1173, and pY1045pY1068 for TGFα, AREG, and EGF.]

In contrast, treatment with a sub-saturating (1nM) ligand dose induced significant pattern variation (Figure 4-3c). For a five-minute treatment, the phosphorylation pattern closely mirrored that of the super-saturating ligand dose. However, at 30 minutes post-stimulation a very different pattern emerged for all ligands. Tyr-1148 was no longer the most abundant site of phosphorylation, as Tyr-1068 met or exceeded it for all three ligands.
This alteration in phosphorylation pattern indicates differential kinase and/or phosphatase activity at the sub-saturating dose and implies differential recruitment of adaptor proteins.

4.3.4 Adaptor Protein Patterns of Phosphorylation

At a super-saturating dose of ligand, differential phenotypes were still observed (Figure 4-1), but those differences were not readily apparent in the phosphorylation pattern on the receptor itself. Even if the receptor is presenting nearly identical patterns, the magnitude of those patterns appears to have a significant effect on the downstream signaling network. To test this hypothesis, site-specific phosphorylation on adaptor proteins implicated in EGFR binding and signal transduction cascades was measured. The patterns of two such adaptor proteins, Shc (Figure 4-4a) and Gab1 (Figure 4-4c), indicate that similar patterns of EGFR phosphorylation following high dose stimulation can produce different patterns of adaptor protein phosphorylation. In these cases, AREG presents a lower abundance version of the EGFR pattern and elicits a very different stoichiometry of adaptor protein phosphorylation.

At a sub-saturating dose, the EGFR phosphorylation pattern is not constant. Not surprisingly, these differences also correspond to altered phosphorylation patterns on Shc (Figure 4-4b) and Gab1 (Figure 4-4d). What is more interesting is the comparison between the patterns elicited by the super-saturating and sub-saturating doses. In the case of Shc, the super-saturating and sub-saturating dose patterns are quite similar, with all signals being cut roughly in half. Gab1, on the other hand, has almost all of its phosphorylation sites decreased by 5-10 fold. Perhaps this is due to a higher affinity of the Shc SH2-domains for EGFR phosphorylation sites. Under this scenario, Shc would bind more efficiently even when the concentration of EGFR phosphorylated motifs is lower.
Only when EGFR phosphorylation levels are high enough do motifs become available for a lower affinity binder.

Figure 4-4: Ligand-Specific Adaptor Phosphorylation. Site-specific Shc phosphorylation following 20nM (a) or 1nM (b) ligand treatment for ten minutes. Site-specific Gab1 phosphorylation following 20nM (c) or 1nM (d) ligand treatment for ten minutes. [All panels: molecules/cell (x10^3) for TGFα, AREG, and EGF. Shc series: pY240, pY239pY240, pY317. Gab1 series: pY259, pY373, pY637, pY659.]

4.3.5 PLSR Predicts TGFα Phenotypic Responses from AREG and EGF Data

Using the phosphorylation data as inputs and the phenotypic response data as outputs, we constructed a PLSR model from AREG and EGF treatment data that was then used to predict the phenotypic responses of cells treated with TGFα. A one-component PLSR model was able to explain 66% of the output variance in the AREG and EGF data (Figure 4-5a). This model was then fed the phosphorylation data of TGFα and asked to predict the phenotypic output (Figure 4-5b). Despite the distinct phenotypic behavior profile of TGFα, this simple model was able to predict outputs with accuracy comparable to that of the training set data.

[Panel a: percent variance explained in Y and total squared prediction error versus number of PLSR components. Panel b: predicted versus actual values; series: Migration, Proliferation, Fit Migration, Fit Proliferation.]

Figure 4-5: Partial Least Squares Regression Predicts TGFα-induced Migration and Proliferation.
a) A one-component PLSR model is able to explain 66% of the observed variance (blue) and minimize the observed prediction error (red). b) Predicted vs. actual values for training set (gray) and test set (blue, red).

Analogous models to predict EGF and AREG phenotypic responses were constructed with varying degrees of success (Appendix Figure 4-8). A model for EGF phenotypic response was comparably effective at producing predictions; a one-component model explained 68% of the output variation and produced errors in line with those of the TGFα predictions. The difference came when the AREG prediction model was examined. The one-component model no longer provided the optimal prediction error, and the magnitude of that error was 3-4 fold higher than for other ligands. Upon closer inspection, the low dose AREG migration prediction deviates strongly from the measured value and drives the larger observed error (Appendix Figure 4-8b,d). Including this point in the model building stage helps to explore a different corner of the output space and increase the model fidelity in that region. When it is excluded, the resulting model has trouble extrapolating that far beyond the cases in the training set.

4.3.6 PLSR to Model Phenotypes Predicts Stronger ERK2 pY187 Contribution to Proliferation

A PLSR model built from the complete data set (Figure 4-6a) shows that ERK2 pTyr-187 was ranked more strongly in proliferation than in migration (Appendix Figure 4-9a). This site was also one of the most abundant phosphorylation sites observed in the absolute quantification data set (Appendix Figure 4-9b). It is worth noting that other sites, such as IRS2 Tyr-675 and Shc Tyr-239/240, were also strongly ranked in both of these categories. The ERK2 Tyr-187 site was chosen because a very clean tool compound (U0126) exists that specifically inhibits the upstream kinase (MEK) responsible for phosphorylating this tyrosine residue.
MEK is known to specifically phosphorylate ERK1/2¹⁷,¹⁸, making treatment with U0126 a very directed modification to the cellular state. Figure 4-6b shows the decrease in ERK phosphorylation when EGF stimulation follows pretreatment with U0126. Doses as low as 1µM show significant decreases in the levels of ERK phosphorylation. This suggests that U0126 would have a more pronounced effect on proliferation than on migration. Figure 4-6c shows the proliferation in response to EGF treatment in the presence of increasing doses of U0126. The proliferative activity of the cells drops rapidly and is nearly gone by 5 µM. In contrast, Figure 4-6d shows the transwell migration under the same conditions. Migration levels remain strong well beyond 5 µM U0126 and only disappear closer to 10 µM.

Figure 4-6: Proliferation is more sensitive to U0126 treatment than migration. a) Two-component PLSR model fit to complete data set. b) MAPK phosphorylation following 15 minute pre-treatment with indicated dose of U0126 and five minute treatment with 20nM EGF. c) 24-hour BRDU incorporation: IC90 ~5µM. d) 24-hour transwell migration: IC90 ~9µM. [Panel a: predicted versus actual values for Migration and Proliferation. Panel b: MAPK pTpY and Actin at U0126 doses of 100nM, 1µM, 2µM, 5µM, and 10µM with 20nM EGF. Panels c and d: BRDU signal and transwell migration versus U0126 dose (untreated, 2 to 10 µM) for TGFα, AREG, and EGF at 20nM.]

4.4 Conclusion

The method utilized by EGFR to transmit signals about extracellular ligand concentration to intracellular signaling components has been only partially explored.
In particular, presentation of ligand induces phosphorylation on key intracellular residues that form docking sites for a host of adaptor proteins that perpetuate the signal and help to decide which signaling pathways are activated. This leads to two candidate mechanisms by which EGFR uses its distinct phosphorylation sites to encode information that ultimately drives measurable phenotypic outputs.

Pattern modulation of the intracellular phosphorylation sites was a natural hypothesis, because it explains how stoichiometry between completed binding motifs would instantiate differential downstream signaling. At low doses of ligand stimulation, different patterns of phosphorylation are observed. These patterns change over time and are ligand-dependent. EGF and TGFα are more similar to each other than they are to AREG. Affinity differences alone are not sufficient to explain these differences; otherwise one might expect a scaling phenomenon that could be overcome by increasing the AREG dose. Instead, a structural mechanism is required to explain how the receptor is able to distinguish the identity of the ligand. Scheck et al. have demonstrated such a mechanism between EGF and TGFα, where juxtamembrane helices position themselves differently based on the identity of the ligand¹⁴.

The idea of ligand-specific phosphorylation patterns plays out nicely in the low-ligand regime. However, when the ligand dose is increased, a similar pattern emerges for all three ligands. What differs is the amplitude of this pattern, and the identity of the ligand appears less important except to determine affinity. Hence, the pattern is more strongly presented with EGF and TGFα, as these ligands have a higher affinity⁵,⁹. The high ligand regime aligns nicely with a structural hypothesis based on negative cooperativity between ligand-binding sites in an EGFR homodimer.
With a lower-affinity ligand such as AREG, fewer of the receptor dimers would be doubly bound with ligand [15], producing less efficient signaling.

Pattern modulation at the receptor level with low ligand doses leads very naturally to differential adaptor protein binding and subsequent phosphorylation. Yet amplitude modulation at the receptor level induced by high ligand doses is still able to produce distinct patterns of phosphorylation at the adaptor protein level. Coupled with the differential binding affinity of adaptor protein SH2 domains for phosphorylated EGFR that has been reported [19], the hypothesis emerges that amplitude modulation of an identical pattern may still be able to drive the observed pattern modulations of adaptor proteins. A higher-affinity SH2 domain will be more strongly bound at a lower dose, making it more capable of localizing to the receptor. This competitive binding explanation provides another axis of control in the system.

Decoding the information stored in site-specific phosphorylation is the next big challenge. In this work, we have shown that testable predictions can be made from these data sets. Specifically, a key site (ERK2 pY187) was predicted to contribute more strongly to proliferation than to migration, which was confirmed using a MEK-specific inhibitor to reduce its phosphorylation. Such conclusions are just the beginning, as site-specific absolute phosphorylation data open the door to new data-driven and mechanistic modeling approaches.
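The affinity argument above can be illustrated with a simple equilibrium calculation. The sketch below is not taken from this work: it assumes independent 1:1 binding with no depletion of phosphosites, so that the fraction of an adaptor bound is [pY] / (Kd + [pY]), and the two dissociation constants and the site levels are hypothetical values in arbitrary units.

```python
def fraction_bound(site_level, kd):
    """Equilibrium fraction of an adaptor bound to a phosphosite (1:1 binding)."""
    return site_level / (kd + site_level)


# Hypothetical dissociation constants (arbitrary concentration units).
KD_STRONG = 1.0   # high-affinity SH2 domain
KD_WEAK = 20.0    # low-affinity SH2 domain


def recruitment_ratio(site_level):
    """Ratio of strong-adaptor to weak-adaptor occupancy at one site level."""
    return fraction_bound(site_level, KD_STRONG) / fraction_bound(site_level, KD_WEAK)


# Scale the same phosphorylation pattern to three hypothetical amplitudes.
for amplitude in (0.5, 5.0, 50.0):
    strong = fraction_bound(amplitude, KD_STRONG)
    weak = fraction_bound(amplitude, KD_WEAK)
    print(f"site level {amplitude:5.1f}: strong {strong:.2f}, "
          f"weak {weak:.2f}, ratio {recruitment_ratio(amplitude):.2f}")
```

Under these assumptions the ratio of strong to weak adaptor occupancy falls toward one as the site level rises, so changing only the amplitude of receptor phosphorylation changes the relative pattern of adaptor recruitment.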
4.5 Appendix Figures

[Figure 4-7, panels a-c: EGFR phosphorylation in molecules/cell (x10^3) versus time (0, 1, 5, 10, 30, 120, and 480 min) for EGFR pY1045, pY1068, pY1148, pY1173, and pY1045pY1068.]

Figure 4-7: Similar EGFR Phosphorylation Pattern at High Ligand Dose. Absolute amount of EGFR phosphorylation at indicated time points (minutes) following stimulation. a) TGFα 20nM. b) AREG 20nM. c) EGF 20nM.

[Figure 4-8, panels a-d: a, b) Predicted versus actual migration and proliferation, showing fit and predicted points. c, d) Percent variance explained in Y and total squared prediction error versus number of PLSR components (0 to 3).]

Figure 4-8: Phenotypic prediction PLSR models. One-component PLSR models were built to simultaneously predict observed migration and proliferation from phosphotyrosine data. a) Model built from AREG and TGFα used to predict EGF. b) Model built from EGF and TGFα used to predict AREG. c) Explained variance and prediction error for EGF model. d) Explained variance and prediction error for EGF model.
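The hold-one-ligand-out scheme in Figure 4-8 can be sketched as follows. This is an illustrative simplification, not the code used for the figure: it fits a one-component PLS model for a single response (one NIPALS step), whereas the models above predict migration and proliferation simultaneously. The data values, the `data` dictionary, and the function names are hypothetical.

```python
def fit_one_component_pls(X, y):
    """Fit a one-component PLS model for a single response (one NIPALS step)."""
    n_rows, n_cols = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]
    y_mean = sum(y) / n_rows
    Xc = [[row[j] - x_means[j] for j in range(n_cols)] for row in X]
    yc = [value - y_mean for value in y]
    # Weight vector: the direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n_rows)) for j in range(n_cols)]
    norm = sum(value * value for value in w) ** 0.5
    if norm == 0.0:
        raise ValueError("X carries no covariance with y")
    w = [value / norm for value in w]
    # Scores along that direction, then regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(n_cols)) for i in range(n_rows)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, w, q


def predict(model, X):
    """Predict the response for new rows from a fitted one-component model."""
    x_means, y_mean, w, q = model
    return [
        y_mean + q * sum((row[j] - x_means[j]) * w[j] for j in range(len(w)))
        for row in X
    ]


# Hypothetical data: rows are conditions, columns are phosphosite levels,
# and y is a single phenotype readout. None of these values are measurements.
data = {
    "AREG": ([[1.0, 2.0, 1.0], [2.0, 4.0, 1.0]], [1.0, 2.0]),
    "TGFa": ([[3.0, 6.0, 1.0], [4.0, 8.0, 1.0]], [3.0, 4.0]),
    "EGF": ([[5.0, 10.0, 1.0]], [5.0]),
}


def hold_out(data, ligand):
    """Train on every ligand except `ligand`, then predict the held-out one."""
    train_X, train_y = [], []
    for name, (X, y) in data.items():
        if name != ligand:
            train_X.extend(X)
            train_y.extend(y)
    model = fit_one_component_pls(train_X, train_y)
    test_X, test_y = data[ligand]
    return predict(model, test_X), test_y


predicted, actual = hold_out(data, "EGF")
print(predicted, actual)
```

Comparing `predicted` with `actual` for each held-out ligand gives the kind of predicted-versus-actual comparison plotted in panels a and b.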
[Figure 4-9, panels a-b: bar charts over the measured phosphorylation sites on EGFR, ERK1, ERK2, Gab1, HER2, IRS2, Shp2, Src, and Shc. a) Proliferation rank bias. b) Total phosphorylation (nmol).]

Figure 4-9: ERK2 pY187 Contributes More Strongly to Predicting Proliferation than Migration. a) Proliferation rank bias for each phosphorylation site in the complete PLSR model. b) Total amount of phosphorylation observed across all measured conditions.

4.6 Appendix Tables

Sites (Mean and Range for each, in this order; some rows report fewer sites): EGFR pY845, EGFR pY998m, EGFR pY1045, EGFR pY1045pY1068, EGFR pY1068, EGFR pY1086, EGFR pY1148, EGFR pS1142pY1148, EGFR pY1173, ERK1 pY204.
Ligand Dose(nM) Time(min.) Mean Range ...
Unt. 0 0 1.02E+06 4.13E+04 2.13E+02 4.72E+01 3.70E+02 2.28E+02 4.37E+02 1.60E+02 4.33E+02 5.45E+01 1.33E+03 3.74E+02 1.06E+03 3.69E+02 1.37E+03 5.38E+02 9.79E+02 1.28E+02 4.45E+03 1.23E+03
AREG 1 1 6.45E+05 3.68E+04 7.96E+02 9.18E+01 1.93E+03 2.75E+02 8.69E+03 2.18E+01 8.47E+03 3.02E+03 8.92E+03 5.90E+02 4.53E+03 2.04E+03 4.63E+03 1.17E+02 3.03E+02 4.00E+01
AREG 1 5 1.31E+06 1.53E+05 8.14E+02 3.67E+02 5.76E+02 1.37E+01 2.40E+03 3.04E+02 3.54E+03 3.56E+02 6.80E+03 8.78E+02 1.25E+04 1.05E+03 3.55E+03 1.75E+03 4.56E+03 2.81E+02 3.40E+04 6.49E+03
AREG 1 10 2.32E+05 1.40E+03 8.92E+02 6.50E+01 1.61E+02 9.93E+00 9.30E+02 4.36E+01 6.99E+03 6.43E+02 7.75E+02 1.52E+01 8.77E+03 6.54E+01 1.36E+03 8.29E+01 2.04E+03 3.11E+02 3.12E+04 1.85E+03
AREG 1 30 2.29E+06 2.91E+05 3.67E+02 1.49E+02 1.69E+03 4.68E+02 5.78E+02 6.12E+01 1.54E+03 3.09E+02 5.69E+03 4.58E+02 2.22E+03 5.77E+02 1.83E+03 4.03E+02 1.22E+04 7.81E+02
AREG 1 120 2.52E+06 3.09E+05 1.22E+02 3.23E+01 8.27E+02 1.93E+02 9.44E+02 2.44E+02 1.89E+03 1.93E+02 2.43E+02 9.54E+00 3.47E+03 7.05E+00 9.86E+02 2.67E+02 1.93E+03 8.86E+01 4.94E+03 2.68E+03
AREG 1 480 1.27E+06 3.76E+04 6.64E+02 8.81E+01 4.75E+02 3.87E+01 4.55E+02 4.61E+01 3.47E+03 2.61E+01 8.20E+03 4.54E+02 3.04E+02 1.90E+02 1.36E+03 1.79E+01 1.17E+04 1.34E+02
AREG 20 1 1.07E+06 9.70E+04 7.44E+02 1.01E+02 2.32E+03 1.35E+02 4.93E+03 7.31E+01 6.33E+03 1.33E+03 2.09E+04 5.47E+03 3.29E+03 5.35E+02 5.61E+03 3.21E+02 2.60E+02 3.84E+00
AREG 20 5 1.56E+06 2.18E+05 4.49E+02 1.14E+02 1.40E+03 8.17E+01 5.58E+03 2.10E+03 9.66E+03 1.42E+03 9.62E+03 1.11E+03 3.37E+04 7.06E+03 4.16E+03 1.92E+03 4.25E+03 3.05E+02 2.18E+04 2.16E+03
AREG 20 10 3.91E+05 7.31E+02 1.51E+03 1.22E+02 2.52E+02 1.27E+01 2.22E+03 8.23E+00 1.14E+04 1.12E+03 1.15E+03 2.26E+01 1.62E+04 1.13E+02 2.04E+03 1.84E+01 3.47E+03 5.54E+02 4.60E+04 6.86E+02
AREG 20 30 2.99E+06 5.19E+05 3.17E+02 2.25E+01 1.45E+03 4.65E+02 1.12E+03 3.60E+02 2.62E+03 7.84E+02 6.91E+03 3.93E+02 1.78E+03 2.99E+02 3.18E+03 7.08E+02 1.25E+04 9.71E+02
AREG 20 120 2.93E+06 5.86E+05 1.27E+02 2.22E+01 8.69E+02 2.17E+02 1.87E+03 5.40E+02 3.03E+03 4.70E+02 2.94E+02 1.03E+01 7.12E+03 1.38E+03 1.13E+03 4.45E+01 2.15E+03 1.17E+02 4.42E+03 2.31E+03
AREG 20 480 1.54E+06 3.84E+04 5.05E+02 8.79E+01 5.30E+02 5.45E+01 6.04E+02 7.68E+00 4.56E+03 9.70E+00 7.50E+03 4.43E+02 2.03E+02 6.71E+01 1.02E+03 8.18E+00 1.14E+04 4.08E+01
EGF 1 1 3.75E+05 1.18E+05 1.76E+02 4.73E+01 9.90E+02 2.13E+02 7.78E+03 9.87E+02 5.97E+03 2.48E+02 9.91E+02 1.66E+02 1.92E+03 4.73E+02 6.62E+03 1.58E+03
EGF 1 5 6.67E+05 3.02E+04 7.12E+02 6.36E+01 6.83E+02 8.05E+01 1.73E+03 5.17E+02 2.41E+03 6.04E+02 1.32E+03 1.09E+02 8.70E+03 1.07E+03 1.50E+03 2.79E+02 2.94E+03 1.29E+02 3.06E+04 4.58E+03
EGF 1 10 1.84E+06 3.85E+05 1.19E+02 1.51E+01 1.21E+03 5.68E+02 2.09E+03 7.74E+01 3.26E+03 1.29E+01 2.11E+03 1.75E+02 1.37E+04 1.07E+03 2.61E+03 8.76E+01 2.27E+03 6.42E+00 3.01E+04 1.13E+03
EGF 1 30 8.94E+05 2.42E+05 4.88E+02 7.15E+00 4.58E+03 1.51E+03 1.87E+03 1.76E+02 7.21E+03 1.69E+03 4.26E+03 1.79E+01 2.31E+04 1.25E+03 3.43E+02 7.99E+01 3.77E+03 2.14E+02 2.29E+04 3.13E+03
EGF 1 120 2.66E+05 5.34E+03 1.17E+03 4.24E+01 3.33E+02 6.84E+01 1.66E+03 7.28E+01 4.57E+03 5.89E+01 2.36E+03 7.06E+01 6.93E+03 6.87E+02 2.80E+02 7.43E+00 2.13E+03 2.53E+01 8.26E+03 3.39E+02
EGF 1 480 1.16E+06 1.56E+04 7.95E+01 7.41E+00 6.51E+02 6.96E+01 3.42E+03 1.93E+02 4.20E+03 9.90E+02 2.67E+03 3.17E+02 1.10E+04 5.36E+02 5.98E+02 2.40E+01 1.37E+03 1.21E+02 1.38E+04 9.88E+02
EGF 20 1 3.35E+05 1.25E+05 2.83E+03 9.63E+01 3.32E+03 3.68E+02 7.24E+03 1.19E+03 1.14E+04 1.10E+03 3.85E+04 1.81E+03 3.19E+04 4.01E+03 3.17E+03 1.38E+02 6.33E+03 1.21E+03 6.35E+03 1.86E+02
EGF 20 5 4.80E+05 2.23E+05 1.64E+03 9.38E+01 4.56E+03 7.02E+02 1.22E+04 8.50E+02 7.43E+03 1.70E+02 6.65E+03 9.50E+01 5.07E+04 3.25E+03 1.52E+03 8.38E+01 1.04E+04 4.56E+02 3.69E+04 1.79E+03
EGF 20 10 7.51E+05 2.66E+05 1.67E+03 4.94E+02 4.23E+02 3.67E+01 4.74E+03 1.02E+03 7.89E+03 2.54E+03 1.25E+03 3.39E+02 3.62E+04 1.77E+03 6.65E+03 1.11E+03 7.20E+03 2.74E+02 2.92E+04 6.41E+02
EGF 20 30 2.92E+05 1.31E+05 3.12E+02 1.60E+02 5.88E+02 1.32E+02 2.81E+03 5.72E+02 5.04E+03 2.43E+03 3.67E+03 1.20E+03 2.47E+04 1.15E+03 2.08E+03 1.20E+03 7.53E+03 1.56E+02 1.03E+04 1.54E+03
EGF 20 120 8.08E+04 1.23E+04 5.26E+02 1.69E+02 2.99E+02 8.42E+01 1.23E+03 2.20E+01 1.65E+03 3.32E+02 1.17E+04 2.18E+03 1.36E+03 1.54E+02 2.03E+03 2.96E+02 4.78E+03 8.96E+02
EGF 20 480 2.01E+05 4.03E+03 2.92E+02 2.46E+01 4.42E+01 3.78E+00 3.86E+03 1.08E+02 3.24E+03 5.10E+01 7.04E+03 2.15E+02 1.02E+04 3.09E+02 1.95E+03 8.49E+00 1.05E+04 5.44E+02
TGFa 1 1 1.01E+06 3.61E+04 5.52E+02 1.95E+02 1.42E+03 1.27E+02 1.55E+04 6.33E+03 9.31E+03 5.56E+01 1.83E+03 2.36E+02 2.66E+03 8.07E+02 1.26E+04 2.61E+03
TGFa 1 5 1.29E+06 6.72E+04 1.20E+03 7.37E+01 2.88E+03 5.56E+02 1.77E+03 5.64E+02 3.80E+03 6.54E+02 1.69E+03 1.56E+02 1.46E+04 1.43E+03 2.56E+03 4.36E+02 3.12E+03 5.95E+01 4.84E+04 6.39E+03
TGFa 1 10 3.42E+06 5.90E+05 2.34E+02 1.21E+02 2.33E+03 1.07E+03 1.39E+03 9.88E+01 2.46E+03 4.02E+01 2.35E+03 1.69E+02 1.62E+04 5.34E+03 3.80E+03 1.24E+03 2.54E+03 2.22E+01 3.00E+04 5.15E+03
TGFa 1 30 7.83E+05 2.68E+05 5.81E+02 2.83E+00 2.39E+03 4.87E+02 1.20E+03 4.42E+01 8.72E+03 1.78E+03 4.85E+03 4.00E+01 1.52E+04 5.18E+02 4.55E+02 5.75E+01 3.35E+03 2.87E+02 3.27E+04 4.85E+03
TGFa 1 120 3.30E+05 3.15E+03 8.95E+02 3.14E+01 5.28E+02 6.60E+01 1.38E+03 3.76E+01 4.68E+03 2.92E+01 2.33E+03 1.29E+01 6.72E+03 1.91E+02 3.80E+02 9.18E+01 1.64E+03 9.41E-01 7.85E+03 5.79E+01
TGFa 1 480 8.81E+05 9.01E+02 1.38E+02 1.12E+01 7.86E+02 9.71E+01 3.82E+03 4.85E+02 1.31E+04 2.52E+03 2.45E+03 2.19E+02 1.27E+04 6.25E+02 9.18E+02 1.33E+02 1.34E+03 1.29E+02 1.73E+04 1.04E+03
TGFa 20 1 6.20E+05 1.35E+05 2.95E+03 1.01E+02 3.67E+03 4.82E+02 7.51E+03 1.22E+03 1.05E+04 6.21E+02 3.74E+04 1.62E+03 3.33E+04 4.53E+03 3.80E+03 9.48E+01 5.36E+03 1.20E+03 6.48E+03 6.29E+01
TGFa 20 5 8.44E+05 3.63E+05 1.61E+03 6.66E+01 3.94E+03 3.70E+02 1.27E+04 3.14E+02 7.53E+03 2.72E+02 8.00E+03 6.25E+02 5.18E+04 2.15E+02 1.94E+03 4.48E+02 9.11E+03 8.53E+01 3.90E+04 2.44E+02
TGFa 20 10 9.73E+05 3.72E+05 2.38E+03 6.28E+02 4.95E+02 3.41E+01 7.18E+03 3.50E+02 8.58E+03 3.33E+03 1.10E+03 2.65E+02 2.72E+04 5.03E+03 1.11E+04 3.61E+03 4.89E+03 2.41E+02 3.65E+04 1.20E+03
TGFa 20 30 2.31E+06 1.25E+06 1.88E+03 1.06E+03 1.91E+03 1.18E+03 4.59E+03 1.25E+03 9.58E+03 4.25E+03 1.23E+04 4.65E+03 2.68E+04 3.22E+03 1.09E+04 7.66E+03 7.68E+03 2.31E+02 2.16E+04 4.99E+03
TGFa 20 120 6.64E+04 2.03E+03 7.64E+02 1.97E+02 4.59E+02 9.44E+01 1.44E+03 1.45E+02 2.78E+03 1.10E+02 1.08E+04 8.47E+02 1.73E+03 2.49E+02 2.63E+03 5.89E+01 7.77E+03 7.98E+02
TGFa 20 480 2.06E+05 1.66E+02 2.67E+02 1.71E+01 4.34E+01 4.38E+00 4.44E+03 1.44E+02 2.98E+03 1.72E+01 7.76E+03 2.34E+02 1.13E+04 1.19E+02 1.45E+03 8.19E+00 9.78E+03 1.42E+02

Table 4-2: Site Specific Absolute Quantification Data. Table continued on next two pages.

Table 4-2, continued. Sites (Mean and Range for each, in this order; some rows report fewer sites): ERK1 pT202pY204, ERK2 pY187, ERK2 pT185pY187, Gab1 pY259, Gab1 pY373m, Gab1 pY637, Gab1 pY659, HER2 pY877, HER2 pY1023, HER2 pY1248. Rows run from untreated (0 nM, 0 min) through AREG, EGF, and TGFa, each at 1 nM and then 20 nM, at 1, 5, 10, 30, 120, and 480 min.
Ligand Dose(nM) Time(min.) Mean Range ...
Unt. 0 0 2.19E+03 1.81E+02 1.07E+03 2.03E+02 1.36E+03 2.68E+02 7.12E+02 8.05E+01 2.48E+03 2.71E+02 7.02E+02 1.79E+02 1.62E+04 3.41E+03 2.21E+02 1.29E+02 1.60E+04 6.61E+03 2.75E+03 5.37E+02
AREG 1 1 5.62E+02 1.82E+02 1.09E+03 1.96E+01 5.72E+02 8.19E+01 2.90E+03 9.74E+02 8.67E+02 2.76E+01 1.15E+03 2.23E+02 1.31E+04 1.78E+03 7.30E+02 2.38E+02 2.48E+04 4.30E+03 6.62E+03 6.78E+02
AREG 1 5 8.36E+04 3.02E+03 2.25E+03 4.65E+02 1.73E+03 2.19E+02 1.25E+03 9.76E+01 3.49E+03 3.40E+02 2.01E+04 5.46E+03 3.94E+02 8.21E+01 9.56E+03 2.20E+02 6.31E+03 1.78E+03
AREG 1 10 6.24E+02 5.41E+01 1.68E+05 3.94E+03 2.21E+03 1.48E+01 2.29E+03 1.46E+02 4.79E+02 5.10E+01 2.18E+03 1.58E+02 6.36E+04 4.57E+03 2.16E+02 1.10E+01 8.98E+03 4.17E+02 1.91E+03 7.66E+01
AREG 1 30 1.30E+03 1.12E+02 2.27E+04 2.10E+03 1.86E+03 9.05E+01 4.94E+02 7.91E+01 1.45E+03 2.21E+02 3.08E+03 1.95E+02 1.13E+04 1.00E+02 7.41E+02 1.32E+02 2.04E+04 5.60E+03 3.79E+03 1.16E+03
AREG 1 120 1.57E+03 2.64E+02 5.40E+03 4.61E+01 3.09E+03 2.52E+02 8.51E+02 2.32E+02 1.26E+03 7.49E+01 5.76E+02 1.33E+02 1.43E+04 1.21E+03 6.72E+02 8.27E+01 1.48E+04 6.36E+03 3.66E+03 6.17E+02
AREG 1 480 1.05E+03 1.88E+01 4.06E+04 5.25E+02 1.96E+03 2.13E+01 4.36E+03 5.77E+02 3.44E+03 2.25E+00 1.98E+03 1.41E+02 9.22E+03 2.54E+02 5.93E+02 5.61E+01 1.41E+04 2.53E+03 3.92E+03 5.31E+02
AREG 20 1 4.67E+02 6.54E+01 1.14E+03 6.08E+00 5.17E+02 5.27E+01 2.86E+03 3.09E+02 9.61E+02 1.38E+01 2.49E+03 1.92E+02 1.40E+04 1.19E+03 5.89E+02 1.45E+02 1.45E+04 2.97E+03 5.94E+03 7.09E+02
AREG 20 5 8.50E+04 5.53E+03 1.70E+03 2.17E+02 1.74E+03 3.28E+02 1.16E+03 4.38E+01 3.26E+03 1.23E+03 1.63E+04 3.87E+03 2.92E+02 7.78E+01 8.95E+03 1.96E+02 4.22E+03 1.32E+02
AREG 20 10 9.78E+02 6.85E+01 1.88E+05 8.75E+03 2.66E+03 4.59E+01 3.62E+03 1.96E+02 7.53E+02 8.52E+01 4.42E+03 1.79E+02 1.21E+05 9.52E+03 3.06E+02 3.27E+01 3.48E+04 7.63E+02 2.72E+03 3.83E+01
AREG 20 30 1.06E+03 2.46E+02 2.22E+04 2.43E+03 1.53E+03 5.86E+01 4.36E+02 9.79E+01 1.39E+03 1.53E+02 2.32E+03 4.81E+02 1.19E+04 2.30E+03 3.36E+02 6.22E+01 2.16E+04 7.01E+03 4.00E+03 2.27E+02
AREG 20 120 2.51E+03 4.90E+02 7.21E+03 3.48E+02 3.49E+03 3.47E+02 8.45E+02 1.86E+02 1.26E+03 5.45E+01 1.60E+03 3.21E+02 1.30E+04 4.88E+02 7.80E+02 5.56E+01 2.02E+04 9.82E+03 4.78E+03 5.99E+02
AREG 20 480 1.19E+03 2.84E+01 4.10E+04 7.30E+02 1.50E+03 8.69E+01 3.09E+03 1.77E+02 3.17E+03 4.95E+01 1.44E+03 1.40E+01 7.56E+03 9.26E+01 3.77E+02 1.97E+01 1.34E+04 1.81E+03 4.65E+03 1.48E+02
EGF 1 1 5.62E+03 9.02E+02 2.28E+03 2.08E+02 2.41E+03 5.58E+02 8.39E+04 1.13E+04 9.67E+02 2.76E+00 1.39E+03 2.14E+02 1.17E+05 2.44E+04 2.92E+03 7.18E+02
EGF 1 5 1.80E+03 2.91E+02 7.59E+04 8.78E+03 2.99E+03 6.50E+02 6.43E+04 1.29E+04 2.59E+03 2.07E+02 3.08E+03 5.54E+02 7.52E+04 2.78E+04 1.72E+03 4.29E+02 4.14E+03 4.14E+03 3.95E+03 8.08E+02
EGF 1 10 1.59E+03 2.03E+02 6.66E+04 1.20E+03 2.46E+03 4.05E+01 9.95E+03 1.46E+03 1.25E+03 2.71E+01 2.90E+03 1.75E+02 7.50E+03 1.28E+02 8.61E+02 5.73E+01 1.16E+04 8.31E+02 3.08E+03 2.91E+02
EGF 1 30 4.47E+02 6.43E+01 1.68E+05 2.80E+03 3.29E+03 1.99E+02 5.26E+03 2.44E+03 4.68E+03 1.60E+02 1.78E+03 3.66E+02 2.61E+04 1.80E+03 9.22E+02 2.17E+02 2.44E+03 1.07E+02
EGF 1 120 2.59E+02 2.35E-01 2.22E+04 1.88E+03 1.93E+03 4.83E+01 2.31E+04 8.31E+02 4.72E+02 1.55E+01 1.01E+03 2.55E+01 2.45E+04 8.25E+02 1.87E+02 1.47E+00 2.44E+03 1.01E+02
EGF 1 480 7.58E+02 1.02E+01 5.67E+04 3.05E+03 1.78E+03 1.41E+02 1.93E+03 9.04E+01 2.79E+03 7.91E+01 2.67E+04 4.23E+02 3.09E+02 1.12E+01 1.66E+04 6.61E+02 2.91E+03 5.22E+02
EGF 20 1 1.41E+03 1.03E+02 4.76E+04 2.05E+03 2.61E+03 7.33E+01 1.14E+04 1.45E+02 3.19E+03 3.81E+02 4.68E+03 6.33E+01 3.19E+04 2.06E+02 1.38E+02 1.15E+01 3.75E+03 6.02E+02 3.90E+03 3.22E+02
EGF 20 5 1.76E+04 1.22E+03 7.33E+04 8.42E+02 5.65E+03 2.62E+02 1.07E+04 3.95E+02 1.06E+03 1.95E+01 3.39E+03 2.55E+02 2.51E+04 2.17E+03 3.84E+02 2.22E+00 2.49E+03 1.38E+02 8.59E+03 5.62E+02
EGF 20 10 1.72E+04 1.60E+03 1.30E+05 1.78E+03 4.68E+03 1.45E+02 6.62E+04 3.94E+02 1.46E+03 1.07E+02 2.17E+03 2.70E+02 5.04E+04 6.49E+03 6.85E+01 6.79E+00 1.01E+04 1.32E+03 3.85E+03 2.12E+01
EGF 20 30 2.23E+03 4.50E+02 4.69E+04 3.55E+03 3.77E+03 1.28E+03 1.02E+04 4.43E+03 7.80E+02 4.31E+00 1.44E+03 1.91E+01 2.57E+04 5.76E+02 1.93E+02 1.12E+02 1.37E+04 3.84E+03 1.85E+03 1.04E+03
EGF 20 120 1.50E+03 3.78E+02 1.17E+04 5.81E+02 1.07E+03 1.93E+02 3.63E+04 1.09E+04 3.40E+03 1.03E+03 3.55E+04 5.38E+03 8.94E+01 1.41E+01 1.69E+03 5.72E+02
EGF 20 480 5.07E+03 5.13E+02 2.29E+04 1.49E+03 1.98E+03 3.12E+01 1.78E+03 1.81E+02 1.17E+03 2.64E+00 1.74E+03 8.49E+01 2.19E+04 3.46E+02 4.24E+01 2.34E+00 8.60E+03 1.21E+02 3.62E+03 1.22E+02
TGFa 1 1 7.48E+03 1.62E+03 2.51E+03 3.50E+02 3.58E+03 8.06E+02 9.63E+04 2.12E+04 1.14E+03 2.38E+01 1.90E+03 4.14E+02 1.32E+05 2.69E+04 5.52E+03 1.28E+03
TGFa 1 5 3.24E+03 5.43E+02 1.94E+05 4.08E+04 5.89E+03 1.48E+03 1.00E+05 7.20E+03 3.50E+03 2.67E+02 4.14E+03 5.52E+02 1.62E+05 6.42E+04 2.74E+03 1.02E+02 5.55E+03 5.55E+03 3.49E+03 4.59E+02
TGFa 1 10 1.80E+03 3.73E+02 6.94E+04 8.73E+02 3.47E+03 8.36E+02 1.17E+04 2.71E+03 1.30E+03 3.74E+01 2.96E+03 2.54E+01 6.87E+03 1.83E+02 1.04E+03 3.07E+02 1.01E+04 3.72E+02 3.01E+03 5.56E+02
TGFa 1 30 5.02E+02 8.10E+00 1.69E+05 5.99E+03 4.49E+03 7.37E+02 8.81E+03 6.56E+02 4.74E+03 2.05E+02 1.99E+03 2.52E+02 3.82E+04 2.59E+02 1.23E+03 4.48E+01 5.54E+03 5.60E+01
TGFa 1 120 2.20E+02 1.83E+01 2.60E+04 1.53E+03 2.32E+03 1.77E+02 2.67E+04 1.63E+03 4.32E+02 4.16E+01 9.13E+02 1.20E+02 2.70E+04 1.37E+03 2.06E+02 9.26E+00 3.15E+03 1.05E+02
TGFa 1 480 9.47E+02 7.53E+01 8.24E+04 7.17E+03 1.99E+03 9.92E+01 2.60E+03 5.46E+01 1.87E+03 1.21E+02 2.86E+04 1.90E+02 2.85E+02 1.86E+01 9.90E+03 8.74E+02 3.14E+03 4.88E+02
TGFa 20 1 1.52E+03 7.81E+01 4.77E+04 3.35E+03 2.93E+03 5.55E+01 1.10E+04 3.13E+02 3.12E+03 4.16E+02 4.34E+03 2.25E+02 2.90E+04 5.06E+02 2.00E+02 2.25E+01 3.09E+03 4.53E+02 5.11E+03 5.24E+02
TGFa 20 5 1.95E+04 1.58E+03 7.29E+04 2.97E+03 5.60E+03 4.03E+02 1.21E+04 3.89E+02 1.12E+03 5.72E+01 3.89E+03 7.06E+01 2.11E+04 4.54E+02 4.61E+02 3.38E+01 2.42E+03 1.34E+02 8.32E+03 2.31E+02
TGFa 20 10 1.96E+04 4.55E+02 1.70E+05 1.80E+03 5.20E+03 2.41E+02 1.00E+05 2.72E+03 1.42E+03 9.41E+00 2.38E+03 2.85E+02 6.19E+04 2.37E+03 8.54E+01 9.14E+00 1.43E+04 6.55E+02 3.57E+03 5.69E+02
TGFa 20 30 6.56E+03 3.37E+03 7.74E+04 4.93E+03 1.32E+04 6.95E+03 3.85E+04 2.51E+04 9.56E+02 1.24E+01 3.11E+03 6.54E+02 3.22E+04 1.52E+03 7.18E+02 4.72E+02 2.40E+04 1.21E+04 6.54E+03 3.67E+03
TGFa 20 120 1.87E+03 2.83E+02 1.81E+04 9.11E+02 1.26E+03 3.29E+02 3.87E+04 7.10E+03 3.88E+03 6.62E+02 4.45E+04 2.25E+03 1.47E+02 1.12E+01 2.99E+03 8.04E+02
TGFa 20 480 4.71E+03 4.31E+02 2.34E+04 1.82E+03 1.90E+03 2.35E+01 1.70E+03 1.20E+02 1.13E+03 2.03E+01 1.58E+03 6.82E+01 2.13E+04 1.12E+02 3.96E+01 8.46E-01 6.83E+03 4.38E+01 3.31E+03 6.12E+01

Table 4-2, continued. Sites (Mean and Range for each, in this order; some rows report fewer sites): IRS2 pY675m, IRS2 pY742, IRS2 pY823, Shp2 pY62, Src pY418, Shc pY240m, Shc pY239pY240m, Shc pY317.
Ligand Dose(nM) Time(min.) Mean Range ...
Unt. 0 0 8.24E+03 1.47E+03 1.06E+03 1.00E+02 1.78E+05 1.74E+04 4.46E+03 4.46E+02 1.39E+03 2.33E+02 2.10E+03 3.85E+02 2.08E+03 4.75E+02 1.65E+03 1.83E+02
AREG 1 1 1.43E+05 2.96E+03 2.61E+03 2.97E+02 3.36E+04 8.55E+03 1.05E+04 2.67E+03 4.45E+03 3.99E+02 5.02E+03 3.52E+02 1.89E+04 4.53E+03 4.23E+04 3.19E+03
AREG 1 5 5.71E+04 3.66E+03 8.39E+02 6.43E+01 3.95E+04 4.50E+03 5.92E+03 1.09E+03 2.37E+03 3.13E+01 2.21E+03 1.82E+02 3.20E+04 1.64E+03 4.18E+04 2.31E+03
AREG 1 10 3.29E+04 4.73E+03 3.51E+02 4.76E+01 1.67E+04 2.13E+03 7.07E+03 4.04E+02 2.53E+03 1.32E+02 3.97E+03 4.07E+02 4.20E+04 4.66E+02 4.03E+04 5.45E+01
AREG 1 30 6.89E+04 5.69E+03 2.43E+02 2.26E+01 2.95E+05 4.79E+03 4.83E+03 9.66E+02 1.39E+03 2.37E+01 5.95E+03 2.36E+02 1.59E+04 1.58E+03 2.11E+04 1.10E+03
AREG 1 120 3.80E+03 1.67E+02 2.01E+02 1.70E+01 1.56E+05 2.28E+04 2.75E+03 4.69E+01 1.24E+03 7.96E+01 8.10E+03 4.71E+01 1.06E+04 1.35E+03 1.21E+04 8.18E+02
AREG 1 480 1.03E+05 7.01E+03 2.70E+02 3.84E+01 8.15E+03 1.10E+03 3.61E+03 2.95E+02 1.42E+03 7.18E+00 3.58E+03 2.75E+02 1.00E+04 1.77E+02 3.92E+03 2.22E+02
AREG 20 1 1.58E+05 3.79E+03 2.49E+03 9.61E+01 3.39E+04 2.93E+03 1.04E+04 1.55E+03 3.83E+03 2.78E+02 8.41E+03 5.32E+02 4.59E+04 1.15E+04 1.98E+05 2.36E+04
AREG 20 5 5.71E+04 1.17E+03 1.05E+03 2.69E+01 4.92E+04 7.87E+03 4.88E+03 1.12E+03 2.18E+03 1.80E+02 3.55E+03 3.94E+02 3.78E+04 9.46E+02 8.34E+04 1.15E+04
AREG 20 10 6.68E+04 2.23E+02 4.68E+02 3.85E+01 2.99E+04 3.66E+03 8.23E+03 5.98E+02 3.00E+03 1.26E+02 8.11E+03 5.89E+02 1.47E+05 1.15E+04 8.46E+04 1.34E+03
AREG 20 30 5.64E+04 1.75E+03 2.03E+02 7.28E+00 2.19E+05 6.07E+04 4.52E+03 7.91E+02 1.12E+03 1.48E+02 7.20E+03 9.06E+01 1.83E+04 4.43E+03 3.99E+04 4.57E+03
AREG 20 120 5.36E+03 1.95E+02 2.89E+02 2.05E+01 2.00E+05 4.21E+03 2.63E+03 1.64E+02 1.34E+03 3.42E+01 1.51E+04 5.59E+02 1.44E+04 2.12E+03 2.05E+04 1.82E+03
AREG 20 480 9.66E+04 3.10E+03 2.38E+02 3.18E+01 7.34E+03 3.20E+02 3.68E+03 6.30E+01 1.70E+03 3.21E+01 6.15E+03 8.84E+02 2.04E+04 1.55E+03 1.45E+04 6.52E+02
EGF 1 1 2.60E+04 4.94E+03 9.06E+02 4.77E+02 3.47E+04 7.44E+02 2.60E+03 2.60E+02 2.33E+03 4.90E+02 1.04E+04 1.18E+03 3.33E+04 4.27E+03
EGF 1 5 3.05E+04 1.46E+03 7.62E+04 7.51E+03 4.15E+03 5.14E+02 1.09E+03 1.78E+02 7.80E+03 1.43E+03 2.46E+04 2.35E+03 5.15E+04 4.24E+03
EGF 1 10 6.03E+04 6.70E+03 9.19E+00 7.27E-01 2.90E+05 1.12E+04 4.09E+03 6.04E+02 1.41E+03 1.08E+02 4.48E+03 8.81E+02 4.02E+04 1.08E+04 6.67E+04 2.33E+03
EGF 1 30 4.45E+03 2.85E+02 2.96E+05 4.48E+04 4.47E+03 6.83E+02 1.66E+03 2.58E+02 7.54E+03 1.30E+03 1.32E+04 7.26E+02 8.70E+04 4.41E+03
EGF 1 120 5.31E+04 5.49E+03 4.23E+02 1.94E+01 6.20E+03 2.93E+02 2.44E+03 6.61E+01 9.66E+03 3.36E+02 2.62E+04 4.68E+01 1.10E+04 2.01E+03
EGF 1 480 1.64E+04 1.79E+02 8.08E+01 6.53E+00 1.03E+05 1.63E+03 4.09E+03 5.20E+01 1.50E+03 3.57E+01 1.32E+04 7.00E+02 2.71E+04 5.34E+03 1.53E+04 1.46E+03
EGF 20 1 6.93E+04 5.82E+03 2.80E+02 2.03E+01 2.87E+05 2.62E+04 5.89E+03 3.04E+02 1.22E+03 6.01E+01 1.49E+04 1.88E+03 2.45E+04 2.75E+03 1.75E+05 1.57E+04
EGF 20 5 3.94E+04 2.79E+03 2.97E+01 1.43E+00 6.56E+04 1.37E+03 3.51E+03 1.99E+02 1.82E+03 7.81E+01 1.09E+04 4.60E+02 4.10E+04 2.03E+03 1.92E+05 9.21E+03
EGF 20 10 7.48E+04 1.61E+04 1.66E+05 4.97E+03 5.22E+03 4.33E+01 1.93E+03 2.56E+02 5.84E+03 4.59E+02 5.46E+04 2.72E+03 1.00E+05 1.91E+03
EGF 20 30 3.37E+04 5.91E+03 4.66E+02 1.29E+02 2.51E+04 2.53E+03 5.74E+03 5.31E+02 1.30E+03 3.17E+01 1.76E+04 5.37E+03 2.08E+04 2.23E+03 7.92E+04 1.64E+03
EGF 20 120 3.00E+03 4.03E+02 6.78E+01 1.95E+01 3.24E+03 3.59E+02 2.75E+03 2.88E+02 5.88E+03 3.25E+02 2.15E+04 1.48E+03 2.17E+04 1.44E+03
EGF 20 480 3.93E+04 2.23E+03 2.66E+02 4.40E+00 2.15E+04 1.02E+03 8.69E+03 6.19E+02 1.39E+03 6.60E+01 1.12E+04 2.91E+02 2.82E+04 3.23E+03 2.88E+04 2.41E+03
TGFa 1 1 3.51E+04 1.55E+03 7.03E+02 7.03E+02 4.81E+04 3.37E+03 3.77E+03 4.20E+02 2.40E+03 3.35E+02 1.84E+04 2.18E+03 3.90E+04 4.48E+03
TGFa 1 5 5.27E+04 1.13E+03 1.92E+05 3.60E+04 4.98E+03 2.06E+02 1.64E+03 1.91E+02 9.98E+03 1.16E+03 3.02E+04 3.50E+03 5.36E+04 8.69E+03
TGFa 1 10 6.16E+04 7.31E+03 9.25E+00 1.73E+00 3.09E+05 3.91E+04 4.07E+03 2.40E+02 1.57E+03 9.98E+01 4.96E+03 7.76E+02 3.89E+04 1.17E+04 4.90E+04 1.94E+03
TGFa 1 30 4.84E+03 3.34E+02 2.12E+05 2.91E+04 4.50E+03 6.64E+02 1.71E+03 2.89E+02 9.76E+03 4.19E+02 2.07E+04 2.25E+02 5.50E+04 9.68E+02
TGFa 1 120 4.51E+04 2.61E+03 3.41E+02 1.57E+01 6.13E+03 3.11E+02 2.25E+03 1.15E+02 8.26E+03 9.19E+02 1.94E+04 2.20E+02 1.24E+04 1.99E+03
TGFa 1 480 1.78E+04 3.57E+02 8.51E+01 2.88E+00 7.62E+04 1.50E+02 4.65E+03 7.37E+00 1.29E+03 3.55E+01 6.82E+03 3.77E+02 1.43E+04 2.56E+02 1.62E+04 1.25E+03
TGFa 20 1 7.59E+04 6.75E+03 2.41E+02 2.04E+01 2.96E+05 3.79E+04 6.39E+03 1.51E+02 1.26E+03 1.17E+02 1.42E+04 9.74E+02 2.57E+04 3.30E+03 1.66E+05 1.76E+04
TGFa 20 5 4.03E+04 1.13E+03 3.12E+01 2.93E+00 6.95E+04 3.41E+03 3.94E+03 9.59E+01 1.77E+03 3.57E+01 9.99E+03 1.28E+02 4.27E+04 2.20E+02 1.89E+05 2.94E+03
TGFa 20 10 8.00E+04 1.43E+04 1.67E+05 1.12E+04 5.22E+03 2.40E+02 2.36E+03 6.68E+01 9.19E+03 2.87E+02 4.81E+04 4.98E+02 8.15E+04 3.20E+03
TGFa 20 30 4.67E+04 6.75E+03 6.00E+02 2.61E+02 5.78E+04 2.76E+04 9.18E+03 1.71E+03 1.50E+03 1.18E+02 4.01E+04 1.26E+04 2.52E+04 2.59E+03 9.14E+04 2.09E+03
TGFa 20 120 3.32E+03 4.14E+02 7.45E+01 9.67E+00 4.14E+03 4.25E+02 2.40E+03 4.43E+01 1.32E+04 3.26E+02 4.08E+04 4.30E+03 2.43E+04 5.86E+02
TGFa 20 480 3.70E+04 1.65E+03 2.56E+02 1.74E+01 2.27E+04 9.60E+02 8.69E+03 5.01E+02 1.31E+03 6.74E+01 1.13E+04 2.08E+02 3.47E+04 2.78E+03 3.38E+04 1.83E+03

4.7 Bibliography

1. Bhargava, R. et al. EGFR gene amplification in breast cancer: correlation with epidermal growth factor receptor mRNA and protein expression and HER-2 status and absence of EGFR-activating mutations. Mod Pathol 18, 1027–1033 (2005).
2. Rimawi, M. F. et al. Epidermal growth factor receptor expression in breast cancer association with biologic phenotype and clinical outcomes. Cancer 116, 1234–1242 (2010).
3. Citri, A. & Yarden, Y. EGF–ERBB signalling: towards the systems level. Nat Rev Mol Cell Biol 7, 505–516 (2006).
4. Roepstorff, K. et al. Differential Effects of EGFR Ligands on Endocytic Sorting of the Receptor. Traffic 10, 1115–1127 (2009).
5. Wilson, K. J., Gilmore, J. L., Foley, J., Lemmon, M. A. & Riese, D. J. Functional selectivity of EGF family peptide growth factors: implications for cancer. Pharmacol. Ther. 122, 1–8 (2009).
6. Lemmon, M. A. & Schlessinger, J. Cell signaling by receptor tyrosine kinases. Cell 141, 1117–1134 (2010).
7. Lemmon, M. A., Schlessinger, J. & Ferguson, K. M. The EGFR Family: Not So Prototypical Receptor Tyrosine Kinases. Cold Spring Harbor Perspectives in Biology 6, a020768–a020768 (2014).
8. Pawson, T. & Nash, P. Protein-protein interactions define specificity in signal transduction. Genes Dev. 14, 1027–1047 (2000).
9. Wilson, K. J. et al. EGFR ligands exhibit functional differences in models of paracrine and autocrine signaling. Growth Factors 30, 107–116 (2012).
10. Hibi, M. & Hirano, T. Gab-family adapter molecules in signal transduction of cytokine and growth factor receptors, and T and B cell antigen receptors. Leuk. Lymphoma 37, 299–307 (2000).
11. Adams, S. J., Aydin, I. T.
& Celebi, J. T. GAB2--a scaffolding protein in cancer. Mol. Cancer Res. 10, 1265–1270 (2012).
12. Sakaguchi, K. et al. Shc phosphotyrosine-binding domain dominantly interacts with epidermal growth factor receptors and mediates Ras activation in intact cells. Mol. Endocrinol. 12, 536–543 (1998).
13. Myers, M. G. et al. IRS-1 activates phosphatidylinositol 3'-kinase by associating with src homology 2 domains of p85. Proc. Natl. Acad. Sci. U.S.A. 89, 10350–10354 (1992).
14. Scheck, R. A., Lowder, M. A., Appelbaum, J. S. & Schepartz, A. Bipartite Tetracysteine Display Reveals Allosteric Control of Ligand-Specific EGFR Activation. ACS Chem. Biol. 7, 1367–1376 (2012).
15. Alvarado, D., Klein, D. E. & Lemmon, M. A. Structural Basis for Negative Cooperativity in Growth Factor Binding to an EGF Receptor. Cell 142, 568–579 (2010).
16. Brugge Lab Protocols. at <http://brugge.med.harvard.edu/Protocols/Protocols.html>
17. Roskoski, R. MEK1/2 dual-specificity protein kinases: structure and regulation. Biochem. Biophys. Res. Commun. 417, 5–10 (2012).
18. Roskoski, R. ERK1/2 MAP kinases: structure, function, and regulation. Pharmacol. Res. 66, 105–143 (2012).
19. Jones, R. B., Gordus, A., Krall, J. A. & MacBeath, G. A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 439, 168–174 (2005).

5 Conclusions and Future Directions

The work in this thesis has focused on improving mass spectrometry-based proteomic data sets in two key ways. The first is to enhance the quality of data generated with exploratory MS/MS analysis, a crucial first step in any successful proteomics endeavor. These experiments seek the full range of proteins or sites of post-translational modification in the samples of interest. With recent advancements in throughput and analysis it has become infeasible for users to fully validate each scan. Instead, databases are used to pre-match spectra to known peptides in the proteome and discard the rest.
Such database searches have become commonplace, but their role as a pre-check is being forgotten; instead, they are being used as the sole mode of identification. A trained mass spectrometrist rarely double-checks the vast majority of identifications put forth by these algorithms. In place of human judgment, statistical methods based on analytically calculated false discovery rates (FDRs) provide the estimated accuracy of the data set as a whole. Applications exist where this may be appropriate, but time-intensive biological follow-up experiments need the added certainty that a particular match is indeed correct.

It is with this in mind that the Computer-Assisted Manual Validation (CAMV) package was developed. The goal was to remove the time-consuming tasks of filtering or matching peaks in a mass spectrum and engage the trained user only at the final decision-making stage; the tasks are divided so that each entity performs those for which it is best suited. The computer rapidly harvests the relevant information and presents it in a comprehensive way, and the user takes all of this information into account to make the best decision about whether the pre-matched assignment is correct. The division of labor has produced drastic decreases in the time required to validate a proteomic mass spectrometry experiment.

The second method for improving data quality focused on the development of a high-throughput, reliable absolute quantification pipeline for proteomics by mass spectrometry. In general, throughput improvements for these experiments come from differential labeling techniques that permit multiple samples to be analyzed simultaneously and the data to be deconvolved later. Such methods involve heavy-isotope labeling of either covalently attached tags or the peptides themselves. Absolute quantification relies on the calibration of observed signal intensity to the amount of peptide present in the original sample.
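A minimal sketch of such a calibration follows, assuming a linear relationship between signal intensity and peptide amount fitted by ordinary least squares. The fitting method, the amounts and intensities, and the function names are assumptions made for illustration; they are not taken from the pipeline described in this thesis.

```python
def fit_line(amounts, intensities):
    """Ordinary least-squares fit of intensity = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(intensity, slope, intercept):
    """Invert the calibration line to convert an intensity into an amount."""
    return (intensity - intercept) / slope


# Hypothetical standard peptide amounts (fmol) and their observed intensities.
standard_fmol = [1.0, 10.0, 100.0]
standard_signal = [2.0e4, 2.0e5, 2.0e6]

slope, intercept = fit_line(standard_fmol, standard_signal)

# Hypothetical intensity of the endogenous peptide in the same analysis.
endogenous_fmol = quantify(5.0e5, slope, intercept)
print(endogenous_fmol)
```

Using several standard amounts rather than one lets the fit expose departures from linearity across the measured range.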
Using standard peptides that are added directly to the sample provides the safest way to account for sequence-specific purification biases. The method introduced includes such standard peptides while simultaneously employing a multiplex labeling technique. The added benefit of this strategy is that multiple doses of standard peptide may be included in a single analysis, providing a more comprehensive calibration curve. This method was tested on two instrument platforms. The presumed "gold standard" triple quadrupole instrument utilized for multiple reaction monitoring provided a very thorough quality control against which the orbitrap-based full scan method could be validated. Both platforms performed exceedingly well in the comparison studies, providing assurance that the full scan method could be utilized for further analysis. The full scan method was preferable for two reasons. First, it allows more comprehensive after-the-fact confirmation of sequence identity. Second, due to the instrument's popularity, the technique can be readily reproduced by a large number of research groups.

As a test case, ligand-induced EGFR tyrosine phosphorylation was used to show the power of absolute quantification in deciphering modalities of information transfer in signaling networks. Different ligands are known to produce varied phenotypic responses despite binding to the same receptor. Hence, ligand identity and/or concentration must be transmitted through the plasma membrane by the receptor. The interface between the receptor and intracellular proteins is the set of phosphorylation sites on the C-terminal tail. Once phosphorylated, these provide docking sites for intracellular proteins, recruiting them into signaling complexes to further relay information. The encoding of the information may occur in one of two ways: either the pattern or the magnitude of phosphorylation may be altered depending on the identity of the ligand.
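The distinction between the two encodings can be made concrete with a small calculation. The sketch below is one possible formalization, not a method from this thesis: it treats the site-level phosphorylation of a receptor as a vector, uses cosine similarity to compare patterns, and uses the ratio of vector norms to compare amplitudes. All values and names are hypothetical.

```python
import math


def amplitude(profile):
    """Overall magnitude of a phosphorylation profile (Euclidean norm)."""
    return math.sqrt(sum(value * value for value in profile))


def pattern_similarity(a, b):
    """Cosine similarity; 1.0 means the same relative pattern across sites."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (amplitude(a) * amplitude(b))


# Hypothetical phosphorylation levels (molecules per cell) at three sites.
ligand_a = [40000.0, 20000.0, 10000.0]
ligand_b = [20000.0, 10000.0, 5000.0]    # same pattern, half the amplitude
ligand_c = [10000.0, 20000.0, 40000.0]   # same amplitude, different pattern

print(pattern_similarity(ligand_a, ligand_b), amplitude(ligand_b) / amplitude(ligand_a))
print(pattern_similarity(ligand_a, ligand_c), amplitude(ligand_c) / amplitude(ligand_a))
```

In this framing, amplitude modulation corresponds to a similarity near one with an amplitude ratio away from one, while pattern modulation corresponds to a similarity clearly below one.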
To better understand which of these mechanisms was at play, phosphorylation sites on EGFR, on known and putative interacting proteins, and on downstream effectors were quantified using this technique. Initially, a saturating dose of ligand was presented to cells and the site-specific phosphorylation was measured. Surprisingly, the data were consistent with an amplitude modulation modality. All three ligands produced indistinguishable patterns of EGFR phosphorylation, but the response to AREG was noticeably weaker than the response to TGFα or EGF. Interestingly, the pattern of phosphorylation on adaptor proteins differed depending on the ligand. This suggested that amplitude differences produced sufficiently different binding site availability at the receptor to affect adaptor protein recruitment and phosphorylation. A follow-up experiment with sub-saturating ligand doses told a different story. These low doses permitted the receptor to establish different patterns of phosphorylation, which in turn led to differential phosphorylation patterns on adaptor proteins.

The applicability of these tools is not limited to the cases described above. Rapid manual validation of proteomic mass spectrometry data will provide more reliable conclusions from initial exploratory experiments. Any application of proteomics will benefit from more effective initial screening of target proteins or potential pharmaceutical compounds. Moving forward to targeted experiments or biological follow-up is a time-consuming and expensive process, so decreasing the false positive rate in proteomic screens will aid academic and industry researchers alike.

Absolute quantification paves the way for proteomics to address a whole new class of biological questions. First, simulation-based computational modeling efforts are much better informed by abundance measurements than they are by relative changes. Natural representations of signaling networks use concentrations as variables and rate constants as parameters.
Second, stoichiometric understanding of cellular processes is not possible without absolute quantification. This plays a very important role when considering the fraction of molecules that are affected by a treatment condition or potential therapeutic compound. Third, more experimental conditions can be accurately compared. Absolute quantification provides a straightforward way to account for more conditions than a single multiplexed analysis can accommodate. Because each measurement is expressed on a common absolute scale, data from multiple experiments can be combined and compared directly, with far less concern about batch effects.

The tools developed in this thesis have set the stage for a number of applications in the very near future. The first is analysis of signaling networks under the influence of overexpressed oncogenes. For example, HER2 is a member of the EGFR family of receptor tyrosine kinases and is routinely overexpressed in many types of cancer. The increased kinase activity contributed by the overabundance of this protein presumably amplifies the signaling events throughout the network and biases the cells towards more extreme phenotypic responses. These phenotypic responses have been observed on the large scale as increased tumor aggressiveness and invasiveness, contributing to decreased patient survival. With absolute quantification, a richer picture of the signaling abnormalities will be possible, which may inform more effective pharmaceutical intervention.

The second is analysis of the effects of pharmaceutical interventions on cellular signaling. An ideal pharmaceutical compound would selectively destroy cells with dominant aberrant signaling and leave normal cells untouched. This ideal is almost never reached, and instead researchers must settle for understanding how the compound impinges on both normal and tumor cell populations. Studying the signaling in normal cells paints a picture of how the rest of the patient’s tissues will respond, while cells with altered signaling networks serve as a model for tumor tissue.
A deeper level of analysis is also afforded by the studies of altered cells. Oftentimes putative pharmaceutical compounds do not attain the predicted efficacy once they are moved into animal models. In-depth studies of the signaling responses of these altered cells give researchers clues as to how the cells are compensating. The worst cases are those where the disease develops resistance after an initial stage of promising remission. Even though the vast majority of the original population of cells may be sensitive to the drug, a small resistant minority may grow out and take over, leaving predominantly resistant tumor cells.

The tools developed here will also aid in the analysis of more directed signaling measurements following stimulation. Previously, the resolution of time points was limited by technical variation inherent in the sample preparation protocol. Recent improvements to the protocol involve flash-freezing samples to ensure more exact endpoints. These techniques are ushering in experiments aimed at very short time points following stimulation, permitting very early dynamics of the system to be distinguished. These dynamics will likely involve phosphorylation sites on the receptor itself and only the most closely associated adaptor proteins; secondary or tertiary signaling molecules are not yet part of the picture at these time points. Furthermore, feedback loops responsible for downstream regulation will presumably take longer to initiate, making it possible to distinguish their contributions. Such fine-scale measurements of the network have not been made previously and hold significant promise of a more refined network model of early events in the receptor-initiated signaling cascade.

Studying cellular signaling networks and their constituent proteins holds great potential for understanding disease mechanisms. It is the hope of the author that the work in this thesis will facilitate discoveries far beyond those discussed or even envisioned.