Protein Mass Spectrometry Applications
Katheryn Resing & Natalie Ahn
• Proteomics profiling
• Deuterium exchange MS

Lewis, T.S., et al. (2001) Identification of novel MAP kinase pathway signaling targets by functional proteomics and mass spectrometry. Molecular Cell. 6:1-20.

Identifying Markers in Cancer Progression: Melanoma

Analysis of melanocyte and melanoma lines by 2D-PAGE/MS
Columns: Cell line | AJCC stage, pathology / Clark level / thickness | Status | Soft agar
(DOD = dead of disease; NED = no evidence of disease)

Melanocyte
  71 | Normal
  78 | Normal

Early primary melanoma (RGP)
  Cell lines: WM3208V, SBCL2, WM35, WM1789, WM1552C
  Staging as listed: Stage X, RGP-VGP; Stage I, RGP-VGP / Level II / 0.69; Stage I, RGP / Level III / 0.82; Stage 3, RGP-VGP / Level IV / 5.9
  Status as listed: DOD, NED, NED, DOD
  Soft agar as listed: 1.3%, 1.0%, 0.1%, 1.0%

Primary melanoma (VGP)
  WM75 | Stage 3, RGP-VGP / Level IV / 6.3 | DOD | 10.8%
  WM278 | Stage 2, VGP / Level IV / 3.7 | DOD | 4.2%
  WM115 | Stage N/A, VGP / Level III / 2.2 | DOD | 6.0%
  WM793B | Stage 1, VGP / Level II / 0.55 | NED | 3.7%

Metastatic
  Cell lines: Lu451, WM1232, WM852, WM1617, WM239A, 1205Lu, HS294T, A375
  Soft agar as listed: 2%, 17%, 23%, 90%, 25%

Cell lines from Meenhard Herlyn (Wistar Inst.), ATCC

Hepatoma-derived growth factor is elevated in primary melanoma
[Figure: HDGF 2D-gel spot in melanocyte vs WM115 (VGP); HDGF levels across melanocyte, RGP, VGP, and metastatic cell lines]

Bernard, K., Litman, E., Fitzpatrick, J.L., Shellman, Y.G., Argast, G., Polvinen, K., Everett, A.D., Fukasawa, K., Norris, D.A., Ahn, N.G., Resing, K.A. (2003) Functional proteomic analysis of melanoma progression. Cancer Research. 63: 6716-6725.
Anti-HDGF immunoreactivity in melanoma biopsies
[Figure panels: Advanced malignant melanoma (Clark’s IV), anti-HDGF (Vector Red); anti-S100 (DAB) with anti-HDGF (VR); Early stage lentigo melanoma (Clark’s II); Benign nevus (non-transformed)]

PMA induced megakaryocyte differentiation
Inhibitory interactions between Raf1 and Rac1 pathways
[Figure: Red = increased, Blue = decreased]

Proteins responsive to active Rho GTPases
[Figure: proteins responsive to Rac1-V12, Cdc42-V12, and RhoA-V14. Red: up-regulated proteins; Blue: down-regulated proteins]
Proteins listed, in slide order: * Stathmin, Fatty acid binding protein 5, tRNA-Trp synthetase, * Cofilin, Destrin, Calponin, Annexin 5, ATP synthase, NAD isocitrate deH2, Adenylsuccinate lyase, BC015471, BC001493, Unk 3, Unk 41, BTF homologue, Calcium regulated heat stable protein, Unk 38, Bip, Cytoglobin, Tropomyosin, Tubulin cofactor, Phosphatidylinositol-transfer protein, Protein tyrosine phosphatase-1B, AL137515, BC001763

2D-Western blots reveal regulated covalent modifications
• Proteins: Trp-tRNA synthetase, PTP-1B, Phosphatidylinositol transfer protein β, FABP5
• Conditions: CMV, Rac, Cdc42, RhoA

A function for regulation of PTP1B by RhoA
• No reports showing regulation of PTP-1B by RhoA
• However, these two molecules share the same target: p130Cas
• PTP1B binds and dephosphorylates p130Cas (Liu et al. (1996) JBC)
• RhoA promotes p130Cas phosphorylation (Tsuji et al. (2002) JCB)
• Model: Rho inhibits p130Cas dephosphorylation by PTP1B
[Figure: anti-pY blot, p130 blot, and PTP1B blot, with and without RhoA and PTP-1B]

RhoA inactivates PTP-1B. PTP1B modification correlates with p130Cas phosphorylation
[Figure: immunoprecipitated PTP1B; Cas; PTP1B specific activity; p130; PTP1B 2D-Westerns (phosphorylation)]

PTP1B/p130 diverge from Rho-induced stress fiber formation
[Figure: F-actin staining and PTP1B 2D-Western. Pathway diagram: RhoA, PTP1B (P?), Rho K, p130Cas, FAK, Paxillin, MLC; focal adhesion complex, migration, stress fiber]

Proteomics studies with 2D gels provide information about potential markers and pathway behavior. On the other hand, . . .
Results: MSPlus and Isoform Resolver
Columns: Method | Number peptides | Unique peptides | Proteins

Sample 1: 16 SCX LC/MS/MS (2,117 files)
  MSPlus | 856 (40%) | 434 | 243
  Sequest only | 680 (32%) | 351 | 209
  Mascot only | 702 (33%) | 377 | 219

Sample 2: 11 SCX LC/MS/MS x 10 gpf (47,598 files)
  MSPlus | 8,190 (17%) | 4,541 | 1,757
  Sequest only | 5,804 (12%) | 3,385 | 1,433
  Mascot only | 5,173 (11%) | 3,057 | 1,320

Sample 3: 7 gel filtration fractions x 14 SCX LC/MS/MS x 6 gpf (602,520 files)
  MSPlus | 85,267 (18%) | 20,675 | 5,130
  Sequest only | 64,194 (13%) | 15,217 | 4,120
  Mascot only | 63,431 (13%) | 16,006 | 3,971

Resing et al. Analytical Chemistry (in press)

Overall Analytical Plan
• Cells: K562 +/- PMA +/- U0126 (PMA induced megakaryocyte differentiation)
• Wash in PBS & 50 mM NaF, snap-freeze in N2
• Extract in 0.42 M NaCl, 25 mM β-glycerophosphate, DTT, EGTA, EDTA, pH 7.5; sonicate, ultracentrifuge
• Extract fractions: soluble, soluble in high salt, or membrane bound
• Fractionate proteins: gel filtration or ion exchange
• Proteolysis: reduce & alkylate with iodoacetamide, digest with trypsin
• SCX, LC/MS
• Data analysis
• 10^8 K562 cells produce 15-20 mg protein (Bradford)

• 13 gel filtration fractions (1,3,5,7,9,10,12,13)
• 14 SCX fractions
• 6 gas phase fractions within m/z 300-1718: 300-678, 670-918, 910-1038, 1030-1158, 1150-1398, 1390-1718
• 1,092 LC/MS runs
• 623,568 total MS sequencing files
• 46,616 peptides
• 14,775 unique peptides
• Continue panning on these: how many more can we identify?

Distribution of XCORR scores for correctly vs incorrectly identified peptides
[Figure: number of peptides vs XCorr score (0 to 10) for Total, Incorrect, and Correct assignments; annotations at XCorr 1.05 and XCorr 3.4]

How many MS/MS files are assignable?
Peptides from standard proteins:
• 46% of validated assignments were above threshold
• 54% of validated assignments were below threshold
• After Sequest, 523 of 2,117 DTAs (23%) were above XCorr threshold [XCorr > 3.0 (+1), 3.23 (+2), 3.34 (+3)]
• In Sample 1, 1,137 of 2,117 MS/MS files (54%) are expected to be assignable (523 / 0.46 ≈ 1,137)

Repeated analysis improves sampling of peptides with scores above threshold
• Peptide assignments increase by >2x after 5 repeats
• Unique peptide sequences identified in three repeats of the same sample indicate there are two classes of MS/MS: those that are almost always identified, and those that are identified in a stochastic manner (coin toss model).
[Figure: Venn diagram of unique peptides by outcome across three repeats, labelled HHH, HHT, HTH, THH, HTT, THT, TTH, TTT; counts shown: 87, 33, 30, 27, 26, 24, 19, and ~25 for TTT]

Variable scoring between Sequest and Mascot
• Scores indicated when the same sequence was assigned by both programs
• ~7.5% of DTA files that failed Sequest (XCorr) were validated by high Mascot scores (Mowse)
[Figure: XCorr vs Mowse score, separate panels for +1, +2, and +3 ions]

Peptide elution from SCX is mostly dependent on # of basic residues

Misassignments caused by “distraction”
Columns: Database | First Sequest hit | XCorr | RSP | First Mascot hit | Mowse

  Restricted (644) | AIGTEPDSDVLSEIMHSFAK | 1.98 | 1 | AIGTEPDSDVLSEIMHSFAK | 39.5
  IPI (~48,000) | AIGTEPDSDVLSEIMHSFAK | 1.98 | 1 | AIGTEPDSDVLSEIMHSFAK | 39.5
  IPI, no protease | AIGTEPDSDVLSEIMHSFAK | 1.98 | 422 | TTIGAAGLPGRDGLPGPPGPPGPP | 40.0

  Restricted (644) | EGLELPEDEEEK | 2.00 | 1 | EGLELPEDEEEK | 50.4
  IPI (~48,000) | EGLELPEDEEEK | 2.00 | 1 | EGLELPEDEEEK | 50.4
  IPI, no protease | EGIELLLNEGSEL | 2.23 | 2 | EGLELPEDEEEK | 50.4

  Restricted (644) | GDAMIMEETGK | 0.74 | 1 | GDAMIMEETGK | 41.4
  IPI (~48,000) | YPILFLTQGK | 1.11 | 1 | GDAMIMEETGK | 41.4
  IPI, no protease | AVYVEMLQIL | 1.34 | 12 | GIMAIEMVEGE | 43.9

  Restricted (644) | DLSLEEIQK | 1.15 | 1 | DLSLEEIQK | 25.2
  IPI (~48,000) | DLSLEEIQK | 1.64 | 11 | IDCEAPLKK | 27.7
  IPI, no protease | NSQVKELKQ | 1.53 | 243 | ALASQSAGITGV | 31.5

Correct sequence assignments are replaced by incorrect assignments as database size increases

Accuracy of the MSPlus approach:
• Expt 1: Manual analysis of 540 peptide assignments from Sample 1 (half accepted and half rejected, by panning method)
  3.4% false positive assignments
  8.0% false negative assignments
• Expt 2: Searching the randomized database:
  Sample 1: Soluble extract, 16 SCX fractions, LC/MS/MS: 4.0% false positives
  Sample 2: Soluble extract, 11 SCX fractions, 10 gpf on LC/MS/MS: 3.2% false positives
• Expt 3: Reproducibility in protein assignments between different analyses

A protein database entry in FASTA format
>IPI:IPI00027488.1 …. Alpha enolase (2-phospho-D-glycerate hydro-lyase)
MSILKIHAREIFDSRGNPTVEVDLFTSKGLFRAAVPSGASTGIYEALELR
DNDKTRYMGKGVSKAVEHINKTIAPALVSKKLNVTEQEKIDKLMIEMDG
TENKSKFGANAILGVSLAVCKAGAVEKGVPLYRHIADLAGNSEVILPVP
AFNVINGGSHAGNKLAMQEFMILPVGAANFREAMRIGAEVYHNLKNVI
KEKYGKDATNVGDEGGFAPNILENKEGLELLKTAIGKAGYTDKVVIGM
DVAASEFFRSGKYDLDFKSPDDPSRYISPDQLADLYKSFIKDYPVVSIE
DPFDQDDWGAWQKFTASAGIQVVGDDLTVTNPKRIAKAVNEKSCNCL
LLKVNQIGSVTESLQACKLAQANGWGVMVSHRSGETEDTFIADLVVGL
CTGQIKTGAPCRSERLAKYNQLLRIEEELGSKAKFAGRNFRNPLAK

140/(953+140) = 12.8% of hits with the same peptide sequence have different protein database ID reference numbers.

Some examples showing how Turbosequest and Mascot handle redundancy differently.
Columns: Peptide | Sequest | Mascot
  ALAAAGYDVEK | IPI00020958 | IPI00020958
  ALAAAGYDVEK | IPI00020958 | IPI00022360
  ALAAAGYDVEK | IPI00020958 | IPI00028535
  TIGGGDDSFNTFFSETGAGK | IPI00022360 | IPI00028535
  TIGGGDDSFNTFFSETGAGK | IPI00022360 | IPI00028535
  TIGGGDDSFNTFFSETGAGK | IPI00022360 | IPI00028535

Why does the software pick the L form over the I form? Both engines do this.
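One fact relevant to the L-versus-I question is that leucine and isoleucine residues have the same elemental composition and therefore the same mass, so a peptide mass alone cannot separate the two forms. The sketch below illustrates this with the TLTLVDTGIGMTK / TLTIVDTGIGMTK pair from this slide, using standard monoisotopic residue masses. The function names `peptide_mass` and `il_key` are hypothetical helpers, not part of Sequest, Mascot, or Isoform Resolver, and how either engine breaks the resulting tie is not described in these slides.

```python
# Standard monoisotopic residue masses in Da (Leu and Ile are identical).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added for the free N- and C-termini


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def il_key(seq):
    """Collapse I into L so that isobaric I/L variants share one key."""
    return seq.replace("I", "L")


l_form = "TLTLVDTGIGMTK"
i_form = "TLTIVDTGIGMTK"

# Both forms have the same mass, so mass alone cannot separate them.
print(peptide_mass(l_form), peptide_mass(i_form))
print(il_key(l_form) == il_key(i_form))
```

Grouping peptides under a key such as `il_key` is one possible way to treat the two forms as a single peptide when counting identifications; it is shown here only as an illustration.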
Columns: Peptide | Sequest | Mascot
  TLTLVDTGIGMTK | IPI00024739 | IPI00024739
  TLTLVDTGIGMTK | IPI00024739 | IPI00027749
  TLTIVDTGIGMTK | IPI00047217 | IPI00047217
  TLTIVDTGIGMTK | IPI00047217 | IPI00013921

Isoform Resolver
“Peptide-centric” database:
• catalogues each unique peptide in the IPI database
• reports proteins redundantly associated with each peptide

Protein isoforms:
  (1) IPI00023860 (45,374 Da) NUCLEOSOME ASSEMBLY PROTEIN 1-LIKE 1
  (2) IPI00017763 (42,823 Da) NUCLEOSOME ASSEMBLY PROTEIN 1-LIKE 4
  (3) IPI00180912 (44,630 Da) SIMILAR TO NAP1
  (4) IPI00184769 (39,223 Da) SIMILAR TO NAP1
  (5) IPI00185366 (29,538 Da) SIMILAR TO NAP1

Columns: Protein isoform | Peptide sequence | Highest XCorr | Highest Mowse | Number observed
  (1) | EQSELDQDLDDVEEVEEEETGEETK | 5.0 | 88 | 4
  (1) | KYAVLYQPLFDK | 4.0 | 74 | 8
  (1) | LDGLVETPTGYIESLPR | 5.7 | 113 | 5
  (1) | YAVLYQPLFDK | 4.1 | 58 | 3
  (1) | YAVLYQPLFDKR | 2.7 | 51 | 3
  (1), (5) | GIPEFWLTVFK | 2.9 | 61 | 4
  (1), (5) | NVDLLSDMVQEHDEPILK | 6.4 | 109 | 3
  (1), (2), (3), (4) | FYEEVHDLER | 3.8 | 49 | 18
  (2), (3) | KYAALYQPLFDK | 4.7 | 73 | 6
  (2), (3) | LDNVPHTPSSYIETLPK | 4.9 | 66 | 4
  (2), (3) | QVPNESFFNFFNPLK | 3.4 | 77 | 5
  (2), (3) | YAALYQPLFDK | 2.6 | 64 | 3
  (2), (3), (4) | GIPEFWFTIFR | 3.0 | 60 | 3
  (4) | AAATAEEPNPK | 2.9 | 56 | 1

Removed 24% of protein IDs from the Sequest assigned list

Expt 3: Reproducibility in protein assignments between different analyses
Reproducibility studies support accuracy of panning approach
[Figure: Venn diagrams of protein overlap among Sample 1; repeat of Sample 1; 11 SCX fractions of Sample 1 analyzed with gas phase fractionation; and proteins prefractionated by gel filtration, then 16 SCX and gas phase fractionation. Counts shown: 73, 170, 16, 241, 89, 1514, 9, 234, 4787]

Functional categories of represented proteins
• Hypothetical proteins (17%)
• Misc. (15%)
• Metabolism (11%)
• DNA replication, repair (5%)
• Transcription (6%)
• Translation and RNA metabolism (14%)
• Cytoskeletal (6%)
• Intracellular signaling (9%)
• Transport, trafficking (6%)
• Cell-cell communication (3%)
• Protein folding, degradation (7%)

Examples of identified proteins:
• Protein kinases: Cdk 2,5,6,7,9, MAPK, MAPKK, MAPKKK, PKA, PKC, Abl, A-Raf, Plk, CSK (~200,000 to 1,000,000 copies/cell)
• Cyclins: A2, B1, E1, K, T1 (~20,000 copies/cell)
• Txn factors: Sp1, ERF, ATF1,5, CREB1, CCAAT, EKLF (~10,000 copies/cell)

Proteome coverage:
• ~20,000 proteins in a human proteome
• ~5,000 proteins by shotgun
• ~1,800 proteins on 2D gels

Abundance scale (protein classes marked on the scale, from most to least abundant: cytoskeleton, metabolism, ribosomes, kinases, cyclins, transcription factors):
  10^9 copies/cell = mM
  10^8 copies/cell = 0.1 mM
  10^7 copies/cell = 10 µM
  10^6 copies/cell = µM
  10^5 copies/cell = 0.1 µM
  10^4 copies/cell = 10 nM
  10^3 copies/cell = nM
  10^2 or less copies/cell = 0.1 nM

Functional changes based on coverage
MEK/ERK dependent megakaryocyte differentiation in K562 cells
[Figure: peptide representation (Control, PMA, PMA/U0126) compared with Affymetrix data (n=2; Control, PMA, PMA/U0126)]

Direct quantification of parent ion intensities
1. From parent ion masses of identified MS/MS spectra, scan MS raw files for intensities
2. Fit peak to Gaussian, identify resolved peaks
3. Map all MS/MS onto peaks, eliminate peaks that are complex
4. If desired, group charge forms, pull together information from different SCX fractions or from gel filtration fractions
[Figure: parent ion elution peak, annotated "MS/MS taken here"]

Correlation in intensities between peptides common to two runs with 5% vs 20% loading of the same sample. The left shows correlations on a linear scale; the right shows the same data on a log2 scale.
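The peak-fitting step of the quantification procedure above (step 2) can be sketched as follows. This is only an illustration and not the authors' implementation: the slides do not say which fitting routine was used, so the sketch substitutes a simple least-squares fit of a parabola to the log intensities, which recovers the three Gaussian parameters in closed form. The function name `fit_gaussian` and the example retention times and intensities are made up.

```python
import math


def fit_gaussian(times, intensities):
    """Fit y = A*exp(-(t-mu)^2 / (2*sigma^2)) by least squares on ln(y).

    Returns (A, mu, sigma), or None if the points are not peak-shaped.
    """
    pts = [(t, math.log(y)) for t, y in zip(times, intensities) if y > 0]
    if len(pts) < 3:
        return None
    t0 = sum(t for t, _ in pts) / len(pts)   # centre times for stability
    xs = [t - t0 for t, _ in pts]
    zs = [z for _, z in pts]
    # Normal equations for z = a + b*x + c*x^2
    s = [sum(x ** k for x in xs) for k in range(5)]               # sum x^k
    r = [sum(z * x ** k for x, z in zip(xs, zs)) for k in range(3)]
    m = [[s[0], s[1], s[2], r[0]],
         [s[1], s[2], s[3], r[1]],
         [s[2], s[3], s[4], r[2]]]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(3):
        p = max(range(i, 3), key=lambda k: abs(m[k][i]))
        if abs(m[p][i]) < 1e-12:
            return None
        m[i], m[p] = m[p], m[i]
        for k in range(3):
            if k != i:
                f = m[k][i] / m[i][i]
                m[k] = [mk - f * mi for mk, mi in zip(m[k], m[i])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    if c >= 0:
        return None                          # parabola opens upward: no peak
    sigma = math.sqrt(-1.0 / (2.0 * c))
    mu = -b / (2.0 * c)
    amp = math.exp(a - b * b / (4.0 * c))
    return amp, mu + t0, sigma


# Made-up ion chromatogram: a clean Gaussian peak centred at 30.0 min.
times = [29.4 + 0.1 * i for i in range(13)]
intens = [1000.0 * math.exp(-(t - 30.0) ** 2 / (2 * 0.2 ** 2)) for t in times]
amp, mu, sigma = fit_gaussian(times, intens)
print(round(amp, 1), round(mu, 3), round(sigma, 3))
```

The log-parabola form was chosen because it needs no optimizer and no external libraries. It weights low-intensity points heavily, so noisy real data would probably call for a weighted or nonlinear fit instead.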
Tubulin α1: Comparison of gel filtration fractions
[Figure: panels for gel filtration fractions 8, 10, 12, 14, and 16]

Comparison of recoveries of peptides from different SCX fractions when chromatographed on different resins (J4 vs XP)
[Figure: Venn diagrams of peptides recovered on J4 vs XP resins for SCX fractions 008, 009, 010, 012, 013, 014, 016, 017, and 018]

Hydrogen exchange mass spectrometry: Regulated conformational mobilities in protein kinases

Conformational changes in ERK2 upon phosphorylation (Canagarajah et al, 1997)
[Figure panels: Unphosphorylated ERK2; Phosphorylated ERK2; Overlay; activation lip labelled]

Rate of phosphoryl transfer increases 60,000x in ERK2 (Prowse et al. 2001, JBC)
• Red: increased HX
• Green: decreased HX
• Yellow: both increase/decrease

Inconsistencies between hydrogen exchange and structure suggest altered backbone flexibility in the hinge region

Total hydrogen exchange shows conserved patterns between p38 MAPK and ERK2
[Figure panels: p38; ERK2. Color scale, % deuterium incorporation in 5 h: > 75%, 51-75%, 25-50%, < 25%]

Effects of phosphorylation on hydrogen exchange are not conserved between related protein kinases
[Figure panels: phospho-p38 vs p38; ppERK vs ERK]
• Could small molecule inhibitors be developed that target regions of regulated flexibility?

Activation of MKK1 leads to enhanced hydrogen exchange in the N terminal lobe
• MKK1 (WT vs DN4/S218E/S222D)

PD184352 binds the region in MKK1 with greatest increase in flexibility.
Mutations block binding, and enhance catalytic activity
Delaney, Printen, Chen, Fauman and Dudley (2002) MCB

Protein interactions regulate MAPK signaling
(centered on ERKs, JNKs, p38 MAPKs, & orthologs)
• Activators: MAP kinase kinases, MEKK1, TAB1
• Scaffolds/Insulators: Ste5, JIPs, MP1, AKAPs, KSR, β-arrestin
• Phosphatases: MKPs, PTPs
• Localization: nucleoporins, PEA15, kinetochores, centrosomes, microtubules
• Allosteric: MKPs, topoisomerase IIα, Ste12
• Substrates: protein kinases, transcription factors, RNA binding proteins, proteases, cytoskeletal regulators, ion channels, lipases, metabolic enzymes (specificity P-X-S/T-P)

DEJL and DEF motifs in human Elk1
[Figure: human Elk1, residues 1-428, with ETS DBD and phosphorylation sites (P) marked]
• DEJL motif: PGKGRKPRDLELPL (residues 310-323)
• DEF motif region: TLSPIAPRSPAKLSFQFPSS (position labels 383, 389, 400)
Lee, T., et al. (2004) Molecular Cell

MKK3b peptide binding to p38 MAPK: changes in HX
[Figure: regions labelled β7-β8 and αD-αE]

Elk1-DEJL peptide binding to ppERK2: changes in HX
[Figure: regions labelled β7-β8 and αD-αE]

Elk1-DEF peptide binding to ppERK2: changes in HX

Overlap between solvent protected and hydrophobic regions
[Figure: structure of Canagarajah et al, 1997; residues labelled D147, pT183, pY185, M197; yellow = hydrophobic residues]

Modeling Elk1-DEF peptide (PRpSPAKLSFQFP) into ppERK2
[Figure: peptide positions labelled from (-2) to (+1) around the phosphoserine; ERK2 residues labelled D147, pY185, V186, A187, W190, Y191, M197, L198, Y231, L232, Y261, L262]

ERK2 mutants disrupt GST-Elk binding, phosphorylation
[Figure: GST-Elk1 pulldowns; ERK2 residues shown: pY185, M197, L198, Y231, L232, L235, Y261, L262]

Conformational changes in ERK2 upon phosphorylation (Canagarajah et al, 1997)
[Figure panels: Unphosphorylated ERK2; Phosphorylated ERK2; activation lip labelled]

Obstruction of the DEF binding site in ERK2
[Figure panels: ppERK2; ERK2. Residues labelled F181, L182, Y185/pY185, M197, L198, Y231, L232, Y261]

Lab members:
• Functional proteomics by 2D-PAGE: Karine Bernard, Yukihito Kabuyama, Tim Lewis, Betsy Litman, Beth Roberts, Rebecca Schweppe
• Kinase dynamics: Michelle Emrick, Andy Hoofnagle, Thomas Lee
• Shotgun Proteomics: Lauren Aveline, Claire Haydon, Karen Jonscher, Kevin Pierce, Will Old, Steve Russell, Tom Cheung, Alex Mendoza, Karen Meyer-Arendt, Joel Sevinsky

Collaborators:
• Natalie Ahn, Lynn Chen
• UCHSC, Denver: David Norris, James Fitzpatrick
• UTSW, Dallas: Betsy Goldsmith