A highly abbreviated introduction to proteomics

A typical shotgun proteomics experiment
• Collect tens of thousands of MS/MS spectra
• Can identify >1,000 proteins from cell lysate
• Orbitrap video: http://apps.thermoscientific.com/media/SID/LSMS/Video/webinar/orbitrap_elite/animation/

Shotgun proteomics identifies proteins from the fragmentation mass spectra of their constituent peptides

Peptide fragmentation
• b & y ions
• Actual peptide tandem (MS/MS) mass spectrum
• Idealized peptide tandem (MS/MS) mass spectrum from database
• Idealized peptide tandem (MS/MS) mass spectrum with PTM (phosphoserine)
Marcotte (2007) Nature Biotechnology 25:755-757

One common strategy for relative quantification = using isotopically labeled samples (e.g. 15N vs. 14N, 13C vs. 12C, etc.)
• SILAC = stable isotope labeling with amino acids in cell culture
• iCAT = isotope-coded affinity tags on cysteines
• iTRAQ = isobaric tags on primary amines, i.e. peptide N-termini and lysines (same mass, different isotopes)
• AQUA = absolute quantification by spiking in isotopically shifted peptide standards for proteins of interest
Mallick & Kuster (2010) Nature Biotechnology 28:695-709

Mass spectrometry strategies for measuring absolute protein abundances for 100s to 1000s of proteins
Adapted from Vogel & Marcotte (2009) Nature Biotechnology 27:825-826

& the current state-of-the-art …
Each 100-200K peptides, from ~10,000 proteins, spanning ~7 orders of magnitude in abundance

A highly abbreviated introduction to large-scale protein interaction screens

X-ray structure of ATP synthase
• Schematic version
• Network representation
• Subunits: α, β, δ, γ, b2, ε, a, c12
• Total set = protein complex
• Sum of direct + indirect interactions

High-throughput yeast two-hybrid
• Bait fused to a DNA-binding domain (DBD)
• Prey fused to a transcription activation domain (Act)
• The DBD binds the operator or upstream activating sequence in front of the reporter gene
• When bait and prey interact, the activation domain is brought to the core transcription machinery, and the reporter gene is transcribed

High-throughput yeast two-hybrid
• Haploid yeast cells expressing activation domain-prey fusion proteins
• Diploid yeast probed with DNA-binding domain-Pcf11 bait fusion protein
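Going back to the peptide fragmentation slides in the proteomics introduction: the "idealized" spectrum for a database peptide is essentially a list of predicted b and y ion masses. The sketch below shows one way to compute those for singly charged fragments, with an optional mass shift for a PTM such as phosphoserine. The function name fragment_ions, the mods argument, and the example peptides are assumptions made for illustration; real search engines also model other ion types, charge states, and intensities.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of the charge-carrying proton
WATER = 18.010565   # y ions keep the C-terminal OH plus an extra H
PHOSPHO = 79.96633  # HPO3 added by phosphorylation of S, T, or Y


def fragment_ions(peptide, mods=None):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    mods is a hypothetical convenience argument: it maps 0-based residue
    positions to a mass shift in Da (e.g. {3: PHOSPHO}).
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0)
              for i, aa in enumerate(peptide)]

    # b ions: N-terminal prefixes b1 .. b(n-1).
    b_ions, running = [], 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)

    # y ions: C-terminal suffixes y1 .. y(n-1).
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


# Illustrative peptides only; they are not taken from the slides.
b_plain, y_plain = fragment_ions("PEPSIDE")
b_phos, y_phos = fragment_ions("PEPSIDE", {3: PHOSPHO})
for i, (plain, phos) in enumerate(zip(b_plain, b_phos), start=1):
    print(f"b{i}: {plain:.4f} -> {phos:.4f}")
```

Only fragments that contain the modified serine shift by the phosphate mass, which is what lets a spectrum localize the PTM to a residue.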
High-throughput complex mapping by mass spectrometry
• Tagged bait → affinity column → SDS-PAGE (protein 1 ... protein 6)
• Trypsin digest, identify peptides by mass spectrometry
• 493 bait proteins, 3617 "interactions"

A variant: tandem affinity purification (TAP)
• Tag1-Tag2-Bait → affinity column 1 + protease → affinity column 2 → SDS-PAGE (protein 1 ... protein 6)
• Trypsin digest, identify peptides by mass spectrometry

Estimating accuracy with a well-determined reference set of interactions

Where we were, more or less, until recently in terms of PPI maps

The current state-of-the-art in animal PPI maps
• ~3,500 affinity purification experiments
• ~11K interactions / ~2.3K proteins
• Spans 556 complexes
• Still daunting for the human proteome
Guruharsha et al. (2011) Cell 147, 690–703

Finding stable protein assemblies by native separations and quantitative mass spec.
• >2,000 biochemical fractions, including replicates
• >9,000 hours mass spec machine time
Havugimana, Hart, et al., Cell (2012)

The profiles cover >½ the experimentally verified proteome & proteins within the same stable complexes co-elute
Havugimana, Hart, et al., Cell (2012)

Turning separations into complexes
1) One separation, #13 of many
4) Inferred complexes
~5600 proteins ...
~120 fractions (fractions 59-64 shown)
Co-separation of the exocyst complex: exoc1, exoc2, exoc3, exoc4, exoc5, exoc6, exoc7, exoc8
2) Pairwise protein correlations
• High correlation >> more likely in complex
2b) External data
• Co-expression, shared protein domains, much more (HumanNet)
• Other AP-MS datasets (Guruharsha 2011, Malovannaya 2011)
Machine learning (SVM, ensemble methods)
3) Inferred interactions
Cluster the inferred interactions into complexes
Guiding and testing the reconstruction with known complexes
Havugimana, Hart, et al., Cell (2012)

A reference map of human protein complexes
• 13,998 high-confidence physical interactions / 3,011 proteins
• Defines >600 complexes: >100 heterodimers, >500 with ≥3 components
Havugimana, Hart, et al., Cell (2012)

In yeast, phenotypes reflect biological modules, e.g., lethality is tied not to the protein, but to the molecular machine
• small nucleolar ribonucleoprotein complex
• SAGA transcription factor / chromatin remodeling complex
• TFIID complex
• protein phosphatase 2A complex
Legend: essential gene, nonessential gene
Hart, Lee, & Marcotte, BMC Bioinformatics 8:236 (2007)

The human protein complexes are also strongly enriched for genes linked to the same diseases and phenotypes
Havugimana, Hart, et al., Cell (2012)

The complexes are strongly enriched for genes linked to the same diseases, e.g., as for Cornelia de Lange Syndrome
• Now confirmed by Deardorff et al., Am. J. Hum. Genet. 90, 1014–1027
• Image sources: prweb.com; Dermatology

Our current state of the art animal complex map
Cuihong Wan, Blake Borgeson, w/ Andrew Emili’s lab

Extending the map
• Now 7 animals, >65 separations, nearly 7,000 mass spec experiments
• >3,000 fractions
• >3,500 fractions
• ~9,000 proteins
• ~12,000 proteins
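Step 2 of "Turning separations into complexes" (pairwise protein correlations across fractions) can be illustrated with a small sketch. This is not the published pipeline, which combines many separations and external data in a machine learning step; it only shows the core idea that proteins in the same stable complex have similar elution profiles. The function names and the toy spectral counts are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same fractions")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / norm


def pairwise_correlations(profiles):
    """Return {(protein_a, protein_b): r} for every unordered protein pair."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical spectral counts across six fractions (invented, not real data).
toy_profiles = {
    "exoc1": [0, 2, 9, 14, 6, 1],
    "exoc2": [0, 3, 10, 12, 5, 0],
    "other": [8, 5, 1, 0, 0, 0],
}

scores = pairwise_correlations(toy_profiles)
for pair, r in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(r, 3))
```

In this toy example the two exocyst subunits peak in the same fractions and correlate strongly, while the unrelated protein does not. A score like this would be one input feature among many, alongside the external data listed in 2b.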