CH 908: MASS SPECTROMETRY
LECTURE 11
Proteomics

Outline
• Definitions and concepts
• Top-down vs. bottom-up proteomics
• Importance of separation in proteomics
• 2D gel electrophoresis
• MS analysis: peptide mass fingerprinting
• For MS/MS sequencing: see lectures 8-10
• Scoring algorithms (MOWSE, Mascot)
• Database searching: importance of constraints
• Importance of high resolution and high mass accuracy for protein ID
• (Post-translational modifications)
• (Phosphorylation)

GENOME AND PROTEOME
• The genome is the complete genetic information (either DNA or, in some viruses, RNA) of an organism
• The proteome is the entire set of proteins expressed by the genome of a cell, tissue or organism. More specifically, it is the set of proteins expressed in a given type of cell at a given time under defined conditions.

A CONCEPT OF PROTEOMICS
[Figure: GENOME → TRANSCRIPTOME → PROTEOME → INTERACTOME. 1) Transcription of a gene (exons 1, 2, 3) into mRNA; 2) alternative splicing (exons 1 + 2 or exons 1 + 3); translation into protein isoforms 12 and 13; post-translational modification into active protein isoforms (12A, 12B, 12C, …; 13A, 13B, 13C, …); protein-protein, protein-DNA and protein-small molecule interactions form the interactome.]

PROTEOMICS
• Proteomics: the study of the proteome using technologies for large-scale protein separation and identification.
• The term “proteomics” was coined in 1994 by Marc Wilkins, who defined it as "the study of proteins, how they're modified, when and where they're expressed, how they're involved in metabolic pathways and how they interact with one another."

WHY STUDY PROTEINS?
• Proteins are the mediators of cell functions
• Differences in protein expression or degree of PTM may denote a disease
• Proteins are drug/therapeutic targets
• => more efficient drug discovery

CASE STUDY – GLEEVEC/GLIVEC
• In the 1980s, it was discovered that a single defective protein (a tyrosine kinase enzyme) could cause chronic myeloid leukemia (CML)
• By the mid-to-late 1990s, a powerful, specific inhibitor of the abnormal cell pathway was developed from a promising but weak-acting inhibitor class of molecules
• Blocking the pathway prevents the uncontrolled proliferation of white blood cells
• The drug is Glivec (imatinib; Novartis)

CHALLENGES: GENOMICS VS. PROTEOMICS
Genome (DNA)
• Static (no change with time)
• Can be amplified (PCR)
• Little sample complexity (4 bases, very similar, same order of concentration)
• Good solubility
Proteome (proteins)
• Dynamic (highly variable with time; many proteomes for one genome)
• Cannot be amplified
• High sample complexity (wide variety of physical and chemical properties; concentrations can differ by 9 orders of magnitude)
• Various solubility; some proteins are insoluble in water

COMPLEXITY OF PROTEOMIC SAMPLES
• Concentration range: 9-10 orders of magnitude
  Example: human plasma: albumin ~ 50% (~ 30-40 mg/ml); cytokines and growth factors, distinct clonal immunoglobulins (millions, single copies, <<pg/ml)
• Protein isoforms, post-translational modifications
  - >250 different kinds of covalent PTMs;
  - examples: phosphorylation, glycosylation, etc.;
  - a protein may have multiple PTM sites and various degrees of modification => different proteins by phys.-chem. properties

COMPLEXITY OF PROTEOMIC SAMPLES: HUMAN PLASMA
Anderson, N. L. (2002) Mol. Cell.
Proteomics 1: 845-867

COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN HETEROGENEITY (1)
http://www.cs.helsinki.fi/bioinformatiikka/mbi/courses/06-07/proteomics/slides/lecture1.ppt

PROTEOMICS GOALS
• Identification of all proteins in a proteome
• Search for new, hypothetical or predicted proteins
• Analysis of differential expression between 2, 3, ... different conditions (protein up- or downregulation)
• Identification of post-translational modifications
• Characterization of proteins by function, pathway, cellular location, etc.
• Study of protein-protein interactions

PROTEOMICS: VARIOUS APPROACHES
• “Bottom-up”: Proteins → separate, digest (or digest, separate: “shotgun proteomics”) → Peptides → separate → MS, MS/MS → database search, analyze → proteins identified. Mass spectrometers: IT/TQ/TOF/ICR...
• “Top-down”: Proteins → separate → MS, MS/MS → analyze (database search) → proteins identified. Mass spectrometer: FT-ICR
• Both strategies are complementary

“TOP-DOWN” PROTEOMICS
• Analysis of intact proteins from complex biological systems
• FT-ICR MS (Fourier-Transform Ion Cyclotron Resonance Mass Spectrometry) is the technique of choice
• MS/MS: ECD (electron capture dissociation) and ETD (electron transfer dissociation) typically provide more uniform dissociation than conventional CID (collision-induced dissociation), while preserving labile modifications

“TOP-DOWN” APPROACH
Williams, E. R. et al. http://www.cchem.berkeley.edu/erwgrp/science_old.html

“TOP-DOWN” APPROACH: A DESCENDANT OF HIGH RESOLUTION TANDEM MASS SPECTROMETRY OF LARGE BIOMOLECULES (EARLY 1990s)
Kelleher, N. L., C. A. Costello, et al. (1995). "Thiaminase I (42 kDa) Heterogeneity, Sequence Refinement, and Active Site Location From High-Resolution Tandem Mass Spectrometry." Journal of the American Society for Mass Spectrometry 6(10): 981-984.

TOP-DOWN APPROACH: THE IMPORTANCE OF HIGH RESOLVING POWER
• ∆m = 71 Da; for the 42 kDa protein a resolving power of M/∆m = 42 000/71 ~ 600 is necessary
• Resolving power up to 3 000 000 is achievable with FT-ICR MS
Kelleher, N. L., C. A.
Costello, et al. (1995). "Thiaminase I (42 kDa) Heterogeneity, Sequence Refinement, and Active Site Location From High-Resolution Tandem Mass Spectrometry." Journal of the American Society for Mass Spectrometry 6(10): 981-984.

TOP-DOWN PROTEOMICS: PRO’S AND CON’S
Advantages
• 100% protein sequence coverage is possible => identification of protein isoforms, proteolytic processing events, and PTMs;
• De novo sequencing;
• Large protein masses are more "information rich", which improves the quality of the information and decreases false positives: it is almost impossible to mis-assign with top-down;
• Large proteins often lose signal peptides or methionines, or are otherwise proteolytically modified after translation, so that they don't correlate to genome sequence databases (the databases need updating to reflect this);
• Localization of non-covalently bound ligands is possible

TOP-DOWN PROTEOMICS: PRO’S AND CON’S
Disadvantages
• Limited sensitivity and throughput
• Pure samples are required
• Insoluble proteins cannot be analyzed
• Expensive instrumentation, expert level users

“BOTTOM-UP” APPROACH
• Bottom-up proteomics is a common method to identify proteins and characterize their amino acid sequences and PTMs by enzymatic digestion of proteins prior to analysis by mass spectrometry
• The proteins may first be purified (e.g., by gel electrophoresis, GE), resulting in one or a few proteins in each enzymatic digest.
• Alternatively, the crude protein extract is digested directly, followed by one or more dimensions of separation of the peptides by liquid chromatography coupled to mass spectrometry (“shotgun proteomics”)

“BOTTOM-UP” PROTEOMICS FLOWCHART
M. L. Fournier, J. M. Gilmore, S. A. Martin-Brown, and M.P. Washburn Multidimensional Separations-Based Shotgun Proteomics Chem. Rev. 2007, 107, 3654-3686

BOTTOM-UP PROTEOMICS: PRO’S AND CON’S
Advantages
• Less sophisticated instrumentation and expertise
• High throughput
• More info about proteins with “extreme” phys.chem.
properties (hydrophobic, high/low MW, acidic/basic)
Disadvantages
• Confidence in protein ID strongly depends on restriction criteria (subjective; potential bias)
• Since protein ID is often done by 1-2 peptides, PTM and isoform information is often lost

SEPARATION IN PROTEOMICS
• It is impossible to resolve all species in a proteomics sample using only one separation method
• Multidimensional separation: two or more independent (“orthogonal”) separation techniques coupled together for the analysis of a single sample.

Separation method                            Separation by
Reversed phase                               Hydrophobicity
Ion exchange, isoelectric focusing (IEF)     Net charge, isoelectric point
Size exclusion, SDS gel electrophoresis      Size, molecular weight
Affinity chromatography                      Specific functional groups

Visualization of peak capacity in both 1D and 2D separations
Identical samples of six species are separated by 1D and 2D techniques. Although the column shown for 1D separation has a theoretical peak capacity of eight (indicated by the boxes below the column), the 1D technique is able to clearly resolve only four distinct peaks. The addition of a second chromatographic dimension greatly improves the theoretical peak capacity (8 × 8 = 64) as shown in the boxes below the columns. The second column is able to improve the separation of overlapped peaks so that clearly resolved peaks from all six species can be identified.
Published in: Marjorie L. Fournier; Joshua M. Gilmore; Skylar A. Martin-Brown; Michael P. Washburn; Chem. Rev. 2007, 107, 3654-3686.
DOI: 10.1021/cr068279a Copyright © 2007 American Chemical Society

PROTEOMICS CLASSIC: 2D GEL ELECTROPHORESIS
[Figure: 2D gel image of brain proteins, about 3000 spots after Coomassie staining. First dimension: immobilized pH gradient (pI); second dimension: SDS PAGE.]
Proteomics in brain research: potentials and limitations. Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis. Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211

STEPS IN 2D GE
• Cell disruption
• Protein solubilization
• Prefractionation (optional, yet recommended)
• Isoelectric focusing (IEF)
• Equilibration of IEF strip
• SDS PAGE
• Detection of protein spots
• Image analysis and spot picking
• Protein spot identification

PROTEIN EXTRACTION
• Detergents: solubilize membrane proteins and separate them from lipids
• Reductants: reduce S-S bonds
• Denaturing agents: disrupt protein-protein interactions and unfold proteins
• Enzymes: digest contaminating molecules (nucleic acids etc.)
• Protease inhibitors
http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf

PREPARATIVE IEF
• The protein mixture is injected into the focusing chamber
• Proteins are focused as in standard IEF
• Vacuum assisted aspiration into sample tubes
• The pH gradient is achieved with soluble ampholytes
• Large amount of proteins (up to 3 g protein)
http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf

GEL STAINS - SUMMARY
Stain                  Sensitivity (ng/spot)   Advantages
Coomassie R-250        50-100                  Simple, fast, consistent
Colloidal Coomassie    5-10                    Simple, fast
Silver stain           1-4                     Very sensitive, awkward
Copper stain           5-15                    Reversible, 1 reagent negative stain
Zinc stain             5-15                    Reversible, simple, fast high contrast neg. stain
SYPRO ruby             1-10                    Very sensitive, fluorescent
[Figure: gel images stained with Ruby red, Silver and Coomassie blue]
1) David Wishart, University of Alberta, Edmonton, AB; 2) http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf

PROTEIN DIGESTION
Why digest the protein?
• Peptides are easier to work with than proteins (smaller, easier to solubilize, etc.)
• Peptide fragments of 6-20 amino acids are ideal for MS analysis and database comparisons (m/z 700 to 2000 is the ideal mass range for most mass analyzers)
• Proteins are cleaved at certain specific amino acid residues (a constraint for database searching)
Trypsin:
- Cleaves on the C-terminal side of the basic residues arginine (R) and lysine (K) => each proteolytic fragment contains a basic residue, a site of proton attachment, and is thus eminently suitable for positive-ion MS;
- Generates peptide fragments of optimal length for MS;
- Robust and easy to use; good activity in gel, in solution and when immobilized on a column or beads

PEPTIDE LENGTH AND NUMBER OF PEPTIDES GENERATED DEPENDING ON ENZYME USED FOR DIGESTION
Other enzymes with more or less specific cleavage:
Advantages of a new proteomic approach that uses accurate mass measurements, LC retention time, isoelectric point and dual enzymatic digestion. Petritis K. et.
al., Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352; ASMS'2007 poster presentation http://www.chem.agilent.com/Library/posters/Public/Petritis_ASMS_2007.pdf

PROTEIN IDENTIFICATION BY MASS SPECTROMETRY: PEPTIDE MASS FINGERPRINTING (PMF)

PMF: MALDI-TOF MS SPECTRA OF TRYPTIC DIGESTS OF TWO PROTEINS
[Figure: two MALDI-TOF mass spectra of tryptic digests; axes: % Intensity vs. Mass (m/z)]

PMF: PROTEIN IDENTIFICATION
Identification is possible for single proteins or mixtures of a small number of proteins (e.g., in-gel digests)

PEPTIDE MASS FINGERPRINTING: DATABASE SEARCH RESULTS
[Figure: sequence coverage map]

PEPTIDE SEQUENCING BY MS/MS
[Figure: peptide backbone and the various types of fragment ions]

MS AND MS/MS FOR PROTEIN IDENTIFICATION
[Figure: MS spectrum and MS/MS spectrum; axes: % Intensity vs. Mass (m/z)]

PEPTIDE SEQUENCE FROM MS/MS SPECTRUM: EGVYVHPV
[Figure: MS/MS spectrum with assigned a, b and y ions; axes: Abs. Int. × 1000 vs. m/z]
Labeled fragment ions (m/z):
- a ions: a2 264.966, a3 363.046, a4 526.081, a5 625.165, a6 762.266
- b ions: b1 234.952, b2 291.945, b3 391.055, b4 554.090, b5 653.172, b6 790.271, b7 887.378
- y ions: y2 214.972, y3 352.069, y4 451.120, y5 614.446, y7 770.388, y8 1004.481
- internal fragments: VH (y4b6), YVH (y5b6), VY (y6b4), GVY (y7b4), GVYV (y7b5), VHP (y4b7)

DATABASE SEARCHING… => Identified proteins

Publicly Available Bioinformatic Tools for Proteomics
Name: resources (World-Wide Web address)
• NCBI: National resource for molecular biology information. A comprehensive, non-identical protein database maintained by NCBI for use with their search tools BLAST and Entrez (http://www.ncbi.nlm.nih.gov/)
• SwissProt: A comprehensive, annotated, non-identical protein sequence database maintained by the Swiss Institute of Bioinformatics (http://www.expasy.org/sprot)
• ExPASy: Proteomics server with a variety of tools (http://www.expasy.org/)
• Swiss2DPAGE: 2-D gel database of various organisms (http://www.expasy.ch/ch2d/)
• ProFound: Protein chemistry and mass spectrometry resource (PROWL) (http://prowl.rockefeller.edu/)
• Protein Prospector: Peptide mass search tools from UCSF (http://prospector.ucsf.edu/prospector/mshome.htm)
• MASCOT: Probability-based search algorithm for peptide and protein identification using MS data (http://www.matrixscience.com)
• SEQUEST: Search algorithm for ESI-MS/MS data (http://fields.scripps.edu/sequest)

DATABASE SEARCH: SCORING
• MOWSE algorithm (used in Protein Prospector, Mascot)
• There is no threshold for a reliable MOWSE score
• It is mainly for putting proteins in order of whether they are likely to be correct => the goal is to separate real matches from random;
• Mascot uses a probability based implementation of the MOWSE algorithm
The MOWSE score is described in the paper: Pappin et al, Current Biology, 1993, Vol 3, No 6, pp 327-332
http://prospector.ucsf.edu/prospector/mshome.htm
http://www.matrixscience.com/help/scoring_help.html
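
The peptide mass fingerprinting idea described above (digest with trypsin, measure peptide masses, compare with masses predicted from database sequences) can be illustrated with a minimal Python sketch. It is not the MOWSE or Mascot algorithm: it only counts peaks that fall within a mass tolerance. The protein sequence, the peak list and the function names are hypothetical, and the cleavage rule is simplified (cleavage after every K or R, no missed cleavages, no proline exception). The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added to the sum of residue masses
PROTON = 1.00728   # charge carrier for the singly protonated ion


def tryptic_digest(sequence):
    """Cleave after every K or R (simplified rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_peaks(observed_mz, peptides, tol_da=0.1):
    """Return (observed m/z, peptide) pairs that agree within tol_da."""
    hits = []
    for mz in observed_mz:
        for pep in peptides:
            if abs(mz - mh_plus(pep)) <= tol_da:
                hits.append((mz, pep))
    return hits


# Hypothetical toy protein and peak list, for illustration only.
peptides = tryptic_digest("GASKVTPRLLEK")
hits = match_peaks([362.20, 500.00], peptides, tol_da=0.1)
print(peptides)
print(hits)
```

Widening `tol_da` lets more peaks match by chance, which is why the mass tolerance matters so much for the significance of a database hit.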
DATABASE SEARCH: MASCOT
• Mascot score = -10*log10(P), where P is the absolute probability that the observed match is a random event
  - A probability of 10^-20 thus becomes a score of 200;
• The lower P (i.e., the higher the score), the less chance that a match is random and, therefore, the more chance that it is real;
• A commonly accepted threshold is that an event is significant if it would be expected to occur at random with a frequency of less than 5%
  - Mascot report: "Scores greater than ... are significant (p<0.05)".
• => The size of the database searched becomes important!
• Constraint parameters decrease the number of database entries searched, thus increasing the level of significance

DIFFERENCE BETWEEN A SIGNIFICANT MATCH AND A CORRECT MATCH
(1) Scores greater than 67 are significant (p<0.05) => the highest score is highly significant
(2) Mass tolerance increased from ±0.1 Da to ±1.0 Da => the best match is still correct, but it is barely significant
(3) Mass tolerance increased to ±2.0 Da => the correct match remains at the top of the list, but because the score is << the significance threshold, there can be no confidence in this match

CONFIDENCE IN DATABASE IDENTIFICATION: THE IMPORTANCE OF SEARCH CONSTRAINTS

SEARCH CONSTRAINTS
“Classic”
• MW, mass accuracy
• pI (protein)
• Enzyme (specificity)
• Species (taxonomy)
• Instrument (=> type of ions in MS/MS)
Proposed
• Retention time (RP HPLC)
• pI (peptide)
• ...

IMPORTANCE OF MASS ACCURACY: ACCURATE MASS TAGS (AMT)
• Richard Smith, Pacific Northwest National Laboratory: “Utility of Accurate Mass Tags for Proteome-Wide Protein Identification” T. P. Conrads, G.A. Anderson, T.D. Veenstra, L. Paša-Tolić, and R.D. Smith Anal.
Chem., 2000, 72 (14), pp 3349–3354
• Accurate mass tag (AMT): the mass of a single peptide measured with such high mass accuracy that it allows unambiguous protein identification
• All the predicted proteins and tryptic peptides generated from the theoretical ORFs in Saccharomyces cerevisiae (yeast; 6117 proteins) and Caenorhabditis elegans (19 098 proteins) were analyzed to determine the mass accuracy needed for unambiguous protein identification on a proteome-wide basis
• The results indicate that the mass measurement accuracy (MMA) required is presently achievable using FT-ICR mass spectrometry

IMPORTANCE OF MASS ACCURACY (CONTINUED)
• Low ppm (i.e., ~1 ppm) level measurements have practical utility for analysis of small proteomes;
• Up to 85% of the peptides predicted from these organisms can function as AMTs at sub-ppm mass accuracy levels attainable using FT-ICR MS;
• Additional constraints should enable even more complex proteomes to be studied at more modest mass measurement accuracies
[Figure panels: 6117 potential proteins; 19 098 potential proteins]
• As the number of potential proteins increases, the identification of proteins solely on the basis of molecular mass becomes more demanding

IMPORTANCE OF MASS ACCURACY AND RESOLVING POWER
Calculated mass spectra for a mixture of all possible 10-mer polypeptides at mass resolutions (R) (and corresponding mass measurement accuracies) of 10^3 (1000 ppm) (A), 10^4 (100 ppm) (B), 10^5 (10 ppm) (C), and 10^7 (0.1 ppm) (D). The number of peptides is the number that are unresolved and would not be distinguishable at the corresponding resolution (i.e., 1 ppm corresponds to a resolution of 10^6, although it should be noted that the attainment of the necessary level of MMA is generally much more important than resolution for the use of the AMT concept).
Percent unique tryptic fragments (potential accurate mass tags) as a function of tryptic fragment mass at four different levels of mass measurement accuracy for the predicted proteins of yeast (A) and C.
elegans (B).

ADDITIONAL CONSTRAINT: PHOSPHOPEPTIDES
The distinctively large mass defect of phosphorus relative to H, C, and O (~0.3 Da) has the net result of offsetting the average mass of phosphopeptides to a slightly lower mass than unmodified peptides of the same nominal molecular weight, often marking a peptide as phosphorylated simply on the basis of its mass.
Predicted percent unique yeast phosphopeptide fragments (potential accurate mass tags) as a function of phosphopeptide fragment mass at four different levels of mass measurement accuracy.

Self Assessment Questions
• What’s the difference between ‘top-down’ and ‘bottom-up’ proteomics? Which works better? Why?
• Chromatography is critical in proteomics; what types of chromatography are most used?
• What’s the dynamic range of proteins in a cell?
• How does accurate mass help in a proteomic experiment?

Percentage of unique peptides as a function of peptide MW under different theoretical conditions of mass accuracy (ppm), retention time (RT), isoelectric point (pI), and “in-solution fragmentation” (ISF)
• ISF: LysC for the 1st digestion and a combination of trypsin and chymotrypsin for the 2nd digestion.
• ISF in combination with accuracies of 5 ppm in mass, +/- 0.05% in RT prediction and +/- 0.5 pH units in IEF provides enough specificity for the peptides with MW > 1000 Da to be identified with high confidence.
• >91% of the peptides with MW > 1000 Da are unique, while >99% of the peptides with MW > 1500 Da are unique.
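
To make the role of mass measurement accuracy concrete, the short Python sketch below converts a ppm tolerance into an absolute mass window and counts how many candidate peptide masses remain indistinguishable from a measured mass. The candidate masses and the measured value are invented for illustration, and the function names are hypothetical.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical, ppm):
    """Absolute mass window (low, high) in Da for a given ppm tolerance."""
    delta = theoretical * ppm * 1e-6
    return theoretical - delta, theoretical + delta


def candidates_within(measured, candidate_masses, ppm):
    """Candidate masses that cannot be excluded at the given ppm tolerance."""
    return [m for m in candidate_masses if abs(ppm_error(measured, m)) <= ppm]


# Invented example: three candidate peptides near 1500 Da.
candidates = [1500.0000, 1500.0100, 1500.1000]
measured = 1500.0005

for tolerance in (100, 10, 1):
    remaining = candidates_within(measured, candidates, tolerance)
    print(tolerance, "ppm:", len(remaining), "candidate(s)")
```

At 1500 Da, 1 ppm corresponds to a window of only ±0.0015 Da, so a single candidate survives in this toy case, whereas at 100 ppm (±0.15 Da) all three remain. This is the sense in which a sufficiently accurate mass can act as a tag for a single peptide.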
POST-TRANSLATIONAL MODIFICATIONS (PTM’S): SOME FACTS
• >250 different kinds of covalent PTM’s are known to date;
• Covalent PTM: derivatization of individual amino acid residues;
• May occur at any stage of protein biosynthesis, but only after the formation of the aminoacyl-tRNA;
• Serve to regulate all biochemical processes in cells;
• Prokaryotes and eukaryotes use different PTM’s as regulation mechanisms

REGULATORY PTM’S
• Should be reversible by nature
Main regulatory types known to date:
• Phosphorylation;
• Acetylation;
• Methylation;
• SS/SH interconversions;
• Glycosylation
• Ubiquitination
• Adenylation;
• Uridylation

CHEMISTRY OF PHOSPHORYLATION
• Hydroxyamino acids are the main acceptors of phosphate groups;
• ~98% of phosphorylated amino acids in proteins are Ser-P;
• >99% are Ser-P and Thr-P together;
• Tyr-P content is <0.01% (though it is very important)

PHOSPHOPROTEINS: A CHALLENGE FOR MS ANALYSIS
• Low abundance (partially phosphorylated proteins; phosphorylation stoichiometry range: 4.5-100%)
• Ionization efficiency for phosphopeptides is ~1/10 of that of unmodified peptides (phosphate is in anionic form => poor proton acceptor)
• => Low sensitivity
• Multiply phosphorylated peptides carry a substantial (-) charge => 3 phosphates: no signal!
• Poor fragmentation (main fragmentation channel is loss of phosphate, which is not informative about the sequence):
  - abundant [MH-H3PO4]+ peak for Ser-P and Thr-P;
  - abundant [MH-HPO3]+ peak for Tyr-P;

Two-dimensional (2D) peptide separation methods for shotgun proteomics analysis
(a) This method couples two liquid chromatography separations. In the first dimension, peptides are separated on the basis of charge or affinity and in the second dimension, on the basis of hydrophobicity. The two liquid chromatography separation methods can be coupled in offline or online modes. The online modes can be performed by MudPIT or a column-switching system.
(b) This method couples a first separation based on the isoelectric point and a second separation by liquid chromatography based on hydrophobicity. In the first dimension peptides can be separated by isoelectric focusing through electrophoresis on immobilized pH gradient gel strips (IPG-IEF) or in solution, by capillary electrophoresis (CIEF) or free-flow electrophoresis (FFE-IEF). The IPG-IEF or FFE-IEF and CIEF systems are respectively coupled with the liquid chromatography method in off-line and on-line modes.
(c) A third separation method couples liquid chromatography separation based on hydrophobicity with capillary zone electrophoresis. The separation systems are interfaced in an online mode.
Published in: Marjorie L. Fournier; Joshua M. Gilmore; Skylar A. Martin-Brown; Michael P. Washburn; Chem. Rev. 2007, 107, 3654-3686. DOI: 10.1021/cr068279a Copyright © 2007 American Chemical Society

“DIVIDE AND CONQUER”!
• Separation is key
• All data will go through database search, therefore...
• => Gather as much information about a sample as possible:
  - Sample: species, organelle, fraction;
  - Proteins: MW (m/z), the higher the mass accuracy, the better; pI;
  - Peptides: MW (m/z), the higher the mass accuracy, the better; retention time (new algorithms); pI; enzyme used for digestion; specific functional groups or amino acids (affinity separation);
• Restrictions on database search (constraints) will limit choices and increase confidence in identification!
(discussed in more detail later)

DIFFERENTIAL SOLUBILIZATION
Protein sample
→ Extraction with 40 mM Tris base: supernatant = Fraction 1; pellet goes on
→ Extraction with 8 M urea, 4% CHAPS: supernatant = Fraction 2; pellet goes on
→ Extraction with 5 M urea, 2 M thiourea, 2% CHAPS, 2% SB3: supernatant = Fraction 3; pellet goes on
→ Extract with SDS: Fraction 4
David Wishart, University of Alberta, Edmonton, AB

SUBCELLULAR FRACTIONATION
[Figure: 2D gels of human mitochondrial proteins and human nuclear proteins]
David Wishart, University of Alberta, Edmonton, AB

ISOELECTRIC FOCUSING
• Separation of proteins by isoelectric point (pI) in pH gradient strips (ampholytes, immobilines)
• pI is the pH at which the net charge of a protein is zero (defined by the pKa values of the charged side-chain groups of its amino acids)
• A protein moves in the pH gradient until it reaches the point where its net charge becomes 0
• Carried out in Immobiline gels or in (semi-)preparative IEF devices
• Advantages: proteins in native state; proteins concentrated; pI data; preparative IEF is possible (high protein loads)

COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN HETEROGENEITY (1)
[Figure: panels A and B, γ-enolase]
Partial 2D-gel images showing γ-enolase from human brain. The protein is represented by one spot when IEF was performed on pH 3-10 non-linear IPG strips (A), and by six spots when IEF was performed on pH 4-7 strips (B). Better resolution, more information!
Proteomics in brain research: potentials and limitations. Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis. Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211

COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN HETEROGENEITY (2)
Glial fibrillary acidic protein (GFAP) is considered to be a specific marker in Alzheimer’s disease. The determination of GFAP expression is confounded by the many isoforms and post-translationally modified forms observed in brain.
Figure: Two-dimensional gel analysis of the thalamus brain region from a patient with Alzheimer’s disease. The spots indicated represent GFAP.
This protein is usually represented by more than 50 spots.
Proteomics in brain research: potentials and limitations. Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis. Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211

2D GE
Advantages
• Provides a hard-copy record of separation; a map of intact proteins which reflects changes in protein expression, isoforms or PTM
• Separation of up to 9000 different proteins (~2000 routinely)
• Able to resolve proteins whose pI values differ by around 0.001 pH units
• Detect and quantify <1 ng of protein per spot
• Highly reproducible
• Provides accurate info for database searching (Mw, pI and PTMs)
• Inexpensive
Disadvantages
• Limited pI range (2 to 11, but 4-8 routinely)
• Proteins >150 kD are not seen in 2D gels
• Membrane (hydrophobic) proteins are underrepresented (>30% of all proteins)
• Only detects high abundance proteins (top 30% typically)
• Multiple proteins in one spot (~30% of all spots)
• Time consuming
David Wishart, University of Alberta, Edmonton, AB
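
Since pI is used above both as a separation principle (IEF) and as a database search constraint, a rough Python sketch of how a theoretical pI can be estimated from a sequence may be helpful. It follows the definition given in the isoelectric focusing slide (the pH at which the net charge is zero) and finds that pH by bisection using the Henderson-Hasselbalch relation. The pKa values are approximate textbook-style numbers chosen as assumptions for this sketch; published pKa sets differ, and real prediction tools may use other values and corrections. The function names are hypothetical.

```python
# Approximate pKa values (assumed for illustration; published sets differ).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_term": 3.1}


def net_charge(sequence, ph):
    """Net charge of a peptide or protein at a given pH."""
    # Terminal groups: protonated N-terminus (+), deprotonated C-terminus (-).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["N_term"]))
    charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["C_term"] - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    The net charge decreases monotonically with pH, so bisection converges.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid   # still positively charged: pI lies at higher pH
        else:
            hi = mid   # negatively charged: pI lies at lower pH
    return (lo + hi) / 2.0


# Toy peptides: a basic one and an acidic one.
for peptide in ("KKKK", "DDDD", "EGVYVHPV"):
    print(peptide, round(isoelectric_point(peptide), 2))
```

A value estimated this way can be compared with the pI observed on a gel or in peptide IEF and used to narrow the list of database candidates, within the uncertainty of the chosen pKa set.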