Outline
• What is proteomics?
• Why study proteins?
• Discuss proteomic tools and methods

What is proteomics?
Proteomics is the analysis of the protein complement to the genome.
[Diagram: Gene (genomics) → Transcript → Protein (proteomics)]

“..the large-scale study of proteins…while it is often viewed as the “next step”, proteomics is much more complicated than genomics.”
“…while the genome is a rather constant entity, the proteome differs from cell to cell and is constantly changing through its biochemical interactions with the genome and the environment.”
“One organism will have radically different protein expression in different parts of its body, in different stages of its life cycle and in different environmental conditions.”
Wikipedia, http://en.wikipedia.org

Proteomics is multidisciplinary
[Diagram: fields contributing to proteomics: Protein Biochemistry, Biology, Analytical Chemistry, Bioinformatics, Molecular Biology]

Proteomics Research
• Basic research: To understand the molecular mechanisms underlying life.
• Applied research: Clinical testing for proteins associated with pathological states (e.g. cancer).

Applications of Proteomics
[Diagram: application areas arranged around "Proteomics": Proteome Mining, Protein Expression Profiling, Posttranslational Modifications, Functional Proteomics, Structural Proteomics, Protein-protein Interactions, Drug Discovery, Target ID, Differential Display, Medical Microbiology, Signal Transduction, Disease Mechanisms, Phosphorylation, Proteolysis, Glycosylation, Yeast Genomics, Affinity Purified Protein Complexes, Mouse Knockouts, Organelle Composition, Subproteome Isolation, Protein Complexes, Yeast two-hybrid, Co-precipitation, Phage Display]

For example: Hemoglobin
Picks up oxygen in the lungs, travels through the blood, and delivers it to the cells.
[Diagram: hemoglobin, made of two Hbα and two Hbβ subunits, carrying O2]

Sickle cell disease is caused by a single amino acid change.
Normal Hbβ:  ATG GTG CAC CTG ACT CCT GAG GAG …   M V H L T P E E …
Mutated Hbβ: ATG GTG CAC CTG ACT CCT GTG GAG …   M V H L T P V E …

Summary – what is proteomics?
• Involves the study of proteins
• Proteomics is multidisciplinary
• Proteomics is being applied to both basic and clinical research

Why study proteins?

What are PROTEINS?
Proteins are large, complex molecules that serve diverse functional and structural roles within cells.

Proteins do most of the work in the cell
• Enzyme: Protease – degrades protein
• Motion: Actin – contracts muscles
• Transport: Hemoglobin – carries O2
• Regulation: Insulin – controls blood glucose
• Defense: Antibody – fights viruses
• Support: Keratin – forms hair and nails

Proteins are composed of amino acid building blocks
[Figure: general structure of an amino acid, with an acid (carboxyl) group, a base (amino) group, and a variable R group. Amino acid 1 (R1) and amino acid 2 (R2) join through a peptide bond, releasing H2O, to form a dipeptide.]

Each amino acid has unique chemical properties.
• basic: Histidine, Lysine, Arginine
• acidic: Aspartate, Glutamate
• non-polar (hydrophobic): Valine, Alanine, Isoleucine, Leucine, Proline, Methionine, Phenylalanine
• polar (hydrophilic): Serine, Glycine, Tryptophan, Cysteine, Glutamine, Tyrosine, Asparagine, Threonine

Proteins are chains of amino acids.
Short chains of amino acids are called peptides.
Proteins are polypeptides: long chains that contain many amino acids.

Translation is the synthesis of proteins in the cell.
[Figure: the gene is transcribed in the nucleus into messenger ribonucleic acid (mRNA). In the cytoplasm, the ribosome (large and small subunits) reads the mRNA from 5’ to 3’ while amino acid-charged transfer RNAs (tRNA) deliver Met, Ala and Trp and leave as empty tRNAs.]
mRNA: A U G  G C C  U G G  U A G
• Codon 1: A U G = Methionine
• Codon 2: G C C = Alanine
• Codon 3: U G G = Tryptophan
• Codon 4: U A G = Stop

Proteins have specific architecture
http://www.path.cam.ac.uk/~mrc7/igs/mikeimages.html

Proteins arrive at their final structure in an ordered fashion
J. E. Wampler, 1996, http://bmbiris.bmb.uga.edu/wampler/tutorial/prot0.html

Summary – why study proteins?
• Biological workhorses that carry out most of the functions within the cell
• Serve diverse functional and structural roles
• Composed of amino acids that are covalently linked by peptide bonds
• Synthesized during the translation process
• Must fold correctly to perform their functions

Proteomic tools and methods

Proteomic tools to study proteins
• Protein isolation
• Protein separation
• Protein identification

Protein Isolation

How are proteins isolated?
• Mechanical Methods
  – grinding – breaks open cells
  – centrifugation – removes insoluble debris
• Chemical Methods
  – detergent – breaks open cell compartments
  – reducing agent – breaks specific protein bonds (disulfide bonds)
  – heat – disrupts the folded structure to “linearize” the protein (peptide bonds stay intact)

Protein isolation procedure
1. Find a sample
2. Pick it
3. Grind sample in buffer
4. Transfer to tube
5. Heat the sample
6. Centrifuge to remove insoluble material
7. Recover supernatant (the “pure” protein solution)
8. Keep solution for gel analysis
[Figure: the “pure” protein solution contains Protein X among many other proteins; the goal is isolated Protein X.]

Summary – protein isolation
• Proteins can be isolated from a variety of samples
• Proteomics includes the use of both mechanical and chemical methods to isolate proteins
• Opening cells or cellular compartments
• Breaking bonds and “linearizing” proteins
• Removing cell debris

Protein Separation
SDS-PAGE

Why separate proteins?
[Figure: two tubes of “PURE” protein solution]
• Tube 1: increased complexity, decreased protein ID
• Tube 2: decreased complexity, increased protein ID

How to separate proteins?
Intact proteins are separated by taking advantage of their diversity in physical properties, especially isoelectric point and molecular weight.

Methods of Protein Separation
• Sodium Dodecyl Sulfate – Polyacrylamide Gel Electrophoresis (SDS-PAGE)
• Isoelectric Focusing (IEF)

SDS-PolyAcrylamide Gel Electrophoresis (SDS-PAGE) is a widely used technique to separate proteins in solution

SDS-PAGE separates only by molecular weight
• Molecular weight is the mass of one molecule
• Dalton (Da) is a small unit of mass used to express atomic and molecular masses.

PAGE is widely used in
• Proteomics
• Biochemistry
• Forensics
• Genetics
• Molecular biology

Polyacrylamide gels separate proteins and small pieces of DNA
• Major components of polyacrylamide gels
  – Acrylamide – matrix material / NEUROTOXIN
  – Bis-acrylamide – cross-linking agent / NEUROTOXIN
  – TEMED – catalyst
  – Ammonium persulfate – free radical initiator

Polymerization
[Figure: acrylamide (matrix material) and bis-acrylamide (cross-linking agent) polymerize in the presence of TEMED (catalyst) and ammonium persulfate (free radical initiator) to give polyacrylamide chains (non-toxic) joined by bis-acrylamide cross links.]

Sodium dodecyl sulfate - SDS
The anionic detergent SDS unfolds or denatures proteins
• Uniform linear shape
• Uniform charge/mass ratio

One-dimensional polyacrylamide gel electrophoresis (SDS-PAGE)
[Figure: gel with lanes Standard, Sample1 and Sample2; cathode (-) at the top, anode (+) at the bottom.]

During SDS-PAGE proteins separate according to their molecular weight
[Figure: the same gel after the run. Standard bands at 150, 100, 75, 50, 37, 25 and 20 kDa from cathode (-) to anode (+); the bromophenol blue dye front runs ahead of the proteins.]

Image of a real SDS-PAGE gel
[Figure: stained gel with markers at 250, 150, 100, 75, 50, 37, 25 and 20 kDa from cathode to anode.]

Separation of Protein X
[Figure: standard bands from 150 kDa down to 11 kDa; Protein X runs at 25 kDa, above the bromophenol blue dye front.]

Two-dimensional gel electrophoresis (2-DGE)
• 1st dimension - isoelectric focusing
• 2nd dimension - SDS-PAGE
• Most widely used protein separation technique in proteomics
• Capable of resolving thousands of proteins from a complex sample (e.g. blood, organs, tissue…)

1st Dimension - Isoelectric Focusing
Isoelectric focusing (IEF) is the separation of proteins according to native charge.
Isoelectric point (pI) - the pH at which net charge is zero

2-DGE
[Figure: protein samples are first focused along a pH gradient from 3 to 10 (1st dimension, IEF); each protein stops where it is neutral, e.g. at pH 3. The strip is then run on SDS-PAGE (2nd dimension) against standards from 150 kDa down to 11 kDa.]
[Figure: 2-D gel of Arabidopsis developing leaf; horizontal axis pI 3 to 10, vertical axis mass in kDa (100, 75, 50, 25).]
[Figure: on the 2-D gel, Protein X appears as a spot at 25 kDa, pI 5.]

1-DGE vs. 2-DGE
1-DGE (SDS-PAGE)
• High reproducibility
• Quick/Easy
• Separates solely based on size
• Modest resolution, dependent on complexity of sample
2-DGE
• Modest reproducibility
• Slow/Demanding
• Separates based on pI and size
• High resolution, not dependent on complexity of sample

Summary – protein separation
• Protein separation takes advantage of physical properties such as isoelectric point and molecular weight
• SDS-PAGE is a widely used technique to separate proteins
• 1-DGE is a quick and easy method to separate proteins by size only
• 2-DGE combines isoelectric focusing (IEF) and SDS-PAGE to separate proteins by pI and size

Protein identification
mass spectrometry

Peptide mass fingerprinting
1. Make proteolytic peptide fragments - Digest the intact protein into peptides (using trypsin)
2. Measure peptide masses - “Weigh” the peptides in a mass spectrometer (spectrum of intensity vs. m/z)
3. Match peptide masses to protein or nucleotide sequence database - Compare the data to known proteins and look for a match
Example peptide masses: 952.0984, 1895.9057, 1345.6342, 899.8743, 2794.9761 → Protein ID

Protein digestion
We use the enzyme TRYPSIN to digest (cut) proteins into peptides
– trypsin cuts after Lysine (K) and Arginine (R)

Protein X
????????K?????R????????

How does mass spectrometry identify unknown proteins?

Basics of mass spectrometry
• determination of mass to charge ratio (m/z)
• Mass spectrometer = very accurate weighing scales – third or fourth decimal place

We then “weigh” these peptides with a Mass Spectrometer
????????K = 1106.55 Da
?????R = 692.31 Da
???????? = 1002.37 Da

Masses of the peptides should be compared to theoretical masses of known peptides

Computation of theoretical masses of known peptides
Known proteome = all protein sequences
Digest the proteome with simulated trypsin to obtain “computer peptides”

Computer Peptides     Mass (Da)
• WEGETMILK           1106.55
• ADEMTYEK            1105.23
• PLMEHGAK            1089.50
• LMEHHH              782.25
• ASTEER              692.31
• DMGEYIILES          1056.92
• EGEDMPAFY           1002.35
• CYHGMEI             984.36
• EFPKLYSEK           900.56
• YSEPYSSIIR          1102.34
• IESPLMIA            864.35
• AEFLYSR             600.21
• DLMILIYR            864.97
• METHIPEEK           795.36
• KISSMER             513.21
• PEPTIDEK            456.23
• MANYCQWS            1002.37
• TYSMEDGHK           678.46
• YMEPSATFGHR         995.46
• GHLMEDFSAC          896.35
• HHFAASTR            564.88
• ALPMESS             469.12

The masses of the unknown peptides are matched to the theoretical masses of all known peptides, using a computer program.

The unknown peptides have been identified
????????K = 1106.55 Da = WEGETMILK
?????R = 692.31 Da = ASTEER
???????? = 1002.37 Da = MANYCQWS

Protein X has been identified
????????K?????R????????
WEGETMILK ASTEER MANYCQWS

Summary – tools to study proteins?
• Proteins are digested into peptides
• Peptides are analyzed with a mass spectrometer
• Match observed peptide masses to theoretical masses of all peptides in database
• Assemble those peptide matches into a protein identification

Concluding points about Proteomics
- Proteomics is the analysis of all proteins
- Interdisciplinary research
- Essential to both basic and clinical research
- Proteins are the workhorses of the cell
- Discovery research – drugs and diseases
- Proteomics tools allow identification of proteins

Questions
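
Appendix: peptide mass fingerprinting as code
The three steps of peptide mass fingerprinting from the protein identification slides (digest, weigh, match) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used for the slides: the function names, the two-entry database and the "Decoy protein" sequence are hypothetical, the cleavage rule is the simplified "cut after K or R" from the slides (real trypsin rules have exceptions, such as a following proline), and the observed values are treated as singly protonated peptide masses computed from standard monoisotopic residue masses.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-terminus)
PROTON = 1.007276   # assumption: observed values are singly protonated


def tryptic_digest(sequence):
    """Simulated trypsin: cut after every K or R (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Theoretical monoisotopic mass of a neutral peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.02):
    """Number of observed masses explained by the protein's peptides."""
    theoretical = [peptide_mass(p) + PROTON for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed
    )


def identify(observed, database, tolerance=0.02):
    """Score every database protein and return the best-scoring name."""
    scores = {
        name: count_matches(observed, sequence, tolerance)
        for name, sequence in database.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-protein "proteome"; Protein X is the slide example.
DATABASE = {
    "Protein X": "WEGETMILKASTEERMANYCQWS",
    "Decoy protein": "ADEMTYEKPLMEHGAKLMEHHH",
}
OBSERVED = [1106.55, 692.31, 1002.37]  # masses from the slides

if __name__ == "__main__":
    best, scores = identify(OBSERVED, DATABASE)
    print(best, scores)
```

With these inputs all three observed masses fall within the 0.02 Da tolerance of a Protein X peptide and none match the decoy, so Protein X is reported. Real search engines score matches statistically rather than by a simple count, and the tolerance depends on the instrument.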