Biochemistry 462a - Proteins: Primary Sequence

Reading - Chapter 5
Practice problems - Chapter 5 - 2,5; Proteins extra problems
Levels of Structure
The function of a protein can only be understood in terms of its structure. The three dimensional
structures of many proteins have been determined and from these structures a few general
principles can be derived. Protein structure is discussed in terms of four levels of organization:




Primary Structure is the amino acid sequence of a protein's polypeptide chain(s). Every protein has
a unique amino acid sequence.
Secondary Structure is the spatial arrangement of the polypeptide backbone, ignoring the
conformation of the sidechains.
Tertiary Structure is the three dimensional structure of the entire polypeptide.
Quaternary Structure refers to the three dimensional structure of proteins that are
composed of two or more polypeptide chains, called subunits.
Primary Structure
This is the primary structure of bovine
insulin, which is composed of two
polypeptide chains (A and B). The two
polypeptide chains are joined by two
interchain disulfide bonds - the A chain
also contains an intrachain disulfide bond.




Determining the amino acid sequence of a protein used to be a very laborious and time-consuming process involving chemical and enzymatic degradation.
Today, the amino acid sequence of proteins is usually determined from the nucleotide
sequence of the gene - a relatively simple and rapid process.
The amino acid sequence of the same protein from many sources, e.g., cytochrome c, shows
that some amino acid residues are conserved among all the proteins, whereas others are not
conserved.
Such an analysis provides valuable information about amino acid residues that may be
essential for a protein's function.
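Deducing an amino acid sequence from a gene sequence amounts to reading the coding strand three nucleotides at a time. The following is a minimal sketch of that idea, assuming the standard genetic code and a DNA string that is already in the correct reading frame; the function name `translate` and the example sequence are illustrative and not taken from the course material.

```python
# Standard genetic code in its compact 64-letter form: codons are
# enumerated with the bases in the order T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [x + y + z for x in BASES for y in BASES for z in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate an in-frame coding DNA string into one-letter amino acids.

    Translation stops at the first stop codon; trailing nucleotides that
    do not form a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative (made-up) coding sequence: ATG AAA CGT ACC TAA
print(translate("ATGAAACGTACCTAA"))  # MKRT
```

A real gene would also require locating the correct reading frame and, in eukaryotes, removing introns, which this sketch does not attempt.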
The importance of amino acid side chains: Real Life Example - sickle cell hemoglobin






Hemoglobin is the oxygen transport protein in blood.
It is a tetramer containing two α and two β chains.
Hemoglobin exists in two states: an oxy form and a deoxy form.
Several hundred mutant hemoglobins are known to exist. In most, a single amino acid
replacement occurs in either the α or the β chain of normal Hb A.
Many of these changes cause no known effect, but several lead to pathologies associated with
abnormal O2 transport.
In sickle cell hemoglobin, HbS, a single amino acid is replaced: Val substitutes for Glu at
position 6 of the β chain. This seemingly innocuous change places a hydrophobic sidechain
on the surface of the protein. In the deoxy conformation the Val sidechain of a β chain in
one Hb binds to the β chain of another Hb. This leads to polymer formation and precipitation
of the deoxy Hb, which in turn causes red cell lysis and anemia.
Amino Acid Composition



The amino acid composition is a fundamental
characteristic of any protein.
Hydrolysis of the protein in acid releases the
amino acids that are then quantitated using ion
exchange chromatography in an automated
amino acid analyzer.
The amino acid peaks are detected using
ninhydrin, which reacts with the free amino
groups of amino acids to produce a purple
color.
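When the sequence of a protein is already known, its expected composition can be tallied directly, which is useful for comparison with the analyzer result. The sketch below assumes a one-letter amino acid string; the function name `amino_acid_composition` is illustrative. It counts residues in the intact sequence and does not model any changes that occur during acid hydrolysis.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return {residue: (count, mole percent)} for a one-letter sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {
        residue: (count, 100.0 * count / total)
        for residue, count in sorted(counts.items())
    }


# Example input: sequence #1 from the Sequence Homology section of these notes.
example = "MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA"
for residue, (count, percent) in amino_acid_composition(example).items():
    print(f"{residue}: {count:2d}  {percent:5.1f}%")
```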
Amino Acid Sequence
The amino acid sequence of each protein is unique, and determining it is an
important part of characterizing a protein. Today, most protein amino acid sequences are
deduced from the sequences of their genes, because sequencing DNA is much easier than
sequencing proteins.
However, determination of protein sequences is still an important tool in Biochemistry. The
standard method is an automated process based on the Edman reaction, in which the
amino-terminal residue is removed in each cycle and the released PTH-derivative is
identified by chromatography.
Although each cycle proceeds with a yield above 90%, after about 25 cycles it becomes
difficult to detect the newly released product. So a single Edman degradation is not able to
determine the entire sequence of a protein.
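The limit on the number of cycles follows from simple arithmetic: if each cycle succeeds with fractional yield p, the fraction of chains still in step after n cycles is roughly p to the power n. The sketch below illustrates this under the simplifying assumption of a constant per-cycle yield; the function name and the example yields are illustrative. At 90% per cycle, only about 7% of the material remains in step after 25 cycles.

```python
def cumulative_yield(per_cycle_yield, cycles):
    """Fraction of chains still in step after a number of Edman cycles,
    assuming the same fractional yield in every cycle."""
    return per_cycle_yield ** cycles


for per_cycle in (0.90, 0.95, 0.99):
    row = ", ".join(
        f"{n} cycles: {cumulative_yield(per_cycle, n):.3f}"
        for n in (1, 10, 25, 50)
    )
    print(f"yield per cycle {per_cycle:.2f} -> {row}")
```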
What is needed is a new amino terminus. This is accomplished by degrading the protein with a
proteolytic enzyme, such as trypsin, which generates a number of peptides that can be separated
and sequenced.

Trypsin cleaves peptide bonds on the carboxyl side of Lys or Arg residues, as illustrated below.


Chymotrypsin cleaves peptide bonds on the carboxyl side of Phe, Trp or Tyr residues.
Other proteases have different specificities, which provides a variety of ways to fragment
the protein under investigation.
The problem, of course, is that once the proteolysis has been accomplished and the peptides
separated, you don't know how they are ordered in the original protein. Reestablishing the
order is the big problem in protein sequencing.
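The cleavage rules stated above for trypsin and chymotrypsin can be sketched as a simple in silico digest. This is a simplified model that cuts after every listed residue and ignores any exceptions to the rules; the function name `cleave` and the example peptide are illustrative. Comparing the two peptide sets shows why digests with different specificities help with ordering: fragments from one digest span the cut sites of the other.

```python
def cleave(sequence, residues):
    """Cut a one-letter sequence on the carboxyl side of each listed residue."""
    peptides = []
    start = 0
    for i, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


TRYPSIN = "KR"         # cleaves after Lys or Arg
CHYMOTRYPSIN = "FWY"   # cleaves after Phe, Trp or Tyr

peptide = "AKGRFW"  # made-up example
print(cleave(peptide, TRYPSIN))       # ['AK', 'GR', 'FW']
print(cleave(peptide, CHYMOTRYPSIN))  # ['AKGRF', 'W']
```

Here the chymotryptic peptide AKGRF overlaps all three tryptic peptides, which fixes their order as AK, GR, FW.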

Mass Spectrometry

Recently mass spectrometry has become an important technique in peptide/protein
chemistry. Mass spectrometers consist of three basic parts:




An ion source that creates charged molecules in the gas phase
A mass analyzer that uses a physical property, e.g., time-of-flight (TOF), to separate ions
A detector
Two important methods are used to create protein ions:



In matrix-assisted laser desorption ionization (MALDI) ions are created by using a laser
to excite proteins in a crystalline matrix. MALDI is particularly suited for determining the
molecular weight of proteins, often to accuracies of a few parts per million. The spectrum
shown above illustrates the molecular masses of several peptides in a mixture.
In electrospray ionization (ESI) ions are created by applying a potential to a flowing liquid.
This causes the liquid to spray and protein ions to be created. This method can also be used
to measure molecular weight, but is most powerful when used in tandem MS/MS.
A tandem mass spectrometer combines two mass analyzers with a method to energetically
activate ions. In the first mass analyzer a particular ion is isolated from all other ions that
enter it (as marked above). The isolated ion is then dissociated, and the m/z values of the
dissociation products are determined in the second mass analyzer. The dissociation process causes covalent
bonds to fragment. In the case of peptide ions, fragmentation processes predominate at or
around the amide bond, creating a ladder of ions that is indicative of an amino acid sequence,
as illustrated below.
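The way a fragment ladder encodes a sequence can be sketched numerically: successive fragments that differ by one residue differ in mass by that residue's mass. The example below is a simplified illustration, assuming singly charged fragments that retain the amino terminus (commonly called b ions) and approximate monoisotopic residue masses. The mass table, the function names and the tolerance are assumptions of this sketch, not values from the course material.

```python
# Approximate monoisotopic residue masses in Da (assumed values for this sketch).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # mass of the charge-carrying proton


def b_ion_ladder(peptide):
    """m/z values of singly charged N-terminal fragments of increasing length."""
    total = PROTON
    ladder = []
    for amino_acid in peptide:
        total += RESIDUE_MASS[amino_acid]
        ladder.append(round(total, 4))
    return ladder


def read_ladder(mz_values, tolerance=0.01):
    """Assign a residue to each step of the ladder from the mass difference.

    Leu and Ile have identical masses, so such a step is reported as 'L/I'.
    Steps that match no residue are reported as '?'.
    """
    residues = []
    previous = PROTON
    for mz in mz_values:
        step = mz - previous
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - step) <= tolerance]
        residues.append("/".join(matches) if matches else "?")
        previous = mz
    return residues


ladder = b_ion_ladder("MKRT")
print(ladder)
print(read_ladder(ladder))  # ['M', 'K', 'R', 'T']
```

Real spectra are noisier than this: some fragments are missing, several ion series overlap, and Lys and Gln differ by only about 0.036 Da, so the tolerance has to match the instrument's mass accuracy.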
Sequence Homology

Once the amino acid sequence of a protein has been determined, there are powerful computer
programs that can be used to determine if the sequence is similar to other proteins. Such a
search might give the results shown below.
#1  MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA
#2  MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA
#3  MKRTYQPSRVKRNRKFGFRARMKTKGGRLILSRRRAKGRMKLTV
#4  MKRTFQPSILKRNRSHGFRTRMATKNGRYILSRRRAKLRTRLTV
#5  MKRTYQPSKQKRNRTHGFRARMATKNGRQVLNRRRAKGRKRLTV
#6  TKRTFQPNNRRRARKHGFRARMRTRAGRAILSARRGKNRAELSA
#7  SKRTFQPNNRRRAKTHGFRLRMRTRAGRAILANRRAKGRASLSA
#8  GKRTFQPNNRRRARVHGFRLRMRTRAGRSIVSDRRRKGRRTLTA
The degree of identity
between the sequences can be
used to construct a distance
matrix, which indicates how
closely related the different
sequences are. Here is one
for cytochrome c from a
variety of species.
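For sequences that are already aligned and of equal length, such as the eight listed above, identity and distance can be computed position by position. The sketch below is one simple way to do this, taking the number of differing positions as the distance; the function names are illustrative, and real comparisons usually need an alignment step that allows for gaps.

```python
def count_differences(a, b):
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a, b):
    """Percentage of aligned positions that carry the same residue."""
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


def distance_matrix(sequences):
    """Symmetric matrix of pairwise difference counts."""
    return [[count_differences(a, b) for b in sequences] for a in sequences]


# Sequences #1 and #2 from the list above.
seq1 = "MKRTYQPNRRKRSKVHGFRARMSTKNGRKVLARRRRKGRKVLSA"
seq2 = "MKRTWQPSKLKHARVHGFRARMATKNGRKVIKARRAKGRVRLSA"
print(f"{percent_identity(seq1, seq2):.1f}% identical")
for row in distance_matrix([seq1, seq2]):
    print(row)
```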
Based on such a distance
matrix, one can then construct
a phylogenetic tree, as
illustrated here for
cytochrome c.
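One simple way to turn a distance matrix into a tree is average-linkage clustering (UPGMA): repeatedly join the two closest clusters and average their distances to all others. The sketch below uses that method only as an illustration; it is not necessarily how the cytochrome c tree was built, and the function name, the nested-parenthesis output format and the example distances are assumptions of this sketch.

```python
def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested parentheses.

    names: list of taxon labels; dist: symmetric matrix of pairwise distances.
    Branch lengths are not computed in this sketch.
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (label, size)
    d = {(i, j): float(dist[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        label_a, size_a = clusters.pop(a)
        label_b, size_b = clusters.pop(b)
        del d[(a, b)]
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average distance from the merged cluster to c
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = (f"({label_a},{label_b})", size_a + size_b)
        next_id += 1
    (label, _size), = clusters.values()
    return label


# Made-up distances: A and B are close, C and D are close.
names = ["A", "B", "C", "D"]
dist = [
    [0, 2, 10, 10],
    [2, 0, 10, 10],
    [10, 10, 0, 4],
    [10, 10, 4, 0],
]
print(upgma(names, dist))  # ((A,B),(C,D))
```

UPGMA assumes that all lineages change at the same rate, so other tree-building methods are often preferred when that assumption is doubtful.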
Genomics and Proteomics
There is a great deal of activity directed towards determining the complete sequence of the human
genome (genomics), and several other genomes are also being sequenced, e.g., yeast has been
done and the fruit fly Drosophila melanogaster will be finished soon. Once the complete
sequence is finished, what do we do with the data? One thing is to figure out what the proteins
encoded by the genome are and what they do (proteomics). In many cases we can deduce the
nature of the protein by homology to other proteins already sequenced, but in several cases
(maybe >30%), we have no clue. We can use biotechnology techniques to produce the protein,
which can then be purified and studied in order to try to deduce its function. One important
approach is to determine its three dimensional structure, which may give a clue to its function.
The future of protein biochemistry is indeed exciting!