Bioch301.1-2 - Center for Structural Biology

Jan. 8-10, 2003
Biochemistry 301
Principles of Protein Structure
Walter Chazin
5140 BIOSCI/MRBIII
E-mail: Walter.Chazin
http://structbio.vanderbilt.edu/chazin
Text Books
Branden and Tooze
Introduction to Protein Structure
Voet, Voet and Pratt
Fundamentals of Biochemistry
Stryer
Biochemistry
Proteins: Polymers of Amino Acids
• 20 different amino acids: many combinations
• Proteins are made in the RIBOSOME
Amino Acid Chemistry
20 different types
[Figure: an amino acid (amino group NH2, acid group COOH, H and side chain R on the Cα); amino acids with side chains R1 and R2 joined through a CO-NH peptide bond. Amino acid → Polypeptide → Protein]
Amino Acid Chemistry
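[Figure: generic amino acid with amino group (NH2), acid group (COOH), H and side chain R on the Cα]
The free amino and carboxylic acid groups have pKa's:
NH3+ ⇌ NH2 + H+    pKa ~ 9.4
COOH ⇌ COO- + H+   pKa ~ 2.2
[Figure: zwitterion with +NH3, COO-, H and R on the Cα]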
At physiological pH, amino acids are zwitterions
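A minimal sketch of how the two pKa values above lead to the zwitterion, assuming the Henderson-Hasselbalch relation and treating the two groups as independent. The function and constant names are illustrative and not from the lecture, and pH 7.4 is an assumed value for "physiological pH".

```python
# Approximate pKa values quoted on the slide for a free amino acid.
PKA_CARBOXYL = 2.2
PKA_AMINO = 9.4


def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of a titratable group that carries its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def backbone_charges(ph: float) -> tuple[float, float]:
    """Average charge on the amino group and on the carboxyl group at a given pH."""
    amino = fraction_protonated(ph, PKA_AMINO)              # NH3+ is +1, NH2 is 0
    carboxyl = fraction_protonated(ph, PKA_CARBOXYL) - 1.0  # COOH is 0, COO- is -1
    return amino, carboxyl


if __name__ == "__main__":
    amino, carboxyl = backbone_charges(7.4)  # 7.4 is an assumed physiological pH
    print(f"amino group charge:    {amino:+.3f}")
    print(f"carboxyl group charge: {carboxyl:+.3f}")
    print(f"net charge:            {amino + carboxyl:+.3f}")
```

Both groups come out almost fully charged while the net charge stays close to zero, which is what the term zwitterion describes.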
Amino Acid Chemistry
Note the axes
Also titratable groups in side chains
Amino Acids with Aliphatic R-Groups
Amino acid    Code      pKa (α-COOH)   pKa (α-NH3+)
Glycine       Gly - G   2.4            9.8
Alanine       Ala - A   2.4            9.9
Valine        Val - V   2.2            9.7
Leucine       Leu - L   2.3            9.7
Isoleucine    Ile - I   2.3            9.8
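Amino Acids with Polar R-Groups
Non-Aromatic Amino Acids with Hydroxyl R-Groups
Amino acid    Code      pKa (α-COOH)   pKa (α-NH3+)   pKa (side chain)
Serine        Ser - S   2.2            9.2            ~13
Threonine     Thr - T   2.1            9.1            ~13
Amino Acids with Sulfur-Containing R-Groups
Amino acid    Code      pKa (α-COOH)   pKa (α-NH3+)   pKa (side chain)
Cysteine      Cys - C   1.9            10.8           8.3
Methionine    Met - M   2.1            9.3            -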
Acidic Amino Acids and Amide Conjugates
Amino acid      Code      pKa (α-COOH)   pKa (α-NH3+)   pKa (side chain)
Aspartic Acid   Asp - D   2.0            9.9            3.9
Asparagine      Asn - N   2.1            8.8            -
Glutamic Acid   Glu - E   2.1            9.5            4.1
Glutamine       Gln - Q   2.2            9.1            -
Basic Amino Acids
Amino acid    Code      pKa (α-COOH)   pKa (α-NH3+)   pKa (side chain)
Arginine      Arg - R   1.8            9.0            12.5
Lysine        Lys - K   2.2            9.2            10.8
Histidine     His - H   1.8            9.2            6.0
Aromatic Amino Acids and Proline
Amino acid      Code      pKa (α-COOH)   pKa (α-NH3+)   pKa (side chain)
Phenylalanine   Phe - F   2.2            9.2            -
Tyrosine        Tyr - Y   2.2            9.1            10.1
Tryptophan      Trp - W   2.4            9.4            -
Proline         Pro - P   2.0            10.6           -
Hierarchy of Protein Structure
• 20 different amino acids: many combinations
Primary Structure
The order of amino acids: Protein sequence
Secondary Structure
Local conformation, depends on sequence
Tertiary/Quaternary Structure
Overall structure of the chain(s) in full 3D
Beyond Primary Structure:
The Peptide Bond
Peptide plane is flat
ω angle ~180°
Partial double-bond character of the peptide bond
[Figure: resonance structures of the peptide bond, O=C-N(H) ↔ -O-C=N+(H)]
Implications of Peptide Planes
[Figure: peptide planes joined at successive Cα atoms, with rotation angles φ and ψ about each Cα and substituents H and R]
• ω angle varies little; φ and ψ angles vary a lot
• Many φ/ψ combinations cause atoms to collide
• Each residue is sandwiched between two planes
Polypeptide Backbone
[Figure: polypeptide backbone with φ and ψ at each Cα and substituents H and R]
• Backbone restricted → limited conformations
• Collisions with side chain groups further limit φ/ψ combinations
Secondary Structure
Local Conformation of Consecutive Residues
• Three low-energy backbone φ/ψ combinations
1. Right-handed helix: α-helix (φ ≈ -60°, ψ ≈ -40°)
2. Extended: antiparallel β-sheet (φ ≈ -140°, ψ ≈ 140°)
3. Left-handed helix (rare): α-helix (φ ≈ 45°, ψ ≈ 45°)
Glycine is special: it has no side chain!
• Hydrogen bonds between backbone atoms provide stability to secondary structures
• Amino acids have specific preferences
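The three low-energy regions can be used as a rough lookup for a residue's backbone conformation. The sketch below assigns a (φ, ψ) pair to the nearest reference point. Taking the reference angles as (φ, ψ) ≈ (-60°, -40°), (-140°, 140°) and (45°, 45°) is an assumed reading of the values on the slide, and the 40° cutoff and all names are hypothetical choices for illustration. Real Ramachandran regions are broader and irregular in shape.

```python
import math

# Reference (phi, psi) angles in degrees; an assumed reading of the slide's values.
REFERENCE_ANGLES = {
    "right-handed alpha helix": (-60.0, -40.0),
    "extended beta strand": (-140.0, 140.0),
    "left-handed helix": (45.0, 45.0),
}


def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (wraps at 360)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def classify_backbone(phi: float, psi: float, cutoff: float = 40.0) -> str:
    """Name of the nearest reference conformation, or 'other' if none is within cutoff."""
    best_name = "other"
    best_distance = cutoff
    for name, (ref_phi, ref_psi) in REFERENCE_ANGLES.items():
        distance = math.hypot(
            angle_difference(phi, ref_phi), angle_difference(psi, ref_psi)
        )
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name


if __name__ == "__main__":
    for phi, psi in [(-57.0, -47.0), (-120.0, 130.0), (60.0, 45.0), (0.0, 180.0)]:
        print(f"phi={phi:7.1f}  psi={psi:7.1f}  ->  {classify_backbone(phi, psi)}")
```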
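Secondary Structure- α Helix
[Figure: α helix with backbone H-bonds labeled]
Secondary Structure- β Sheet
[Figure: β sheet; legend: oxygen, nitrogen, hydrogen, carbonyl C, Cα, R group, H bond]
Secondary Structure- β Turn
[Figure: β turn, residues numbered 1 to 4]
• Reverses direction of the chain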
Ribbon and Topology Diagrams
Representations of Secondary Structures
• Sheets (arrows), Helices (cylinders)
B/T- Figure 2.17
Ribbon and Topology Diagrams
Organization of Secondary Structures
helix
B/T- Figure 2.11
Beyond Secondary Structure
Supersecondary structure (motifs): small, discrete, commonly observed aggregates of secondary structures
• β sheet
• helix-loop-helix
• βαβ
Domains: independent units of structure
• β barrel
• four-helix bundle
*Domains and motifs sometimes interchanged*
Protein Motifs
V/V/P- Figure 6.28
Hairpin Motif
B/T- Figure 2.14
Helix-Loop-Helix (H-L-H) Motif
B/T- Figure 2.12
EF-Hand H-L-H Motif
B/T- Figure 2.13
Greek Key Motif
B/T- Figure 2.15
Multi-Domain (Modular) Proteins
Protein
Domain
EGF
Protease
Kringle
Ca-binding
Tertiary Structure
Definition: Overall 3D form of a molecule
• Organization of the secondary structures/motifs/domains
• Optimization of interactions between residues
• A specific 3D structure is formed
All proteins have multiple secondary structures, almost always multiple motifs, and in some cases multiple domains
Tertiary Structure
Specific structures result from long-range interactions
• Electrostatic (charged) interactions
• Hydrogen bonds (O-H, N-H, S-H)
• Hydrophobic interactions
Soluble proteins have an inside (core) and outside
• Folding driven by water: hydrophilic/hydrophobic
• Side chain properties specify core/exterior
• Some interactions inside, others outside
Tertiary Structure
I. Ionic Interactions (exterior)
Form between 2 charged side chains:
1 negative (Glu, Asp), 1 positive (Lys, Arg, His)
• Also called “salt bridges”
• Ionic interactions are pH-dependent (pKa)
• Occur at the exterior
• NOTE: pKa's in the interior of a protein may be very different from those of the free amino acid
Tertiary Structure
II. Hydrogen bonds (interior and exterior)
Form between side chains/backbone/water:
Charged side chains: Glu, Asp, His, Lys, Arg
Polar side chains: Ser, Thr, Cys, Asn, Gln, [Tyr, Trp]
• Not a covalent bond, so lower in energy
• Occur inside, at the exterior, and with water
Tertiary Structure
III. Hydrophobic Interactions (interior)
Form between side chains of non-polar residues:
Aliphatic (Ala, Val, Leu, Ile, Pro, Met)
Aromatic (Phe, Trp, [Tyr])
• Clusters of side chains, but no requirement for a specific orientation like an H-bond
• In the protein interior, away from water
• Not pH dependent
Tertiary Structure
IV. Disulfide Bonds (interior and exterior)
Form between Cys residues:
Cys-SH + HS-Cys → Cys-S-S-Cys
• Catalyzed by specific enzymes, oxidizing agents
• Restricts flexibility of the protein
• Usually within a protein, less often for linking proteins
Disulfide Bonding
V/V/P- Figure 16.6
Quaternary Structure
Definition: Organization of multiple chain associations
• Oligomerization: homo (self), hetero (different)
• Used in organizing single proteins and protein machines
Specific structures result from long-range interactions
• Electrostatic (charged) interactions
• Hydrogen bonds (O-H, N-H, S-H)
• Hydrophobic interactions
• Disulfides only VERY infrequently
Quaternary Structure
The classic example: hemoglobin α2β2
B/T- Figure 3.7
END OF PART 1
Protein Structure from Sequence
The pattern of amino acid side chains determines
the local conformation and the global structure
*Pattern is more important than exact sequence*
Reporting/Comparing Protein Sequences
h-CaM  A T V R L L E W E D L
b-CaM  A T V R L L E Y K D L
Position 8 of the aligned fragment (W vs. Y): conservative
Position 9 of the aligned fragment (E vs. K): non-conservative
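The comparison above can be sketched in a few lines: walk the two aligned fragments, report percent identity, and label each mismatch by whether both residues fall in the same side-chain class. The classes loosely follow the groupings used earlier in these slides; splitting acids from amides, separating proline, and calling a same-class change "conservative" are simplifying assumptions made here, not a standard substitution matrix. All names are illustrative.

```python
# Side-chain classes, loosely following the slide groupings (a simplification).
SIDE_CHAIN_CLASSES = {
    "aliphatic": "GAVLI",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic": "DE",
    "amide": "NQ",
    "basic": "RKH",
    "aromatic": "FYW",
    "proline": "P",
}
RESIDUE_CLASS = {
    residue: name
    for name, members in SIDE_CHAIN_CLASSES.items()
    for residue in members
}


def compare_aligned(seq_a: str, seq_b: str):
    """Return (percent identity, list of (position, a, b, kind)) for equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    differences = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            same_class = RESIDUE_CLASS[a] == RESIDUE_CLASS[b]
            kind = "conservative" if same_class else "non-conservative"
            differences.append((position, a, b, kind))
    identity = 100.0 * (len(seq_a) - len(differences)) / len(seq_a)
    return identity, differences


if __name__ == "__main__":
    identity, differences = compare_aligned("ATVRLLEWEDL", "ATVRLLEYKDL")
    print(f"identity: {identity:.1f}%")
    for position, a, b, kind in differences:
        print(f"position {position}: {a} -> {b} ({kind})")
```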
Proteins Fold To Their
Native Structure
Folded proteins are only marginally stable!!
• ~0.4 kJ/mol per residue required to unfold (cf. ~20 kJ/mol per H-bond)
• Balance loss of entropy vs. stabilizing forces
Protein fold is specified by sequence
• Reversible reaction: denature (unfold)/renature (refold)
• Even single mutations can cause changes
• Recent discovery that amyloid diseases (e.g. CJD, Alzheimer's) are due to unstable protein folding
How Does a Protein Find Its Fold?
[Figure: polypeptide chain numbered from residue 1 at the amino terminus (N) through residues 2, 3, 4 to the carboxyl terminus (C)]
• 20 different amino acids: many combinations
A protein of n residues: 20^n possible sequences!
A 100-residue protein has 20^100 possibilities, about 1.3 × 10^130!
The latest estimates indicate < 40,000 sequences in the human genome
→ THERE MUST BE RULES!
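The size of sequence space is easy to check with exact integer arithmetic. This is a small sketch with illustrative names. The 40,000 figure is the upper estimate quoted on the slide.

```python
def count_sequences(n_residues: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of a given length (exact integer arithmetic)."""
    return alphabet_size ** n_residues


if __name__ == "__main__":
    total = count_sequences(100)
    print(f"20^100 = {total:.1e}")
    # Compare with the slide's upper estimate of sequences in the human genome.
    human_estimate = 40_000
    print(f"fraction of sequence space used < {human_estimate / total:.1e}")
```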
Limitations on Protein Sequence
*Length is generally 100-1000 residues*
• Minimum length based on ability to perform a biochemical function: ~40 residues (e.g. inhibitors)
• Maximum length based on complexity of assembly: conversion of DNA code and production of proteins is carried out by molecular machines that are not perfect. If the sequence gets too long, too many errors will build up.
Protein Folding
The hydrophobic effect is the major driving force
• Hydrophobic side chains cluster/exclude water
• Release of water cages in unfolded state
Other forces providing stability to the folded state
• Hydrogen bonds
• Electrostatic interactions
• Chemical cross links: disulfides, metal ions
Protein Folding
Random folding has too many possibilities
• Backbone restricted but side chains not
• A 100-residue protein would require 10^87 s to search all conformations (age of universe < 10^18 s)
• Most proteins fold in less than 10 s!!
Proteins must fold along specific pathways!!
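One common back-of-the-envelope route to a figure of this size is sketched below. It assumes roughly 10 backbone conformations per residue and a sampling rate of 10^13 conformations per second. Both numbers are assumptions chosen here because they reproduce the 10^87 s quoted above; the lecture does not state how its figure was derived. The calculation works in log10 to avoid overflow.

```python
import math


def log10_search_time_seconds(
    n_residues: int,
    conformations_per_residue: float = 10.0,  # assumed, not from the lecture
    samples_per_second: float = 1e13,         # assumed, not from the lecture
) -> float:
    """log10 of the time (in seconds) needed to visit every conformation once."""
    log10_conformations = n_residues * math.log10(conformations_per_residue)
    return log10_conformations - math.log10(samples_per_second)


if __name__ == "__main__":
    exponent = log10_search_time_seconds(100)
    age_of_universe_exponent = 18  # upper bound quoted on the slide, in seconds
    print(f"random search: ~10^{exponent:.0f} s")
    print(f"that is ~10^{exponent - age_of_universe_exponent:.0f} times the age of the universe")
```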
Protein Folding Pathways
Usual order of folding events
• Secondary structures formed quickly (local)
• Secondary structures aggregate to form motifs
• Hydrophobic collapse to form domains
• Coalescence of domains
Molecular chaperones assist folding in vivo
• Complexity of large chains/multi-domains
• Cellular environment is rich in interacting molecules → chaperones sequester proteins and allow time to fold
Progressive Folding of Proteins
From Disordered to Native State
Protein Folding Funnel
V/V/P- Figures 6.37/38
Functional Classes of Proteins
• Receptors- sense stimuli, e.g. in neurons
• Channels- control cell contents
• Transport- e.g. hemoglobin in blood
• Storage- e.g. ferritin in liver
• Enzyme- catalyze biochemical reactions
• Cell function- multi-protein machines
• Structural- collagen in skin
• Immune response- antibodies
Structural Classes of Proteins
1. Globular proteins (enzymes, molecular machines)
• Variety of secondary structures
• Approximately spherical shape
• Water soluble
• Function in dynamic roles (e.g. catalysis, regulation, transport, gene processing)
Globular Proteins
Hemoglobin α
Concanavalin A
Triose Phosphate isomerase
V/V/P- Figure 6.27
Structural Classes of Proteins
2. Fibrous Proteins (fibrils, structural proteins)
• One dominating secondary structure
• Typically narrow, rod-like shape
• Poor water solubility
• Function in structural roles (e.g. cytoskeleton, bone, skin)
Collagen: A Fibrous Protein
Triple Helix
Stabilizing inter-strand H-bonds
Gly-Pro-Pro Repeat
V/V/P- Figures 6.17/18
Structural Classes of Proteins
3. Membrane Proteins (receptors, channels)
• Inserted into (through) membranes
• Multi-domain: membrane-spanning, cytoplasmic, and extracellular domains
• Poor water solubility
• Function in cell communication (e.g. cell signaling, transport)
Photosynthetic Reaction Center
Extracellular
Membrane-spanning
Intracellular (cytoplasmic)
B/T Figure 13.6
In the physical sense, the
progression of living organisms
results from the communication
between molecules.
Interaction between molecules is
determined by binding affinities.
Binding Classification of Proteins
• Structural- other structural proteins
• Receptors- regulatory proteins, transmitters
• Toxins- receptors
• Transport- O2/CO2, cholesterol, metals, sugars
• Storage- metals, amino acids
• Enzymes- substrates, inhibitors, co-factors
• Cell function- proteins, RNA, DNA, metals, ions
• Immune response- foreign matter (antigens)
Surface Determines What Binds
1. Steric access
2. Shape
3. Hydrophobic accessible surface
4. Electrostatic surface
Sequence and structure optimized to generate
surface properties for requisite binding event(s)
Determinants of Protein Surface
Function requires specific amino acid properties
• Not all amino acids are equally useful
• Abundant: Leu, Ala, Gly, Ser, Val, Glu
• Rare: Trp, Cys, Met, His
Post-translational modifications
• Addition of co-factors: metals, hemes, etc.
• Chemical modification: phosphorylation, glycosylation, acetylation, ubiquitination, sumoylation
Binding Alters Protein Structure
Mechanisms of Achieving Functional Properties
1. Allosteric Control- binding at one site effects changes
in conformation or chemistry at a point distant in space
2. Stimulation/inhibition by control factors- proteins, ions,
metals control progression of a biochemical process
(e.g. controlling access to active site)
3. Reversible covalent modification- chemical bonding,
e.g. phosphorylation (kinase/phosphatase)
4. Proteolytic activation/inactivation- irreversible, involves
cleavage of one or more peptide bonds
Calcium Signal Transduction
Allostery & Stimulation by Control Factor
Calmodulin
Ca2+
Target
Sequence → Structure → Function
Many sequences can give same structure
• Side chain pattern more important than sequence
When homology is high (>50%), likely to have same structure and function (Structural Genomics)
• Cores conserved
• Surfaces and loops more variable
*3-D shape more conserved than sequence*
*There are a limited number of structural frameworks*
Varied Relationships Between
Sequence, Structure and Function
I. Homologous: similar sequence (cytochrome c)
• Same structure
• Same function
• Modeling structure from homology
C-Type Cytochromes
Same structure/function- Different Sequence
Heme
Constant structural elements and basic architecture
V/V/P Figure 6.31
Varied Relationships Between
Sequence, Structure and Function
I. Homologous: very similar sequence (cytochrome c)
• Same structure
• Same function
• Modeling structure from homology
II. Similar function, different sequence (dehydrogenases)
• One domain same structure
• One domain different
NAD-Binding Domains
Conserved Domains/Functional Elements
Alcohol Dehydrogenase
Lactate Dehydrogenase
B/T Figure 10.8
Varied Relationships Between
Sequence, Structure and Function
III. Similar structure, different function (cf. thioredoxin)
• Same 3-D structure
• Not same function
NADH-Binding and Redox
Same structure- Different Function
Thioredoxin
Alcohol Dehydrogenase
Lactate Dehydrogenase
B/T Figures 10.8/2.7