Basic structures of proteins

Structural Hierarchy of Protein
Primary structure → Secondary structure → Motif → Fold (scaffold)
Functional elements: α-helix, β-strand, β-sheet, loop
- Structure, affinity, activity, specificity, stability, etc.
[Figure: functional elements A, B, and C on a fold (scaffold)]
Structural motifs: super secondary structure
• Commonly recurring substructures
• Connectivity between secondary structural elements.
• An individual motif usually consists of only a few elements.
• Motifs do not allow us to predict biological function: the same motif is found in proteins and enzymes with dissimilar functions
- Fold: arrangement of the secondary structure elements of the structure (a total of 1282 folds)
- Loops: irregularly folded segments of the polypeptide chain that connect the helices and sheets
- Usually short and exposed to solvent
• β hairpin:
- Extremely common.
- Two antiparallel β strands connected by a tight turn of a few amino acids.
• Greek key:
- Four β strands folded over into a sandwich shape.
- Named after a pattern common in Greek ornamental artwork: a decorative border constructed from a continuous line, shaped into a repeated motif.
• Omega loop:
- A loop in which the residues at the beginning and end of the loop are very close together.
• Helix-loop-helix:
- α helices connected by a looping stretch of amino acids. This motif is seen in transcription factors.
• Zinc finger:
- Two β strands with an α helix end folded over to bind a zinc ion. Important in DNA-binding proteins.
• Helix-turn-helix:
- Two α helices joined by a short strand of amino acids; found in many proteins that regulate gene expression.
• Nest:
- Extremely common. Just three consecutive amino acid residues form an anion-binding concavity.
• Niche:
- Extremely common. Just three consecutive amino acid residues form a cation-binding feature.
Domains
- A part of a protein that can exist independently of the rest of the protein chain
- Functional aspect
Assembly of proteins from building blocks
Layered sandwich structures, with each layer consisting of either α helices or β sheets
• Classification by packing
- α/α (all α)
- β/β (all β)
- α/β: α and β elements are in a mixed order in the sequence
- α + β: α and β elements are segregated in the sequence
Packing of secondary structure
• Hydrophobic effect: Major driving force for the folding of proteins
- Burying and clustering of hydrophobic side chains to minimize their contacts with water
• Basic requirements for folding:
1. Compact structure and minimization of hydrophobic surface area
2. Buried hydrogen bonding groups are all paired
- Formation of α helices and β sheets maximizes the pairing of the hydrogen-bonding groups of the backbone
- Packing of α helices and β sheets by stacking their amino acid side chains
• Packing density:
- Protein: ~0.75
- Crystals: 0.7 to 0.78
- Close-packed spheres: 0.74
- Infinite cylinders: 0.91
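The two geometric reference values in the packing density list can be checked from standard geometry. The sketch below assumes the usual idealized arrangements (closest packing of equal spheres, hexagonal packing of equal parallel cylinders); the function names are illustrative and not from the slides.

```python
import math

def close_packed_sphere_density():
    """Fraction of space filled by equal spheres in closest packing: pi / (3 * sqrt(2))."""
    return math.pi / (3.0 * math.sqrt(2.0))

def hexagonal_cylinder_density():
    """Fraction of space filled by equal infinite cylinders packed hexagonally: pi / (2 * sqrt(3))."""
    return math.pi / (2.0 * math.sqrt(3.0))

if __name__ == "__main__":
    print(f"close-packed spheres: {close_packed_sphere_density():.2f}")
    print(f"infinite cylinders:   {hexagonal_cylinder_density():.2f}")
```

A protein interior at about 0.75 is therefore packed roughly as densely as ideal close-packed spheres.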
• Quaternary structure: overall organization of subunits
- Contact interfaces between the subunits are as closely packed as protein interiors
- Charged and hydrogen-bonding groups on the surface are paired with complementary partners
Definition of the terms
• Homologue: evolved from a common ancestral protein
- The evolutionary relationship is evident from similarities in sequence, structure, and/or function
• Analogue: proteins that are similar in some way, yet show no evidence of common ancestry
- Structural analogues share the same fold; functional analogues perform the same function
• Paralogue: evolved by gene duplication within a genome; has a distinct, but usually related, function
• Orthologue: equivalent genes in different species that evolved from a common ancestor by speciation
Protein evolution and diversity
- ~ 50,000 proteins
- Structural analysis : ~1,000 scaffolds
- Use of a limited repertoire of scaffolds for large diversity of functions
[Figure: an undifferentiated progenitor scaffold diversifies through duplication and modification (drastic gene rearrangement; insertion, deletion, and substitution of gene segments; point mutations) into primitive proteins/enzymes, which are fine-tuned by accumulation of point mutations into superfamily proteins/enzymes E1 to E5]
Sequence space and evolution of proteins
[Figure: fitness plotted over sequence space, with paths ① to ⑥; regions labeled same family, different family, same superfamily, different superfamily, and same fold]
Incremental improvement of protein property: specificity, activity, stability, expression
Divergent evolution within family: substrate/cofactor specificity, enantio-selectivity
Divergent evolution within superfamily: αβ hydrolase, enolase, crotonase superfamily
Divergent evolution within fold: alteration of sub-binding/catalytic machineries
Convergent evolution between folds: grafting sub-binding/catalytic machineries into different fold
Directed evolution: find optimum fitness
Divergent and convergent evolution of proteins : serine proteases
• Mammalian serine proteases: common tertiary structure and function, with superimposable polypeptide backbones
• About 60 % of the amino acids in the interior, but only 10 % of the surface residues, are conserved
• Catalytic triad of residues: Asp-His-Ser
• Different substrate specificity
[Figure: chymotrypsin, trypsin, elastase, and subtilisin]
Catalytic property
• Nucleophile: hydroxyl oxygen of Ser
• Formation of acyl-enzyme through esterification of the hydroxyl of the reactive serine
by the carboxyl portion of the substrate
• Major differences in substrate specificity arise from changes in three loops that line the binding pocket
- Chymotrypsin: small residues line the binding pocket
  Suitable for the large hydrophobic side chains of Phe, Tyr, and Trp
- Trypsin: negatively charged aspartate at the bottom of the pocket
  Forms a salt linkage with the positively charged ammonium or guanidinium group of Lys or Arg
- Elastase: bulky Val and Thr at the entry of the pocket
  Prevent the entry of large side chains into the pocket
  Suitable for small hydrophobic residues such as Ala
• Catalytic triad: Asp-His-Ser
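The binding-pocket preferences above can be turned into a toy cleavage predictor. This is a deliberately simplified sketch: it assumes cleavage on the carboxyl side of every preferred residue, uses only the residues named in the slide (Ala alone for elastase), and ignores neighboring-residue effects. The table and function names are hypothetical.

```python
# Preferred residue at the cleavage position, taken from the slide text only.
P1_PREFERENCES = {
    "chymotrypsin": set("FYW"),  # large hydrophobic side chains: Phe, Tyr, Trp
    "trypsin": set("KR"),        # positively charged side chains: Lys, Arg
    "elastase": set("A"),        # small hydrophobic side chains: Ala (simplification)
}

def cleavage_sites(sequence, enzyme):
    """Return 1-based positions of residues after which the enzyme is predicted to cut.

    The last residue is skipped because there is no peptide bond after it.
    """
    preferred = P1_PREFERENCES[enzyme]
    return [i + 1 for i, residue in enumerate(sequence[:-1]) if residue in preferred]

if __name__ == "__main__":
    peptide = "GAKFWRAL"  # made-up one-letter sequence for illustration
    for name in P1_PREFERENCES:
        print(name, cleavage_sites(peptide, name))
```

The same peptide yields three different cut patterns, which is the point of the slide: one shared catalytic mechanism, three specificities.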
Catalytic mechanism of serine protease
Structural and mechanistic information
• Binding site of the enzyme: approximately complementary to the structure of the substrate
• Interactions: non-polar parts of the substrate match up with non-polar side chains of the amino acids
• Hydrogen-bonding sites on the substrate bind to the backbone NH and CO groups of the protein
• The reactive part of the substrate is held firmly by this binding next to acidic, basic, or nucleophilic groups on the enzyme
Provide a strategy and insight for the engineering and design of enzymes
Convergent or divergent evolution ?
• Criteria for evolution from a common ancestor, in descending order of strength:
- Are the DNA sequences coding for the enzymes similar?
- Are the amino acid sequences similar?
- Are the three-dimensional structures similar?
- Are the enzyme-substrate interactions similar?
- Are the catalytic mechanisms similar?
- Do the segments of polypeptide chain essential for catalysis appear in the same order along the sequence?
• Mammalian serine proteases: divergent evolution
• Catalytic mechanism shared with subtilisin: convergent evolution
Three-dimensional structure is more conserved than primary structure, but function has changed
α/β barrel protein (TIM barrel): convergent evolution
- Eight parallel β strands connected by eight helices
- Strands form the staves of the barrel, while the helices are on the outside and parallel
- Hydrophobic core composed of the side chains of the strands: Val, Ile, Leu
- Catalyze a variety of reactions and have diverse subunit compositions
- No homology with the enzymes that catalyze the same reaction in different organisms
- Active site: formed by the eight loops at the carboxyl end of each strand
- Little sequence identity, and active sites use different regions of the loops
Relationship between three-dimensional structure and similarity of sequence
Dependent on the length of the protein
- The longer the protein, the lower the percent identity that implies similar structure
- The higher the sequence identity, the lower the RMSD between proteins
- For a protein of 85 residues, 25 to 30 % sequence identity implies an identical fold.
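Percent sequence identity, as used in the thresholds above, can be computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned (equal length, gaps written as "-") and counts identities over columns where both sequences have a residue; other denominators are also in use, so this convention is an assumption, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

if __name__ == "__main__":
    # Made-up toy alignment: 4 identical residues out of 5 ungapped columns.
    print(percent_identity("ACDE-F", "ACDQWF"))
```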
Multi-enzyme complexes
• Encoded in a single polypeptide chain
• Involved in sequential steps in a biosynthetic pathway or complex biochemical process
- Tryptophan synthase: tetramer (α2β2)
- Polyketide synthase
- Fatty acid synthase
• Advantages of multi-enzyme complexes
- Enhanced catalytic efficiency:
  Reduced diffusion time of intermediates
  Substrate channeling:
  ex) Tryptophan synthase: the reaction intermediate (indole) is not released, but is shuttled directly between the subunits through a 20 to 30 Å long channel and directed to the next reaction
- Sequestration of reactive intermediates: protection of chemically unstable intermediates from water
- Easy coordination for regulating the reaction
- Easy coordination of expression during biosynthesis
Polyketide synthases(PKSs)
• Polyketide: a large class of diverse compounds characterized by more than two carbonyl groups connected by single intervening carbon atoms
• PKSs: a family of multi-domain enzymes or enzyme complexes that produce polyketides, a large class of secondary metabolites
ex) Antibiotics (tetracycline; macrolides such as erythromycin), anticholesterol drug (lovastatin), immunosuppressant (sirolimus), anticancer drug (epothilone B)
• Share striking similarities with fatty acid biosynthesis
• The PKS genes are usually organized in one operon in bacteria and in gene clusters in eukaryotes
The order of modules and domains of a complete polyketide-synthase
• Starting or loading module: AT-ACP
- The starter group, usually acetyl-CoA or malonyl-CoA, is loaded onto the ACP domain of the starter module, catalyzed by the starter module's AT domain
• Elongation or extending modules: KS-AT-[DH-ER-KR]-ACP
- The polyketide chain is handed over from the ACP domain of the previous module to the KS domain of the current module, catalyzed by the KS domain
• Termination or releasing module: TE
- The TE (thioesterase) domain hydrolyzes the completed polyketide chain from the ACP domain of the previous module
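The module order above can be written down as a small data model. This is only an illustrative sketch of the domain layout given in the slide (loading AT-ACP, elongation KS-AT-[DH-ER-KR]-ACP with the bracketed domains optional, termination TE); the function names are hypothetical and no chemistry is simulated.

```python
# Optional reductive domains, kept in the order shown in the slide.
OPTIONAL_DOMAINS = ("DH", "ER", "KR")

def loading_module():
    """Starter module: AT loads the starter group onto ACP."""
    return ("AT", "ACP")

def elongation_module(optional=()):
    """Extending module: KS-AT, any chosen optional domains, then ACP."""
    unknown = set(optional) - set(OPTIONAL_DOMAINS)
    if unknown:
        raise ValueError(f"unknown optional domains: {sorted(unknown)}")
    chosen = tuple(d for d in OPTIONAL_DOMAINS if d in optional)
    return ("KS", "AT") + chosen + ("ACP",)

def termination_module():
    """Releasing module: TE hydrolyzes the finished chain from the last ACP."""
    return ("TE",)

def assemble_pks(elongation_specs):
    """Build the full module list: loading, one elongation module per spec, termination."""
    modules = [loading_module()]
    modules += [elongation_module(spec) for spec in elongation_specs]
    modules.append(termination_module())
    return modules

def describe(modules):
    """Render modules as 'AT-ACP | KS-AT-...-ACP | TE'."""
    return " | ".join("-".join(m) for m in modules)

if __name__ == "__main__":
    print(describe(assemble_pks([("KR",), ("DH", "ER", "KR"), ()])))
```

Varying which optional domains each elongation module carries is one way to picture how a fixed module order can still give diverse products.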
Flexibility and conformational mobility of proteins
• Proteins are flexible, even though globular proteins are closely packed
• They undergo conformational changes on binding ligands or substrates
• Conformational changes play an important role in modulating the activity of a certain class of enzymes (allosteric enzymes)
- Allosteric effectors alter the shape of the protein, e.g. hemoglobin
• Equilibrium among two or more conformations of the protein in solution
Allostery: a phenomenon in which binding of a substrate, product, or other effector to a subunit of a multi-subunit enzyme at a site (the allosteric site) other than the functional site alters its conformation and functional properties.
Modes of motion and flexibility of proteins
• Molecular tumbling
- Globular proteins rotate in solution at frequencies close to those calculated for a rigid sphere
- Rotational correlation time (ϕ): time taken to rotate through a defined angle; the reciprocal of the rate constant for the randomization of the orientation of the molecule by Brownian motion
- For a rigid sphere, ϕ = Vη/kT
  V: molecular volume, η: viscosity of the medium, k: Boltzmann’s constant, T: absolute temperature
- Approximately, ϕ = Mr/2000 ns, where Mr is the molecular mass of a globular protein
  ex) Chymotrypsin (Mr = 25,000): ϕ ≈ 12 ns
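Both expressions for the rotational correlation time can be evaluated directly. The sketch below assumes a partial specific volume of 0.73 cm³/g, the viscosity of water near 20 °C (about 1.0e-3 Pa·s), and T = 293.15 K; none of these values comes from the slides, and the function names are hypothetical. The rigid-sphere estimate uses the anhydrous volume, so it comes out below the rule-of-thumb value.

```python
BOLTZMANN_J_PER_K = 1.380649e-23
AVOGADRO_PER_MOL = 6.02214076e23

def correlation_time_rule_of_thumb_ns(mr):
    """Rule of thumb from the text: phi = Mr / 2000, in nanoseconds."""
    return mr / 2000.0

def correlation_time_rigid_sphere_ns(
    mr,
    partial_specific_volume_cm3_per_g=0.73,  # assumed typical protein value
    viscosity_pa_s=1.0e-3,                   # assumed: water near 20 degrees C
    temperature_k=293.15,                    # assumed temperature
):
    """Rigid-sphere estimate phi = V * eta / (k * T), with V from the anhydrous volume."""
    # Volume of one molecule in m^3: (g/mol) * (cm^3/g) * (1e-6 m^3/cm^3) / (1/mol)
    volume_m3 = mr * partial_specific_volume_cm3_per_g * 1e-6 / AVOGADRO_PER_MOL
    phi_seconds = volume_m3 * viscosity_pa_s / (BOLTZMANN_J_PER_K * temperature_k)
    return phi_seconds * 1e9

if __name__ == "__main__":
    mr = 25_000  # chymotrypsin, as in the text
    print(f"rule of thumb: {correlation_time_rule_of_thumb_ns(mr):.1f} ns")
    print(f"rigid sphere:  {correlation_time_rigid_sphere_ns(mr):.1f} ns")
```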
Rotation of side chains
• NMR : most powerful technique for studying the mobility of individual amino acids
• Measurement of the rotational freedom of the aromatic side chains of Tyr and Phe about the Cβ-Cγ bond: ¹H NMR
- Detects whether or not the aromatic ring is constrained in an anisotropic environment
- Slow rotation: 1 to 10 /s
- Fast rotation: 10⁴ to 10⁵ /s
• Surface amino acids are more mobile than interior ones, showing no unique conformation
Domain movement: hinge motion and segmental flexibility
• Larger scale movement in proteins with low energy barriers
• Hinge motion:
- Two elements of structure switch between open and closed conformations as if connected by a hinge
- MBP, Abl protein kinase (N-lobe and C-lobe)
- Detection: time-resolved fluorescence polarization spectroscopy, NMR
Protein mobility in solution
• Incorporation of ¹⁵N into the protein, followed by analysis of the relaxation of ¹⁵N NMR signals
• The term ‘relaxation’ describes how signals change with time.
- Signals deteriorate with time, becoming weaker and broader
- The deterioration reflects that the NMR signal arises from the over-population of an excited state, which decays over time
- Relaxation rates report on fluctuations in the backbone structure