Lecture V

V. Evolution of Protein Structure and Function
•Protein structure classification
•Structural relationships among homologous proteins
•Changes in proteins during evolution uncover
functionally/structurally important amino acid sites
•Domain swapping
•Classification of protein folding patterns
•How do proteins evolve new functions?
•Classification of protein functions
Super Secondary Structures (I)
• Hairpins connect two antiparallel β-strands;
• Cross-overs connect two parallel β-strands, most commonly through
an α-helix (β-α-β topology). Almost all cross-overs are right-handed. That is,
when the C-side strand is placed closer and pointing right, the connecting
α-helix or loop lies on top of the sheet.
[Figure: right-handed and left-handed cross-overs between strands β1 and β2]
Super Secondary Structures (II)
• Coiled-coil is a common α-helical structure found in proteins that participate
in protein folding and protein-protein interactions.
– Heptad repeat (a-b-c-d-e-f-g)n, where positions a and d are
nonpolar, which gives each helix a hydrophobic side
• Helix bundles refer to three or more helices packing together;
– Knobs-into-holes packing:
In both kinds of helix packing, slight distortion
of the individual helices and the
inclination of their axes with respect
to each other allow the side chains
of the nonpolar residues to mesh together
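The heptad pattern (a-b-c-d-e-f-g)n described for coiled-coils can be sketched in a few lines of Python. This is only an illustration: the set of residues counted as nonpolar, the function names, and the idea of scoring every register by its a/d content are all assumptions made for the example, not an established coiled-coil predictor.

```python
# Residues treated as nonpolar here; this set is an illustrative assumption.
NONPOLAR = set("AVLIMFWC")
HEPTAD = "abcdefg"


def heptad_register(sequence, start="a"):
    """Pair each residue with its heptad position, beginning at `start`."""
    offset = HEPTAD.index(start)
    return [(HEPTAD[(offset + i) % 7], aa) for i, aa in enumerate(sequence)]


def core_nonpolar_fraction(sequence, start="a"):
    """Fraction of a and d positions occupied by nonpolar residues."""
    core = [aa for pos, aa in heptad_register(sequence, start) if pos in "ad"]
    if not core:
        return 0.0
    return sum(aa in NONPOLAR for aa in core) / len(core)


def best_register(sequence):
    """Try all seven registers and return the one with the most nonpolar a/d core."""
    return max(HEPTAD, key=lambda s: core_nonpolar_fraction(sequence, s))


if __name__ == "__main__":
    # Hypothetical heptad "LKELEDK" repeated four times (not a real protein).
    toy = "LKELEDK" * 4
    print(best_register(toy), core_nonpolar_fraction(toy, "a"))
```

Scanning all seven registers is a design choice for the sketch: a real sequence rarely starts exactly at position a, so the register has to be inferred rather than assumed.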
β-barrels
A β-barrel is like a sheet wrapped around a cylinder
The Hierarchical nature of protein architecture
• Primary structure
– Proteins are first synthesized as linear sequences of
amino acids
• Secondary structure
– The linear sequence can undergo simple packing in
regions of local regularity
• e.g., α-helices, β-strands, β-sheets and β-turns
• Super-secondary structure
– the packing of secondary structure elements into stable
units
• e.g., β-barrels, β-α-β units, Greek keys, etc.
Most proteins pack their secondary structure elements so as to
protect hydrophobic regions (tertiary structure)
• Tertiary structure
– The complex folding of packed secondary structures
gives the tertiary structure of the protein
Some proteins work as multi-complex machines and have to
undergo a quaternary level of folding.
• Quaternary structure
– the arrangement of separate chains within a protein that
has more than one subunit
• e.g., haemoglobin
The highest level of organisation is the quinary
structure
• Quinary structure
– the arrangement of separate molecules, such as in
protein-protein or protein-nucleic acid interactions
Protein domains: compact units within the folding pattern of
single chains, that look as if they should have independent
stability
Modular proteins are multidomain proteins which often
contain many copies of closely related domains.
A Domain is a compact, semi-independent region of
100-150 amino acids that has a hydrophobic core
and hydrophilic exterior.
Domains can be structural and/or functional
[Figure: examples of structural domains: a bundle structural domain and a β-barrel structural domain]
[Figure: glyceraldehyde-3-phosphate dehydrogenase has two functional domains: the glyceraldehyde-3-phosphate binding domain and the NAD+ binding domain]
Quaternary Structure
Spatial arrangement of protein subunits and the
nature of their contacts.
[Figures: haemoglobin tetramer; immunoglobulin quaternary structure]
Evolutionary changes in protein sequences
Events responsible for the generation of diversity:
- mutations
- insertions
- deletions
- transposition of large DNA segments
Selection acts on protein function, which is determined by protein
structure
A mutant gene may encode a protein with:
- equivalent function (neutral mutation)
- new and optimised function (adaptive evolution)
- new and sub-optimised function (purifying selection)
Evolution and proteins
• You can see the effects of evolution, not only in the whole organism,
but also in its molecules - DNA and protein
• For a mutation to have an effect on the phenotype (and be subject
to selection) it must (usually) affect the structure or function of a
protein
• You can learn a lot about evolution by studying the structure of
proteins
Evolution in a population may occur through positive or
negative selection or through the neutral fixation of protein-function variants
Proteins from different species have similar but not identical
sequences. This fact implies that they have similar but not
identical protein structures
Gilbert maintained that exons represent structural components
of proteins that can be recombined in different contexts, as a
mechanism of generation of new protein folds.
This suggestion could not be supported below the protein
domain level
A table of aligned amino acid sequences is a very useful tool
for evolutionary studies and provides more information than
structure does
The pattern of variation at the amino acid level gives clues to the
selective constraints operating on the sequence or even on the
protein structure
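One simple way to read the pattern of variation from an alignment table is to score each column by its Shannon entropy: invariant columns score 0, and highly variable columns score higher. The sketch below assumes the sequences are already aligned to equal length and treats a gap character as just another symbol. Both are simplifying assumptions, and the function names are made up for the example.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues found in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def alignment_variability(aligned_seqs):
    """Return one entropy value per column of a list of aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*rows) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*aligned_seqs)]


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy_alignment = ["ACDE", "ACDF", "ACEG", "ACEH"]
    for i, score in enumerate(alignment_variability(toy_alignment), start=1):
        print(f"column {i}: {score:.2f} bits")
```

Low-entropy columns are candidates for sites under strong functional or structural constraint. Entropy alone does not say which kind of constraint is acting.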
It is possible to construct phylogenetic trees derived from
tabulations of related sequences.
Phylogenies derived from different families of proteins from the
same range of species are mutually consistent in their
branching order
To infer phylogenetic relationships between species through
genes it is important to choose functionally equivalent proteins
One hypothesis that has gained much attention is the
molecular clock hypothesis, which suggests that amino acid
substitutions proceed at a constant rate within protein families
A molecular clock
• Plot the number of amino acid changes between the
same protein in different species (such as cytochrome c)
against the time since the species diverged
• Gives an approximately straight line - so evolution of a protein sequence
proceeds at a roughly constant rate and can therefore be used as a
clock
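The clock plot amounts to fitting a straight line to (divergence time, number of changes) points and then reading unknown dates off that line. A minimal sketch follows, assuming a line forced through the origin and no correction for multiple substitutions at the same site. The function names and the numbers in the usage example are synthetic, not measurements.

```python
def fit_clock(times, substitutions):
    """Least-squares rate for the model: substitutions = rate * time."""
    if len(times) != len(substitutions) or not times:
        raise ValueError("need equally long, non-empty lists")
    numerator = sum(t * d for t, d in zip(times, substitutions))
    denominator = sum(t * t for t in times)
    return numerator / denominator


def estimate_divergence_time(rate, observed_substitutions):
    """Invert the calibrated clock to date a divergence event."""
    return observed_substitutions / rate


if __name__ == "__main__":
    # Synthetic calibration points (time in million years, amino acid changes).
    times = [100, 200, 400]
    changes = [10, 20, 40]
    rate = fit_clock(times, changes)
    print("changes per million years:", rate)
    print("estimated age for 30 changes:", estimate_divergence_time(rate, 30))
```

Forcing the line through the origin reflects the idea that two sequences with zero divergence time have zero differences. With real data the points scatter around the line, and the fit is only as good as the calibration dates.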
Calibration of the clock for specific protein families would
allow the dating of biological events not present in the fossil
record and would imply that changes are non-adaptive, given
their independence from selective constraints
Variability of selective constraints in protein
molecules
Amino acid substitution rates do vary between:
- Different protein families
- proteins within the same family
- amino acid regions in the same protein
The main reason for the variability in the substitution rates
among amino acid regions is that different amino acids are
under different functional and structural constraints
Those amino acids playing a less important functional or
structural role can fix a greater number of mutations because of their
neutral effect on the biological fitness of the protein
Evolution of protein structure
In families of closely related proteins, mutations alter the
specificity of proteins rather than changing their structure
- Family of serine proteinases
- Specificity of haemoglobin for other ligands
In very few cases, point mutations alter the protein in such a
way that novel functions arise. The chymotrypsin family of
serine proteinases is a clear example of the emergence of novel
functions:
- Haptoglobin: a chymotrypsin homologue without proteolytic activity.
Acts as a chaperone, preventing protein aggregation
- Serine proteinases of rhinoviruses form the initiation
complex of RNA synthesis
Neutral evolution vs selection
[Figure: a non-synonymous nucleotide substitution causes an amino acid replacement that changes protein function or structure. The diagram relates amino acid changes to biological fitness (W) under purifying selection, neutrality (neutral theory of molecular evolution) and positive selection]
Selection: Positive & Negative
[Figure: one-sequence scenario and population scenario, showing a site that carries A in most sequences and C in some. Codon example: Thr-Ser (ACGTCA) versus Thr-Pro (ACGCCA)]
The selection criteria could in principle
be anything, but the selection against
amino acid changes is without
comparison the most important.
[Figure: series of sequences labelled ArgSer (AGGCCG), ThrSer (ACGCCG), ThrSer (ACTCTG), AlaSer (GCTCTG), AlaSer (GCACTG)]
Certain events have functional
consequences and will be selected
out. The strength and localization of
this selection are of great interest.
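The Thr-Ser (ACGTCA) versus Thr-Pro (ACGCCA) example can be checked mechanically by translating both sequences and comparing them codon by codon. The sketch below builds the standard genetic code from its usual compact TCAG-ordered layout. The function names are illustrative, and the code assumes in-frame DNA sequences of equal length with no ambiguous bases.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def classify_substitutions(ancestral, derived):
    """List (codon index, old residue, new residue, kind) for each changed codon."""
    if len(ancestral) != len(derived) or len(ancestral) % 3:
        raise ValueError("sequences must be in frame and equally long")
    changes = []
    for i in range(0, len(ancestral), 3):
        old, new = ancestral[i:i + 3], derived[i:i + 3]
        if old == new:
            continue
        same = CODON_TABLE[old] == CODON_TABLE[new]
        kind = "synonymous" if same else "non-synonymous"
        changes.append((i // 3, CODON_TABLE[old], CODON_TABLE[new], kind))
    return changes


if __name__ == "__main__":
    print(translate("ACGTCA"), translate("ACGCCA"))
    print(classify_substitutions("ACGTCA", "ACGCCA"))
```

Only non-synonymous changes alter the protein, so they are the ones exposed to selection on function or structure. A synonymous change such as ACG to ACT leaves the threonine untouched.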
Domain combination and recombination
One mechanism to ensure the generation of different partners is
gene duplication followed by divergence
In some cases catalytic domains can be formed by the
contribution of both duplicates (paralogues)
- Serine proteinase domains
In others, gene duplication provides an additional
regulatory function through the development of an oligomeric protein
- Mutations in the tetrameric structure of haemoglobin
can alter its allosteric efficiency in
transporting oxygen
Proteins can combine gene duplication or fusion with generation
of partners by domain swapping
[Figure: IL-5. A two-domain monomer (domains A and B) compared with a domain-swapped dimer]
Families of related proteins tend to retain similar folding
patterns
The general folding pattern of a protein is usually preserved even
with amino acid substitutions. The amount of structural
distortion, however, increases locally as the
amino acid sequence divergence between two proteins increases
These distortions are not uniformly distributed in the structure: it
seems that the core preserves the folding pattern in a family, while
other parts of the structure suffer dramatic distortions
In the overwhelming majority of proteins, the core is formed by the
main elements of secondary structure and the peptides flanking
them, including active-site peptides
The fraction of identical residues in the core measures the amount
of sequence divergence between two proteins
- In proteins that share more than 60% of their amino acids, the core
contains more than 90% of the residues; the refolding of
the remaining 10% involves minor surface loops
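The "percentage of shared amino acids" used in rules of thumb like the one above is usually computed from a pairwise alignment. A minimal sketch, assuming the two sequences are already aligned and that gapped columns are left out of the denominator (one of several conventions in use), with an illustrative function name:

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned pair, invented for illustration.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

Because the choice of denominator changes the number, identity values from different sources are only comparable when they use the same convention.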
Pairwise Sequence Identities and Structure Similarity
(SSAP) Scores in Domain Families
[Figure: structure similarity (SSAP) score plotted against sequence identity (%), distinguishing pairs with the same function from pairs with different functions. Example shown: the ATP Grasp Superfamily (Biotin Carboxylase, D-Ala D-Ala Ligase)]
In distant homologues the structure can be embellished - but
50-60% of the structure in the core is highly conserved
Conservation of Protein Structure
• The cores of protein structures are very well conserved during evolution
even when their sequences have changed considerably
• Comparing protein structures allows us to identify more distant
evolutionary relationships
• Structural Genomics initiatives will give structures for most of the major
protein families
• Related structures: RMSD usually < 3.5 Å
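RMSD (root-mean-square deviation) is the square root of the mean squared distance between equivalent atoms in two structures. The sketch below assumes that the two coordinate lists contain equivalent atoms in the same order and that the structures have already been optimally superposed. It does not perform the superposition itself. The 3.5 Å cutoff is taken from the rule of thumb above, and the function names are illustrative.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, pre-superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need equally long, non-empty coordinate lists")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


def looks_related(coords_a, coords_b, cutoff=3.5):
    """Apply the rule-of-thumb cutoff (same units as the coordinates, here Å)."""
    return rmsd(coords_a, coords_b) < cutoff


if __name__ == "__main__":
    # Two toy two-atom "structures", invented for illustration.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(rmsd(a, b), looks_related(a, b))
```

In practice the hard part is deciding which atoms are equivalent and finding the best superposition. The RMSD value is meaningful only once both have been settled.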
Evolution of New Protein Functions
gene duplication
incremental mutations
gene fusion
oligomerisation
Protein structures can accommodate many but not all single-site
mutations
Some of these single mutations are very important from the medical
point of view:
SNPs can produce incorrect chain termination (some
thalassaemias)
As qualitative rules, single mutations on the
surface of proteins are usually innocuous. Mutations in important
buried regions of the molecule will be lethal and removed by
selection (we will never see them)
Natural protein variants are only the subset of all possible variants
that has been subjected to natural selection
Artificial variants can extend our knowledge beyond
the natural variants and show us the possible subsets of optimised
proteins
The Allumwandlung technique consists of substituting
a single amino acid with each of the other 19, testing the functional
properties of the variants, and solving their crystal structures
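The first step of such an experiment, listing the 19 single-site variants to be made, is easy to sketch. The code below only enumerates sequences and says nothing about function or structure. The function name, the 1-based numbering and the "T3A"-style labels are choices made for this illustration, and the example sequence is hypothetical.

```python
AMINO_ACIDS_20 = "ACDEFGHIKLMNPQRSTVWY"


def allumwandlung_variants(sequence, position):
    """Map mutation labels (e.g. 'T3A') to the 19 single-site variant sequences.

    `position` is 1-based, following the usual residue-numbering convention.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    wild_type = sequence[position - 1]
    if wild_type not in AMINO_ACIDS_20:
        raise ValueError("wild-type residue is not a standard amino acid")
    variants = {}
    for aa in AMINO_ACIDS_20:
        if aa == wild_type:
            continue  # skip the wild-type residue itself
        label = f"{wild_type}{position}{aa}"
        variants[label] = sequence[:position - 1] + aa + sequence[position:]
    return variants


if __name__ == "__main__":
    # Hypothetical five-residue peptide, used only to show the output shape.
    for label, seq in allumwandlung_variants("MKTAY", 3).items():
        print(label, seq)
```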
If we could predict the effect of single mutations on
protein structure and function, that would be a first step towards designing
better-optimised proteins, with clear relevance to public health