The Concept of Functional Constraint

The intensity of purifying selection is determined by
the degree of intolerance characteristic of a site or a
genomic region towards mutations.
The functional or selective constraint defines the range
of alternative nucleotides that is acceptable at a site
without negatively affecting the function or structure
of the gene or the gene product.
DNA regions in which a mutation is likely to affect
function are under more stringent functional constraint
than regions devoid of function.
The stronger the functional
constraints on a macromolecule
are, the slower its rate of
substitution will be.
Functional density (Zuckerkandl 1976)
The functional density, F, of a gene is defined
as ns/N, where ns is the number of sites
committed to specific functions and N is the
total number of sites. F, therefore, is the
proportion of amino acids that are subject to
stringent functional constraints.
The higher the functional density, the lower
the rate of substitution is expected to be.
Thus, a protein in which the active sites
constitute only 1% of its sequence will be less
constrained, and therefore will evolve more
quickly than a protein that devotes 50% of its
sequence to performing specific biochemical
or physiological tasks.
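As a rough illustration of the definition F = ns/N, the sketch below computes the functional density for the two proteins in the comparison above. The function name and the protein length of 500 sites are hypothetical, chosen only to reproduce the 1% and 50% figures.

```python
def functional_density(n_s: int, n_total: int) -> float:
    """Return F = ns / N, the fraction of sites committed to specific functions."""
    if n_total <= 0:
        raise ValueError("the total number of sites N must be positive")
    if not 0 <= n_s <= n_total:
        raise ValueError("ns must lie between 0 and N")
    return n_s / n_total


# Hypothetical proteins of 500 sites each (lengths are illustrative only).
sparse = functional_density(5, 500)    # active sites make up 1% of the sequence
dense = functional_density(250, 500)   # 50% of the sequence is committed

print(f"F (1% protein)  = {sparse:.2f}")
print(f"F (50% protein) = {dense:.2f}")
```

Under the argument above, the protein with the higher F is expected to have the lower rate of substitution.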
Substitution rates and disease:
The case of Gaucher disease
Gaucher disease is an autosomal recessive lysosomal storage disorder due to
deficient activity of an enzyme called acid β-glucosidase. There are many
subtypes of Gaucher disease, with effects ranging from a slight reduction
in fitness to perinatal lethality, in which death occurs during the period
between 154 days of gestation and seven days after birth.
We aligned the amino acid sequences of acid β-glucosidase from nine placental
mammals (human, chimpanzee, Sumatran orangutan, bovine, pig, dog, horse,
rat, and mouse). The length of the alignment (excluding one gap due to a codon
deletion in the ancestor of mouse and rat) was 496 amino acids, of which 387
were identical in all nine species and 109 were variable.
Thirty-six single amino-acid replacements (at 34 amino-acid
positions) resulting in Gaucher disease are described in the
literature. Perinatal lethal mutations are shown in red.
All 36 deleterious mutations occur at completely conserved sites (below
asterisks). The expectation under a random model is that only about 28
mutations (36 × 387/496) should occur at completely conserved sites. This
statistically significant nonrandom association between disease and
evolutionary conservation (p = 0.0002) indicates that invariable sites are
conserved because they evolve under extremely stringent functional
constraints and cannot tolerate change.
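The expectation under the random model can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above. The source does not say which statistical test produced p = 0.0002, so the probability computed here rests on an assumed simple model in which each of the 34 disease positions falls on a conserved site independently with probability 387/496. It is meant as a plausibility check, not as a reconstruction of the original analysis.

```python
# Counts reported in the text above.
ALIGNED_SITES = 496        # alignment length in amino acids
CONSERVED_SITES = 387      # sites identical in all nine species
DISEASE_MUTATIONS = 36     # single amino-acid replacements causing disease
DISEASE_POSITIONS = 34     # distinct positions carrying those replacements

# Fraction of the alignment that is completely conserved.
conserved_fraction = CONSERVED_SITES / ALIGNED_SITES

# Expected number of disease mutations at conserved sites if mutations
# were placed at random along the alignment.
expected_on_conserved = DISEASE_MUTATIONS * conserved_fraction

# ASSUMPTION (not stated in the source): positions are independent, so the
# chance that all 34 positions are conserved is the fraction raised to 34.
p_all_conserved = conserved_fraction ** DISEASE_POSITIONS

print(f"conserved fraction:             {conserved_fraction:.3f}")
print(f"expected mutations on conserved: {expected_on_conserved:.1f}")
print(f"P(all positions conserved):      {p_all_conserved:.1e}")
```

The expected count rounds to 28, matching the figure quoted above, and the probability under the assumed model is of the same order of magnitude as the reported p-value.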
A network (or graph) is an abstract representation
of a set of objects where some objects are
connected to one another. The objects are
represented by vertices (or nodes), and the links
that connect the vertices are called edges (or
branches).
Protein-protein interaction networks
(a) A simple example of a protein-protein
interaction network consisting of five
proteins (A-E), represented by the nodes,
each of which interacts with at least one
other protein. There are five interactions,
denoted by the links.
In biological networks, three variables are
usually studied:
(b) degree centrality or connectivity = the
number of interactions a protein has.
(c) betweenness centrality = the number of
shortest paths between pairs of other nodes
that pass through a given node.
(d) closeness centrality = the mean number
of links on the shortest paths connecting a
protein to all other proteins in the network.
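The three measures can be sketched for a small network using breadth-first search. The edge list below is hypothetical: the text says only that five proteins (A-E) share five interactions, so the specific pairs are an assumption made for illustration. Closeness follows the definition given above (mean shortest-path length, so smaller values mean a more central protein), and the code assumes the network is connected.

```python
from collections import deque
from itertools import combinations

# HYPOTHETICAL interactions among proteins A-E (five links, as in the text;
# the actual pairs in the original figure are not given).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Return shortest-path lengths and shortest-path counts from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                paths[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                paths[w] += paths[u]
    return dist, paths


def centralities(adj):
    """Degree, closeness (mean distance) and betweenness for every node."""
    nodes = sorted(adj)
    info = {s: bfs(adj, s) for s in nodes}
    degree = {v: len(adj[v]) for v in nodes}
    closeness = {
        v: sum(d for n, d in info[v][0].items() if n != v) / (len(nodes) - 1)
        for v in nodes
    }
    betweenness = {v: 0.0 for v in nodes}
    for s, t in combinations(nodes, 2):   # each unordered pair once
        dist_s, paths_s = info[s]
        for v in nodes:
            if v in (s, t):
                continue
            dist_v, paths_v = info[v]
            # v lies on a shortest s-t path if the distances add up exactly.
            if dist_s[v] + dist_v[t] == dist_s[t]:
                betweenness[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return degree, closeness, betweenness


degree, closeness, betweenness = centralities(build_adjacency(EDGES))
for protein in sorted(degree):
    print(protein, degree[protein], closeness[protein], betweenness[protein])
```

In this hypothetical network, protein C has the most interactions, the shortest mean distance to the other proteins, and the largest number of shortest paths passing through it.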