The Concept of Functional Constraint

The intensity of purifying selection is determined by how intolerant a site or genomic region is of mutations. The functional (or selective) constraint defines the range of alternative nucleotides that are acceptable at a site without negatively affecting the function or structure of the gene or the gene product. DNA regions in which a mutation is likely to affect function are under more stringent functional constraint than regions devoid of function. The stronger the functional constraints on a macromolecule, the slower its rate of substitution will be.

Functional density (Zuckerkandl 1976)

The functional density, F, of a gene is defined as n_s/N, where n_s is the number of sites committed to specific functions and N is the total number of sites. F, therefore, is the proportion of amino acids that are subject to stringent functional constraints. The higher the functional density, the lower the rate of substitution is expected to be. Thus, a protein in which the active sites constitute only 1% of its sequence will be less constrained, and will therefore evolve more quickly, than a protein that devotes 50% of its sequence to performing specific biochemical or physiological tasks.

Substitution rates and disease: The case of Gaucher disease

Gaucher disease is an autosomal recessive lysosomal storage disorder caused by deficient activity of the enzyme acid β-glucosidase. There are many subtypes of Gaucher disease, with fitness effects ranging from a slight reduction in fitness to perinatal lethality, in which death occurs in the period from 154 days of gestation to seven days after birth. We aligned the amino acid sequences of acid β-glucosidase from nine placental mammals (human, chimpanzee, Sumatran orangutan, bovine, pig, dog, horse, rat, and mouse).
The length of the alignment (excluding one gap due to a codon deletion in the ancestor of mouse and rat) was 496 amino acids, of which 387 were identical in all nine species and 109 were variable. Thirty-six single amino acid replacements (at 34 amino acid positions) resulting in Gaucher disease are described in the literature. Perinatal lethal mutations are shown in red. All 36 deleterious mutations occur at completely conserved sites (below asterisks). Under a random model, only about 28 of the 36 mutations (36 × 387/496 ≈ 28) would be expected to occur at completely conserved sites. This statistically significant nonrandom association between disease and evolutionary conservation (p = 0.0002) indicates that invariable sites are conserved because they evolve under extremely stringent functional constraints and cannot tolerate change.

A network (or graph) is an abstract representation of a set of objects in which some objects are connected to one another. The objects are represented by vertices (or nodes), and the links that connect the vertices are called edges (or branches).

Protein-protein interaction networks

(a) A simple example of a protein-protein interaction network consisting of five proteins (A-E), represented by the nodes, each of which interacts with at least one other protein. There are five interactions, denoted by the links.

In biological networks, three variables are usually studied:
(b) degree centrality or connectivity = the number of interactions for a protein.
(c) betweenness centrality = the number of times that a node appears on the shortest paths between all pairs of other nodes.
(d) closeness centrality = the mean number of links on the shortest paths connecting a protein to all other proteins in the network.
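
The three measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figure: the edge list below is hypothetical (five proteins A-E with five interactions, chosen only to match the counts in the text, since the actual links in panel (a) are not given here), and the function names are invented for this example. Where a pair of nodes has several shortest paths, the sketch assumes the common convention of crediting each intermediate node with the fraction of those paths it lies on.

```python
from collections import deque
from itertools import combinations

# Hypothetical network: five proteins, five interactions (not the figure's actual links).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> set-of-neighbours mapping."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Breadth-first search from source.

    Returns two dicts: shortest-path length to every reachable node, and the
    number of distinct shortest paths to it.
    """
    dist = {source: 0}
    n_paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                n_paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u precedes v on a shortest path
                n_paths[v] += n_paths[u]
    return dist, n_paths


def centralities(edges):
    """Degree, betweenness and closeness as defined in the text.

    Assumes the network is connected (every node reachable from every other).
    """
    adj = build_adjacency(edges)
    nodes = sorted(adj)
    info = {n: bfs(adj, n) for n in nodes}

    # (b) degree: number of interactions per protein.
    degree = {n: len(adj[n]) for n in nodes}

    # (d) closeness: mean shortest-path length to all other proteins.
    closeness = {
        n: sum(info[n][0][m] for m in nodes if m != n) / (len(nodes) - 1)
        for n in nodes
    }

    # (c) betweenness: for every pair (s, t), credit each intermediate node v
    # with the fraction of shortest s-t paths that pass through it.
    betweenness = {n: 0.0 for n in nodes}
    for s, t in combinations(nodes, 2):
        dist_s, paths_s = info[s]
        for v in nodes:
            if v in (s, t):
                continue
            dist_v, paths_v = info[v]
            if dist_s[v] + dist_v[t] == dist_s[t]:
                betweenness[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return degree, betweenness, closeness


degree, betweenness, closeness = centralities(EDGES)
for node in sorted(degree):
    print(node, degree[node], betweenness[node], closeness[node])
```

Note that closeness as defined here is a mean distance, so a smaller value indicates a more central protein; many software packages report its reciprocal instead.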