Honors 227 Fall 2003
Laboratory No. 10
Evolution at the Molecular Level
As discussed in lecture, one of the brilliant features of the theory of evolution is that it provides an
explanation for two seemingly disparate features of living systems:
Tremendous diversity in life forms (plants, animals and microbes) and
Tremendous similarity in life forms at the molecular level.
While you may not look much like a rat in a morphological sense, at the molecular level you share the vast
majority of your genome with the rat (not to mention a tree or an earthworm). Thus, evolution provides a
mechanism that explains both the diversity and the similarity of life at the same time.
The most salient features of the theory of evolution are the following:
Variation exists among individuals within a population/species in fitness-related traits (traits that
affect survival and reproduction);
Variation is genetically based; and
Natural selection favors individuals with a higher level of fitness, defined as the ability to compete
for limited resources and to reproduce.
Using these three features of evolution, one can explain a great deal of how organisms evolved at the level
of morphology, physiology and biochemistry.
The following biochemical sequence of events in a cell was part of last week’s lab, and the sequence is
common to most organisms on Earth (it is also important for this week’s lab):
[Diagram: DNA -> (transcription) -> mRNA -> (translation) -> protein]
Morphological differences emanate from the far right-hand side of this sequence as proteins affect cellular
and organismal traits. For example, the color of your eyes can be traced to specific proteins, which differ
between brown and blue-eyed individuals. These protein differences are due to the sequence of amino
acids that are linked together covalently in the ribosomes as determined by the mRNA (translation). In
turn, the sequence of nucleotides in the mRNA is genetically programmed by the DNA in the nucleus in
the process of transcription. A typical protein might have 500 amino acids in a linear sequence, and this
sequence might look like the following (all of which are individual amino acids, covalently bound):
There are four key points:
1. Each amino acid’s exact location in the sequence is genetically determined (thus, at Site No. 5,
alanine will always be present, and this will be the same for every copy of this protein produced
in your body, as determined by the corresponding gene);
2. Each type of amino acid can be used many different times (e.g., alanine is at Site Nos. 5 and
6; there are a total of 20 amino acids available for protein synthesis);
3. The functionality of the protein is determined by the sequence of amino acids, and even a single
amino acid substitution can alter the functionality of the protein; and
4. Occasionally, the DNA is changed by mutation, resulting in a new
amino acid being substituted (e.g., given a change in DNA, proline at Site No. 9 might be
replaced by tryptophan). This mutation is now an inherited trait, and all subsequent
proteins from this one gene will show this new amino acid at Site No. 9. Natural selection
will determine if the individual with this new protein has greater or lesser effectiveness.
As a consequence of this DNA-mRNA-amino acid linkage, if one knows the sequence of amino acids in a
protein, one can infer many features of the inheritance of that trait without doing DNA extractions and
DNA sequencing. This is the principle of today’s lab: to use the sequence of amino acids to understand how
DNA (the information broker of the cell) has changed over time in the evolution of animal species on
Earth.
As an example, if you had the following information, you should be able to infer the “relatedness” of
multiple species even if you did not know the species’ identity:
Species    No. of Amino Acid Differences
           (expressed as the number of amino acid differences from Species A)
A          0 (i.e., organism A is identical with itself)
B          45
C          2
D          21
E          14
In this case, you would argue that Species A is closely related to Species C (only 2 amino acid differences)
and distantly related to Species B (45 amino acid differences). Of the intermediate organisms (Species D
and E), Species A is more closely related to Species E (14 amino acid differences) than to Species D (21
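amino acid differences). Graphically, you could illustrate this analysis as follows, with the distance
between arrows roughly proportional to the differences in amino acids between Species A and the
remainder of the species:
[Figure: Species A, C, E, D and B marked along a line at 0, 2, 14, 21 and 45 amino acid differences from Species A.]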
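
As a rough illustration, the same ranking can be sketched in Python. The dictionary holds the five difference counts from the example above; the function names (rank_by_relatedness, draw_axis) are invented for this sketch and are not part of the lab materials.

```python
# Amino acid differences from Species A, taken from the example above.
differences = {"A": 0, "B": 45, "C": 2, "D": 21, "E": 14}


def rank_by_relatedness(diffs):
    """Return species names ordered from most to least similar to the reference."""
    return sorted(diffs, key=lambda name: diffs[name])


def draw_axis(diffs):
    """Draw one line of text with each species letter placed at its difference count."""
    width = max(diffs.values()) + 1  # one character per amino acid difference
    line = ["-"] * width
    for name, count in diffs.items():
        line[count] = name
    return "".join(line)


print(rank_by_relatedness(differences))
print(draw_axis(differences))
```

The text axis only works when no two species share the same count; a hand-drawn figure does not have that limitation.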
Another example of the technique you will be applying is best presented as an analogy. Suppose you are a
publisher/copy editor, having received five essays in support of an advertising campaign, and each essay
was limited to 1,000 words (total of 4,000 characters). Of the five essays, all showed a tremendous degree
of “relatedness” in that the essays were nearly identical (to the letter). The differences were very minor,
including less than 1% of the entire essay. The probability of this happening by chance alone is very small.
In looking at the five essays, it is noted that differences show a pattern. Essay No. 1 is excellent, without
any blemishes (no misspelling, no grammatical errors, no syntax concerns…must be an Honors student!).
Essay No. 2 is identical to that of Essay 1 with one exception: Word No. 357 is substituted with a new word
and the sentence’s syntax is nonsensical. In this case, one would argue that Essay 1 was the original and
Essay No. 2 is a derivative of Essay 1.
In further analysis of the essays, Essay No. 3 is identical to Essay No. 2, having the same “substitution” at Word
No. 357, plus another word substitution at Word No. 601. Essay No. 4 is identical to Essay No. 3 but has
an additional new substitution at Word No. 185. Essay No. 5 is nearly identical to Essay No. 4,
containing all the old substitutions (Word Nos. 357 + 601 + 185) plus three new ones (Word Nos. 27, 653
and 891).
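The order in which the essays were copied can be inferred from which substitutions each essay carries and
which it shares with the other essays. This sequence can be schematically portrayed as follows:
Essay No. 1 (original) -> Essay No. 2 (Word No. 357) -> Essay No. 3 (+ 601) -> Essay No. 4 (+ 185) -> Essay No. 5 (+ 27, 653 and 891)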
This essay-to-DNA analogy is very appropriate, since most DNA sequences are nearly identical among
species, with an occasional substitution in DNA resulting in a change in amino acid. Deciphering the
relatedness among species (which reflects evolutionary processes) essentially focuses on only the small
number of changes, ignoring the remainder that is identical. It is clear that of the five essays, all are related
and all (except the first) were “mutated” and copied with fidelity thereafter. Using this principle of
mutation and fidelity, a simple order of relatedness can be inferred.
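
The principle of mutation and fidelity can be sketched in a few lines of Python. This is only an illustration of the essay analogy: the word positions come from the text above, while the function names (order_by_inheritance, new_changes) are hypothetical. It assumes the essays form a single chain of copies, which is true of the analogy but not necessarily of real species.

```python
# Word positions that differ from Essay No. 1, as listed in the analogy above.
# The essays are deliberately entered out of order.
essays = {
    5: {357, 601, 185, 27, 653, 891},
    2: {357},
    4: {357, 601, 185},
    1: set(),
    3: {357, 601},
}


def order_by_inheritance(substitutions):
    """Order essays so that each one carries every change found in the one before it."""
    ordered = sorted(substitutions, key=lambda essay: len(substitutions[essay]))
    for earlier, later in zip(ordered, ordered[1:]):
        # A faithful copy must contain all changes of its source (subset test).
        if not substitutions[earlier] <= substitutions[later]:
            raise ValueError(f"Essay {later} is not a copy of Essay {earlier}")
    return ordered


def new_changes(substitutions, source, copy):
    """Return the changes that first appear in `copy` relative to `source`."""
    return substitutions[copy] - substitutions[source]


print(order_by_inheritance(essays))
print(sorted(new_changes(essays, 4, 5)))
```

The subset test is the "fidelity" half of the principle: an old substitution never disappears, so an essay that lacks one cannot be a later copy.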
In nature the source of the genetic variation among individuals is DNA mutations, which are then manifested
as amino acid differences in the proteins. The most common mutation in DNA is a single nucleotide
substitution (e.g., adenine for cytosine). These can occur spontaneously at a very low rate due to copy
errors, or they can be induced by the environment (e.g., UV-B and other radiation, or chemicals that are
mutagens). Once the DNA mutation is established, the replication process ensures that the mutation is
preserved with fidelity in all subsequent generations.
The objectives of this laboratory are the following:
Investigate the amino acid sequence of the protein myoglobin from 17 different species of
animals (myoglobin binds oxygen and helps store and move it within muscle tissue; hemoglobin carries oxygen in the blood);
Focusing on the amino acids that are mutations, establish the degree of relatedness among the
species; and
Answer the questions regarding the relationship between DNA, amino acids, genetic variation,
natural selection and evolution.
The most valuable methodology is simple deductive reasoning coupled with the procedures used in the
examples above. Table 1 (spreadsheet on legal-sized paper) presents 17 species of animals (Species 1-17;
column 1). The next series of columns presents the sequence of amino acids beginning with the first
through amino acid No. 152. The sequence is broken into 10 amino acid units simply to facilitate analysis
(nature does not use Base 10!!!). The abbreviations for the amino acids are not important except to know
that each letter refers to a different amino acid (there are a total of 20 different amino acids). The sequence
is shown in regular type and boldface type; boldface marks the amino acids that differ from those of
Species 1. In this case, Species 1 is the amino acid sequence for human myoglobin.
Occasionally in the table, you will notice that an amino acid has been deleted rather than
replaced. This too is a mutation, and it is marked with an “*”. When you are tallying the number of mutations,
include both substitutions and deletions.
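
If you had the sequences in electronic form, the tally could be sketched as below. The two ten-letter strings are invented placeholders, not real myoglobin data, and count_mutations is a hypothetical helper name. The sketch assumes both rows are already aligned site by site, as they are in Table 1.

```python
def count_mutations(reference, other):
    """Count sites where `other` differs from `reference`.

    A deletion is written as "*" in `other`, so it never matches the
    reference letter and is counted together with the substitutions.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for ref, obs in zip(reference, other) if ref != obs)


# Invented example rows (not real data): one deletion and one substitution.
species_1 = "ACDEFGHIKL"
species_x = "ACDE*GHIRL"
print(count_mutations(species_1, species_x))
```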
Exercise No. 1
Using Table 1, count the number of mutations that separate each species from Species
1. In essence, all you need to do is count the number of boldfaced letters in each row plus the number of
deletions (“*”). Record the number of mutations in Table 2.
Table 2
Species No.    Number of Mutations
1              0
2              ___________________
3              ___________________
4              ___________________
5              ___________________
6              ___________________
7              ___________________
8              ___________________
9              ___________________
10             ___________________
11             ___________________
12             ___________________
13             ___________________
14             ___________________
15             ___________________
16             ___________________
17             ___________________
To help visualize the pattern in the data, draw a diagram similar to that used in the examples, illustrating
the “relatedness” among species by having distance along the axis being proportional to the number of
mutations. Use Figure 1 (below) to illustrate this relatedness.
Figure 1
[Blank axis: number of mutations relative to Species 1, with species numbers written above the bars]
Draw a vertical bar for each species, placing it at a distance from Species 1 (Human) that is proportional to
its number of mutations. Indicate the species number above the bar. Do not include the other bars from the
previous figures; those were for illustrative purposes. The “tic” marks on the axis are increments of 10
mutation differences, ranging from 0 to 130.
Answer Questions No. 1-6.
Exercise No. 2.
Look at Species 14, 15 and 16 and note that these species have the exact same deletions within the first 10
sites of the protein.
Based on this observation it should be obvious that these three species are closely related to one another. The
alternative hypothesis is that these deletions occurred by chance solely in these three species and by chance
at the same two sites out of 150. That is a very low probability.
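
To get a feel for how low, here is a back-of-the-envelope sketch. It rests on assumptions that are not in the lab itself: each species has exactly two deletions, every pair of sites is equally likely, and the species mutate independently. The function name chance_of_same_sites is hypothetical.

```python
from math import comb


def chance_of_same_sites(total_sites=150, deleted_sites=2, other_species=2):
    """Probability that the other species hit the same deleted sites as the first.

    Assumes uniformly random, independent deletions (a simplification).
    """
    possible_pairs = comb(total_sites, deleted_sites)  # ways to choose the deleted sites
    return (1 / possible_pairs) ** other_species


print(comb(150, 2))
print(chance_of_same_sites())
```

Under these assumptions there are 11,175 possible pairs of sites, so the chance that two further species independently match the first is about 1 in 125 million. Shared ancestry is the far simpler explanation.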
Now look at the amino acid sequences of Species 14, 15 and 16 to see if there are other deletions that might
help further decipher the relatedness among just these three species. Answer Question No. 7.
The following is the key that relates Species No. to species name:
Species No.    Species Name
1              Human (Mammal)
2              Chimpanzee (Mammal)
3              Slug (Mollusc)
4              Pig (Mammal)
5              Otter (Mammal)
6              Sea Lion (Mammal)
7              Porpoise (Mammal)
8              Kangaroo (Mammal)
9              Whale (Mammal)
10             Elephant (Mammal)
11             Platypus (Mammal)
12             Mouse (Mammal)
13             Penguin (Bird)
14             Tuna (Fish)
15             Carp (Fish)
16             Shark (Fish)
17             Chicken (Bird)
Return to Exercise No. 1 (Table 2) and add the species name to this table. Are there any patterns that begin
to make sense based on your understanding of the biology of animal species? Are there species that are
more (or less) related to humans than you thought?
It has been calculated that on average it takes 100 Million years to accumulate 18 amino acid
differences between any two species. Given this information, one can project the actual time since
one species diverged from another biochemically. Using this ratio (100 M years/18 amino acids),
estimate the time since humans diverged from each of the other species. Enter the data in Table 3.
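Table 3
Species No.    Species Name           # Mutations (from Table 2)    Divergence in Millions of Years
1              Human (Mammal)         __________                    __________
2              Chimpanzee (Mammal)    __________                    __________
3              Slug (Mollusc)         __________                    __________
4              Pig (Mammal)           __________                    __________
5              Otter (Mammal)         __________                    __________
6              Sea Lion (Mammal)      __________                    __________
7              Porpoise (Mammal)      __________                    __________
8              Kangaroo (Mammal)      __________                    __________
9              Whale (Mammal)         __________                    __________
10             Elephant (Mammal)      __________                    __________
11             Platypus (Mammal)      __________                    __________
12             Mouse (Mammal)         __________                    __________
13             Penguin (Bird)         __________                    __________
14             Tuna (Fish)            __________                    __________
15             Carp (Fish)            __________                    __________
16             Shark (Fish)           __________                    __________
17             Chicken (Bird)         __________                    __________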
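
The conversion for Table 3 is a single proportion, sketched here in Python. The rate of 100 million years per 18 differences is the figure quoted above; the function name is hypothetical, and the inputs shown are arbitrary values chosen to make the arithmetic easy to check, not results from Table 2.

```python
def divergence_million_years(mutations, years_per_block=100, block_size=18):
    """Convert a mutation count into millions of years since divergence.

    Uses the ratio quoted in the lab: 100 million years per 18 differences.
    """
    return mutations * years_per_block / block_size


# Arbitrary inputs: 18 differences is one full "block", 9 is half of one.
print(divergence_million_years(18))  # 100.0
print(divergence_million_years(9))   # 50.0
```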
What is the maximum number of years of divergence between humans and any other species? Is this
number less than or greater than the time since the formation of the Earth (4.5 Billion years)?
Is the maximum time less than or greater than the time since the breakup of Pangaea (200 Million years ago)?
Suppose you had a mutation at Site No. 68 of the myoglobin gene that changed the DNA nucleotide
from thymine to guanine. As a result, the new myoglobin protein was only 25% as effective in
transporting oxygen as its predecessor. In light of the processes controlling evolution (i.e., natural
selection), what would be the fate of this new mutation? Explain your answer using evolutionary terms.
In the same predecessor myoglobin sequence, assume that the substitution resulted in the new
protein being 50% more effective at transporting oxygen. Using
evolutionary terms, explain what you would expect to happen in future generations. What factors
might control the rate at which this change would proceed? Phrased differently, would the
change at the species level happen immediately or would it require several to many generations? In
either case, what processes might control how fast the changes would occur?
In the same exercise above, assume that the new substitution does not affect the capacity of the
myoglobin to transport oxygen. In evolutionary terms, what would you expect
to happen with this new mutation (called a “neutral” mutation)?
Using Table 4 below, estimate the degree of similarity that exists between you and each of the
animals whose myoglobin you have investigated. Calculate the degree of similarity by assuming
that all species have 152 amino acids. Determine the number of amino acids that each species has in
common with humans (152 minus the number of mutations recorded in Table 2) and divide this number
by 152. Multiply by 100 to convert this ratio into a percentage. Enter the numbers in the table.
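Table 4
Species No.    Species Name           Amino Acids in Common with Humans    Amino Acids in Common with Humans (percentage)
1              Human (Mammal)         __________                           __________
2              Chimpanzee (Mammal)    __________                           __________
3              Slug (Mollusc)         __________                           __________
4              Pig (Mammal)           __________                           __________
5              Otter (Mammal)         __________                           __________
6              Sea Lion (Mammal)      __________                           __________
7              Porpoise (Mammal)      __________                           __________
8              Kangaroo (Mammal)      __________                           __________
9              Whale (Mammal)         __________                           __________
10             Elephant (Mammal)      __________                           __________
11             Platypus (Mammal)      __________                           __________
12             Mouse (Mammal)         __________                           __________
13             Penguin (Bird)         __________                           __________
14             Tuna (Fish)            __________                           __________
15             Carp (Fish)            __________                           __________
16             Shark (Fish)           __________                           __________
17             Chicken (Bird)         __________                           __________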
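
The percentage for Table 4 can be sketched the same way. The length of 152 amino acids is the assumption stated in the question; the function name and the sample input of 38 mutations are hypothetical, picked only because they give a round answer.

```python
def percent_similarity(mutations, length=152):
    """Percentage of sites shared with the human sequence.

    Sites in common = length minus the number of mutations (from Table 2).
    """
    if not 0 <= mutations <= length:
        raise ValueError("mutation count must lie between 0 and the sequence length")
    in_common = length - mutations
    return in_common / length * 100


# Hypothetical input: 38 mutations leaves 114 of 152 sites in common.
print(percent_similarity(38))  # 75.0
```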
Based on this analysis and assuming that the myoglobin gene is representative of your entire
genome, how much of your genome (percentage) is identical to that of the following (put %
underneath species)?
In light of the above, can you explain why pigs might be a source of transplant
organs, knowing that the closer the genetic similarity, the lower the chance that an organ would
be rejected following transplant?
Given the deletions highlighted in Exercise No. 2, discuss how you would interpret these data with
respect to the relatedness among the species.