Introduction to cladistics

Fossils & Evolution
Spring 2012
Due: Tuesday, Feb. 7
Lab Assignment (20 points):
Cladistics
Introduction to cladistics
Cladistics is a method of hypothesizing genealogic relationships among organisms. Like
other methods, it has its own set of assumptions, procedures, and limitations. Cladistics is
now the most widely used method for phylogenetic analysis and classification, because it
provides an explicit and testable hypothesis of evolutionary relationships.
The basic idea behind cladistics is that members of a natural group share a common
ancestor and are "closely related," more so to one another than to members of other
groups. These natural groups, or clades, are recognized by sharing “derived” features
that are not present in distant ancestors or other groups.
There are two basic assumptions in cladistics:
1. Organisms of a clade are related by descent from a common ancestor.
2. Change in characteristics occurs in lineages over time.
The first assumption is a general assumption in evolutionary biology. It essentially means
that life arose on Earth only once, and therefore all organisms are related in some way.
Because of this, we can take any collection of organisms and determine a meaningful
pattern of relationships, provided we have the right kind of information.
The second assumption, that characteristics of organisms change over time, is the most
important assumption in cladistics. It is only when characteristics change that we are able
to recognize different lineages or groups. The convention is to call the "original" state of
the characteristic “primitive” and the “changed” state “derived.”
Cladistic methodology
1. Choose the taxa whose evolutionary relationships interest you.
2. Choose an “outgroup” taxon: i.e., a taxon that does not belong to the clade you are
investigating (one that does not possess derived traits shared by members of the clade
you are investigating).
3. Determine the characters to be included in the analysis and examine each taxon to
determine the character-states for each character. All taxa must be unique.
4. Group taxa by shared derived characteristics, not original or "primitive" characteristics.
5. Work out conflicts that arise by some clearly stated method, usually parsimony.
6. Build a cladogram (tree) following these rules:
• All taxa go on the endpoints of the cladogram.
• All cladogram nodes must have one or more derived characteristics that are
present in all taxa above the node (unless the character is later modified).
• All derived characteristics appear on the cladogram only once unless the character
state originated more than once by convergent evolution.
Parsimony
In cladistics, the most parsimonious classification is the simplest one and the one
judged to be best supported by data. The simplest classification of a group of taxa is the
one that requires the fewest number of transformations from “primitive” character-states
to “derived” character-states. For example, consider the following two cladograms
(Figure 1). In Cladogram A, amphibians and reptiles are more closely related to each
other than either is to fish because both possess limbs, a “derived” trait that evolved in
the ancestor of amphibians and reptiles. Cladogram B shows an alternative hypothesis in
which fish and amphibians are more closely related to one another than either is to
reptiles. According to Cladogram B, however, the “derived” trait limbs evolved twice,
once in the ancestor to amphibians and a second time in the ancestor to reptiles.
Cladogram A is the more parsimonious hypothesis because it involves only a single
origin of the “derived” trait limbs.
[Cladograms A and B, each with the tips fish, amphibians, and reptiles; the origin of limbs is marked on the branches.]
Figure 1.—Cladograms showing two alternative hypotheses for relationships among fish, amphibians and reptiles.
In PAST software, the acronym “MPT” stands for “most parsimonious tree.” The most
parsimonious tree is the one whose length is shortest. Tree length is calculated as the
total number of character-state transformations (steps) required to produce a given
arrangement of taxa in a cladogram. In the example above, Cladogram A has a shorter
tree length than Cladogram B. When dealing with large data sets, multiple MPTs are
possible.
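The step counts for the two cladograms in Figure 1 can be reproduced with a minimal sketch of Fitch-style step counting. This is an illustration only, not PAST's implementation. It assumes a hypothetical all-primitive tip named "ancestor" attached at the root, so that limbs start in the absent state. The function name `fitch` and the nested-tuple tree layout are choices made for this example.

```python
def fitch(tree, states):
    """Return (possible_states, steps) for one character on a nested-tuple tree."""
    if isinstance(tree, str):            # a tip: its state is observed directly
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:                           # the two branches agree: no extra step
        return shared, left_steps + right_steps
    # the two branches disagree: one transformation is needed here
    return left_set | right_set, left_steps + right_steps + 1


# limbs: 0 = absent ("primitive"), 1 = present ("derived")
# "ancestor" is a hypothetical all-primitive tip (an assumption of this sketch)
limbs = {"ancestor": 0, "fish": 0, "amphibians": 1, "reptiles": 1}

cladogram_a = ("ancestor", ("fish", ("amphibians", "reptiles")))
cladogram_b = ("ancestor", (("fish", "amphibians"), "reptiles"))

length_a = fitch(cladogram_a, limbs)[1]
length_b = fitch(cladogram_b, limbs)[1]
print("Cladogram A:", length_a, "step(s)")   # 1
print("Cladogram B:", length_b, "step(s)")   # 2
```

The count only reports how many steps are needed. With reversals allowed, the two steps on Cladogram B could be two origins of limbs or one origin followed by one loss; either way Cladogram A is shorter.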
Assignment—Part 1 (Cladistic analysis of chocolate bars)
1. Begin by working in groups of four. Each group will receive a collection of five
“species” of chocolate bars. Construct a data matrix for cladistic analysis by determining
character-states for each “species” for the following characters (fill in the following
table):
Color: 0 = white; 1 = dark
Caramel: 0 = absent; 1 = present
Nougat: 0 = absent; 1 = present
Peanuts: 0 = absent; 1 = present
Almonds: 0 = absent; 1 = present
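DATA MATRIX
                              color    caramel    nougat    peanuts    almonds
Hershey’s White (outgroup)    _____    _____      _____     _____      _____
Hershey Bar                   _____    _____      _____     _____      _____
Hershey’s Almond              _____    _____      _____     _____      _____
Snickers                      _____    _____      _____     _____      _____
Snickers Almond               _____    _____      _____     _____      _____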
2. Work in pairs once the above table has been filled in. Create and save a file in PAST
that contains the same information as the table you’ve just filled in. [Rename rows and
columns. Note that PAST does not accept spaces, so use abbreviations.]
3. Create a cladogram. Select all data by clicking the gray button in the upper left corner
of the data matrix. Choose “Parsimony analysis” under the Cladistics menu. Use the following
settings: “Exhaustive” algorithm; “Fitch” optimization; 5 reorderings; 0 Bootstrap replicates.
Click the “Go” button to produce the cladogram. [Exhaustive search means that the
program will examine all possible trees in order to find the shortest. This is okay for
small data sets, but it is impractical for more than about 6 species. There are over
600,000,000 possible trees for a data set of 12 taxa. Fitch optimization means that the
program makes no assumptions about character-state transformations: i.e., a character can
change from 0 to 1 just as easily as from 0 to 2 (order is unimportant); and a character
can change from 0 to 1 just as easily as from 1 to 0 (reversals are acceptable).]
How many MPTs did the program find? _______________________
What is the Tree Length of these MPTs? _______________________
Notice that the “Parsimony analysis” window tells you how many trees were evaluated.
The “Cladogram lengths” window contains a histogram showing the number of trees of
various lengths (e.g., 10 trees that were each 11 steps; 4 trees that were each 10 steps,
etc.).
How many trees were evaluated? _______________________
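The growth in the number of possible trees mentioned in step 3 can be checked with a short calculation. The sketch below assumes that the figure quoted for 12 taxa refers to unrooted, fully bifurcating trees, for which the standard count is (2n − 5)!! (a double factorial) for n taxa. The function names are illustrative, and the count is not guaranteed to match the "trees evaluated" number that PAST reports.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def unrooted_tree_count(n_taxa):
    """Number of unrooted, fully bifurcating trees on n_taxa >= 3 labeled taxa."""
    return double_factorial(2 * n_taxa - 5)


for n in range(4, 13):
    print(n, "taxa:", unrooted_tree_count(n), "possible trees")
```

For 12 taxa this gives 654,729,075 trees, consistent with "over 600,000,000" in the note above.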
4. Print and label copies of the histogram and MPTs to examine and turn in to the
instructor. [It may be necessary to copy and paste into another application such as MS
Word or MS PowerPoint. Label each as “Tree 1,” “Tree 2,” etc., as necessary.]
5. Examine each of the MPTs. On the trees, label just below the nodes to indicate the
character-state transformations at each. The sum of character-state transformations should
equal the Tree Length.
6. Are there any instances of convergence (discuss briefly)?
7. Which of the trees do you favor personally (discuss briefly)?
Assignment—Part 2 (Cladistic analysis of living and extinct vertebrates)
In this exercise you will perform a cladistic analysis of jawed vertebrates, including both
living and extinct forms. Notice that this analysis includes the jawless lamprey as an
outgroup. Examine the following data matrix in which a character is scored “1” if present
and “0” if absent:

Characters (columns): 1 = jaws; 2 = limbs; 3 = amniote egg; 4 = palatal opening;
5 = antorbital opening; 6 = hole in hip socket; 7 = 3-toed foot; 8 = long arms;
9 = synapsid opening; 10 = 3 middle ear bones; 11 = placenta

                               1  2  3  4  5  6  7  8  9 10 11
Lamprey (outgroup)             0  0  0  0  0  0  0  0  0  0  0
therapsid (extinct)            1  1  1  0  0  0  0  0  1  0  0
Tyrannosaurus rex (extinct)    1  1  1  1  1  1  1  0  0  0  0
trout                          1  0  0  0  0  0  0  0  0  0  0
frog                           1  1  0  0  0  0  0  0  0  0  0
turtle                         1  1  1  1  0  0  0  0  0  0  0
sparrow                        1  1  1  1  1  1  1  1  0  0  0
kangaroo                       1  1  1  0  0  0  0  0  1  1  0
mouse                          1  1  1  0  0  0  0  0  1  1  1
8. Create and save a file in PAST that contains the information in the table you’ve just
examined. [Rename rows and columns. Note that PAST does not accept spaces, so use
abbreviations, if necessary.]
9. Create a cladogram. Select all data by clicking the gray button in the upper left corner
of the data matrix. Choose “Parsimony analysis” under the Cladistics menu. Use the following
settings: “Exhaustive” algorithm; “Fitch” optimization; 5 reorderings; 0 Bootstrap replicates.
Click the “Go” button to produce the cladogram.
How many MPTs did the program find? _______________________
What is the Tree Length of this (these) MPT(s)? _______________________
How many trees were evaluated? _______________________
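What an exhaustive search does can be sketched in a few lines: build every possible tree, score each one, and tally the lengths. The example below is a rough illustration under stated assumptions, not PAST's algorithm. It uses a six-taxon subset of the matrix above (chosen only to keep the run short), attaches the outgroup at the root of every ingroup arrangement, and counts steps Fitch-style. All function and variable names are made up for this sketch.

```python
from collections import Counter

# Characters, in order: jaws, limbs, amniote egg, palatal opening,
# antorbital opening, hole in hip socket, 3-toed foot, long arms,
# synapsid opening, 3 middle ear bones, placenta (1 = present, 0 = absent).
# Six-taxon subset of the handout's matrix (an assumption of this sketch).
MATRIX = {
    "lamprey": "00000000000",
    "trout":   "10000000000",
    "frog":    "11000000000",
    "turtle":  "11110000000",
    "sparrow": "11111111000",
    "mouse":   "11100000111",
}
OUTGROUP = "lamprey"


def insert_everywhere(tree, leaf):
    """Yield every tree made by attaching leaf to one branch of tree."""
    yield (leaf, tree)                   # attach on the branch above this subtree
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, leaf):
            yield (new_left, right)
        for new_right in insert_everywhere(right, leaf):
            yield (left, new_right)


def all_rooted_trees(taxa):
    """Yield every rooted, fully bifurcating tree on the given list of taxa."""
    if len(taxa) == 1:
        yield taxa[0]
        return
    for smaller in all_rooted_trees(taxa[1:]):
        yield from insert_everywhere(smaller, taxa[0])


def fitch(tree, char):
    """Return (possible_states, steps) for character index char on tree."""
    if isinstance(tree, str):
        return {MATRIX[tree][char]}, 0
    (lset, lsteps), (rset, rsteps) = fitch(tree[0], char), fitch(tree[1], char)
    shared = lset & rset
    if shared:
        return shared, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1


def tree_length(tree):
    """Total number of steps over all characters."""
    n_chars = len(MATRIX[OUTGROUP])
    return sum(fitch(tree, c)[1] for c in range(n_chars))


ingroup = [taxon for taxon in MATRIX if taxon != OUTGROUP]
scored = [((OUTGROUP, t), tree_length((OUTGROUP, t)))
          for t in all_rooted_trees(ingroup)]

histogram = Counter(length for _, length in scored)
shortest = min(histogram)
mpts = [tree for tree, length in scored if length == shortest]

print("trees evaluated:", len(scored))
for length in sorted(histogram):
    print(length, "steps:", histogram[length], "tree(s)")
print("MPTs found:", len(mpts))
```

With six taxa the sketch scores 105 trees. The printed tally plays the same role as the histogram in PAST's “Cladogram lengths” window.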
10. Print a copy of the cladogram.
11. On the printed copy of the cladogram use short horizontal lines to indicate the origin
of the various derived characters. Make sure to label each character. [Notice that the tree
window in PAST allows you to view the distribution of characters among the taxa being
investigated. You can do this by toggling up and down on the arrows next to the word
“Characters” in the grey area at the right of the pop-up window.] Turn in the annotated
cladogram to the instructor.
Which derived character defines the clade: sparrow + turtle + T. rex?
_________________________________
How does the turtle differ from the sparrow and T. rex? ____________________
_________________________________________________________________
Which derived character defines the clade: therapsid + kangaroo + mouse?
_________________________________
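Questions of this kind ask for characters whose derived state is present in every member of a clade and in no other taxon. A minimal sketch of that check follows, assuming the matrix is transcribed from the table above with characters in the order jaws through placenta. The order of the three characters shared only by T. rex and the sparrow is an assumption and does not affect the scores. The function name `defining_characters` is hypothetical.

```python
CHARACTERS = [
    "jaws", "limbs", "amniote egg", "palatal opening", "antorbital opening",
    "hole in hip socket", "3-toed foot", "long arms", "synapsid opening",
    "3 middle ear bones", "placenta",
]

# One string per taxon; position i holds the state of CHARACTERS[i]
# (1 = present, 0 = absent), transcribed from the handout's matrix.
MATRIX = {
    "lamprey":   "00000000000",
    "therapsid": "11100000100",
    "T. rex":    "11111110000",
    "trout":     "10000000000",
    "frog":      "11000000000",
    "turtle":    "11110000000",
    "sparrow":   "11111111000",
    "kangaroo":  "11100000110",
    "mouse":     "11100000111",
}


def defining_characters(clade):
    """List characters scored 1 in every clade member and 0 in all other taxa."""
    clade = set(clade)
    found = []
    for i, name in enumerate(CHARACTERS):
        holders = {taxon for taxon, row in MATRIX.items() if row[i] == "1"}
        if holders == clade:
            found.append(name)
    return found


print(defining_characters(["kangaroo", "mouse"]))   # ['3 middle ear bones']
```

The same function can be pointed at any other group of taxa in the matrix. An empty list means no single character is unique to that group.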
Assignment—Part 3 (Cladistic analysis of fossil sand dollars)
In this exercise you will perform cladistic analysis on ten species of fossil sand dollars
using data from Mooi et al. 2000 (Journal of Paleontology, v. 74, pp. 263–281).
Twenty-four characters have been selected and character-states for each character have been
determined for each of the ten species. The data matrix (Table 1) is thus a 10-row ×
24-column array. Drawings of the ten species are given in Figure 2.
Table 1.—Data matrix (download file at http://faculty.cns.uni.edu/~groves/)
Figure 2.—Drawings of ten sand dollar
species showing variation in shape,
ambulacra, and other characters.
The following is a list of the twenty-four characters and their possible character-states. In
each case, the “primitive” character-state is assigned the value 0 and “derived” states are
assigned the value 1 or 2.
1. Test outline: 0 = circular to slightly elongate; 1 = oval
2. Test height: 0 = highly domed; 1 = flat
3. Test edge: 0 = thick and rounded; 1 = thin and sharp
4. Intestine position: 0 = not within peripheral pillars; 1 = within peripheral pillars
5. Peripheral ballast system: 0 = simple; 1 = complex
6. Ambulacral basicoronal plates: 0 = short; 1 = long
7. Interambulacral basicoronal plates: 0 = small; 1 = large
8. Interambulacral first post-basicoronal plates: 0 = short; 1 = long
9. Peristome size: 0 = small; 1 = large
10. Periproct position: 0 = not in contact with basicoronal; 1 = touching basicoronal
11. Oral interambulacral columns: 0 = discontinuous; 1 = continuous; 2 = partially continuous
12. Oral interambulacral columns: 0 = wide; 1 = narrow
13. Interambulacral columns at edge: 0 = wide; 1 = attenuated; 2 = extremely attenuated
14. Ambulacral indentation at edge: 0 = absent; 1 = present
15. Festooned ambulacral lunules: 0 = absent; 1 = present
16. Anal lunule: 0 = absent; 1 = short; 2 = slot-like
17. Ridge around anal lunule: 0 = absent; 1 = present
18. Ambulacral pressure drainage channels: 0 = absent; 1 = present
19. Anal lunule pressure drainage channels: 0 = absent; 1 = present
20. Food groove branching: 0 = simple; 1 = complex
21. Petaloid non-respiratory podia: 0 = present; 1 = absent
22. Number of trailing podia: 0 = many; 1 = few
23. Geniculate spine fields: 0 = absent or poorly developed; 1 = well developed
24. Locomotory spine fields: 0 = indistinct; 1 = distinct
12. Go to http://faculty.cns.uni.edu/~groves/ and save the data file titled “Sand dollars.”
Open the file in PAST. Notice that species’ names have been abbreviated so that
Iheringiella patagoniensis is Iher, Amplaster coloniensis is Ampc, Amplaster ellipticus
is Ampe, and so on. Notice also that Proescutella is the outgroup: i.e., it does not
possess any of the derived traits shared by all other species in the group.
13. Select all data in the data matrix by clicking on the gray button in the upper left
corner of the matrix. Now perform cladistic analysis by choosing “Parsimony analysis”
under the Cladistics menu. Use the following settings: “Branch-and-bound” algorithm;
“Fitch” optimization; 5 reorderings; 0 Bootstrap replicates. Click the “Go” button to
produce a cladogram. Print a copy of the cladogram to examine and turn in to the
instructor. [It may be necessary to copy and paste into another application such as MS
Word or MS PowerPoint.]
(1) How many MPTs did the program generate? ____________
(2) What is the tree length of the MPT? ____________
14. If everything has worked correctly, your cladogram should indicate that
Monophoraster darwini and M. duboisi are more closely related to one another than
either is to any other species in the group. Identify two “derived” character-states that are
present in these two species but not in any other species in the group:
(1) ________________________
(2) ________________________
15. Your cladogram should indicate that the three species of Amplaster are more closely
related to one another than any of the three is to any other species in the group. Identify
the “derived” character-state shared only by these three species:
(1) ________________________
16. Your cladogram should indicate that Leodia sexiesperforata and Mellita
quinquiesperforata are more closely related to one another than either is to any other
species in the group. Identify two “derived” character-states possessed by these two
species but not by others:
(1) ________________________
(2) ________________________
17. According to the cladogram, Iheringiella patagoniensis is relatively distant from the
branches that end in Monophoraster and Amplaster species. Examine Table 1 and notice
that I. patagoniensis, Monophoraster spp. and Amplaster spp. all possess long
interambulacral first post-basicoronal plates (character #8, character-state = 1). Explain
how this “derived” character-state could be present in distantly related species: