Systematics and molecular phylogenetics lab

SYSTEMATICS AND MOLECULAR PHYLOGENETICS
Prelab Reading and Questions
Classifying Organisms
Have you ever noticed that when you see an insect or a bird, there is real satisfaction in giving it a name, and
an uncomfortable uncertainty when you can't? Along these same lines, consider the bewildering number and
variety of organisms that live, or have lived, on this earth. If we did not know what to call these organisms,
how could we communicate ideas about them, let alone the history of life? Thanks to taxonomy, the field of
science that classifies life into groups, we can discuss just about any organism, from bacteria to man.
Carolus Linnaeus pioneered the grouping of organisms based on scientific names using Latin. His system of
giving an organism a scientific name of two parts, sometimes more, is called binomial nomenclature, or "two-word
naming". His scheme was based on physical similarities and differences, referred to as characters.
Today, taxonomic classification is much more complex and takes into account cellular types and organization
as well as biochemical, proteomic, and genetic similarities. Taxonomy is but one aspect of a much larger field
called systematics.
Taxonomic ranks approximate evolutionary distances among groups of organisms. For example, species
belonging to two different superkingdoms are most distantly related (their lineages diverged from a common
ancestor in the distant past), with progressively more exclusive groups indicated by phylum, class and so on,
down to species. Taxonomists, scientists who classify living organisms, define a species as any group of closely
related organisms that can interbreed and produce fertile offspring. Two organisms are more closely "related"
as they approach the level of species, that is, they have more genes in common.
Taxonomic Classification of Man (Homo sapiens)
Superkingdom: Eukaryota
Kingdom: Metazoa
Phylum: Chordata
Class: Mammalia
Order: Primates
Family: Hominidae
Genus: Homo
Species: sapiens
Carolus Linnaeus is also credited with pioneering systematics, the field of science dealing with the diversity
of life and the relationships among life's components. Systematics reaches beyond taxonomy to elucidate new
methods and theories that can be used to classify species based on similarity of traits and possible mechanisms
of evolution, a change in the gene pool of a population over time.
Phylogenetic systematics is the field of biology that deals with identifying and understanding the
evolutionary relationships among the many different kinds of life on earth, both living (extant) and dead
(extinct). Evolutionary theory states that similarity among individuals or species is attributable to common
descent, or inheritance from a common ancestor. Thus, the relationships established by phylogenetic
systematics often describe a species' evolutionary history and, hence, its phylogeny, the evolutionary
relationships among organisms.
In phylogenetic studies, scientists used to rely on physical and biochemical characteristics to draw conclusions
about the evolutionary relatedness of organisms. Nowadays, physical, biochemical, genomic and proteomic data
are used to draw these conclusions. Scientists then show these evolutionary relationships among organisms
through illustrations called phylogenetic trees.
• Node: represents a taxonomic unit. This can be either an existing species or an ancestor.
• Branch: defines the relationship between the taxa in terms of descent and ancestry.
• Topology: the branching patterns of the tree.
• Branch length: represents the number of changes that have occurred in the branch.
• Root: the common ancestor of all taxa.
• Clade: a group of two or more taxa or DNA sequences that includes both their common ancestor and all
of their descendants.
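To make the tree vocabulary above concrete, here is a minimal sketch of a tree as a data structure. The class name `Node`, the species labels, and all branch lengths are hypothetical, chosen only for illustration; they do not come from any real data set.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A taxonomic unit: an existing species (leaf) or an inferred ancestor."""
    name: Optional[str] = None        # None for unnamed ancestors
    branch_length: float = 0.0        # amount of change on the branch leading to this node
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def clade(self) -> List[str]:
        """Names of all leaf descendants of this node (the clade it defines)."""
        if self.is_leaf():
            return [self.name]
        names: List[str] = []
        for child in self.children:
            names.extend(child.clade())
        return names


# Hypothetical tree; the branch lengths are made up for illustration.
ancestor_ab = Node(
    children=[Node("species A", 0.05), Node("species B", 0.06)],
    branch_length=0.10,
)
root = Node(children=[ancestor_ab, Node("species C", 0.30)])  # root: ancestor of all taxa

print(root.clade())         # every taxon in the tree
print(ancestor_ab.clade())  # the clade containing only species A and B
```

The topology is the nesting of `children`; changing a `branch_length` alters the drawn distances but not the topology.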
Questions
1) What is taxonomy? What is systematics?
2) What is binomial nomenclature?
3) What is phylogenetic systematics?
4) What do phylogenetic trees show?
5) What data were used in the past to construct phylogenetic trees? What data are used nowadays? Why is this
more reliable?
Mining Biological Databases on the Internet Lab
Objectives
Your performance will be satisfactory when you are able to
• Locate scientific publications and biological databases of DNA and protein sequences on the Internet
• Retrieve and compare sequence information from databases
• Compare evolutionary relatedness and draw phylogenetic trees from sequence comparisons
Procedure
Part A: Determine the evolutionary relatedness of species through comparisons of protein sequences
1. You will compare the sequences of the protein hemoglobin from bats, birds, and mammals. Decide whether you
want to work with α-hemoglobin or β-hemoglobin. These are the two protein chains that carry oxygen in the
circulatory systems of animals. Both of these proteins have been studied extensively in a large number of species and
should work equally well. Alternatively, you may want to collaborate with a partner and do companion searches, one
searching with α-hemoglobin and the other with β-hemoglobin. At the end of the exercise, you can compare your
results to determine whether the two proteins show the same evolutionary relationships among
species of bats, birds, and mammals.
2. Go to http://www.uniprot.org/
3. In the Enter search keyword box, type “alpha hemoglobin” and click submit. The results of this search will
come up on your screen. How many protein sequences were reported to you from this query?
4. You may scroll down and look through this long list of α-hemoglobin sequences for one from a bat species,
but it may be faster to narrow your search. Go back to the Enter search keyword box, type “bat alpha
hemoglobin” and click submit. When you get the results of this search, how many sequences of alpha
hemoglobin did you get for bat species?
NOTE: Check the species names and common names for each of the α-hemoglobins in this
sequence report to make sure that they are bat sequences. Sometimes a search won’t recognize the
difference, for example, between “bat” and some other word that contains it, such as “wombat”!
5. Select one bat α-hemoglobin sequence to save to a Word document by clicking on the entry for that
protein sequence (it is located to the left of the entry name/accession code). An accession code is the
identifier under which a protein sequence is archived in a database. The entry names for all
α-hemoglobins begin with the letters “HBA.”
NOTE: If you are working with a partner who is doing a companion study with β-hemoglobin,
you will have to collaborate on your selection of which bat species' sequences to save.
6. The page that opens will contain information about the sequence, such as the taxonomy of the organism
from which it came. Scroll down to the Sequence section and click on the link for FASTA format. This
will bring up a page containing the protein sequence written with single-letter designations of the amino
acids. FASTA is the best way to save sequence information, because it is a plain-text format that most
sequence analysis programs can read.
7. Copy and paste this sequence into a Word document. Rename the sequence with an appropriate name.
8. Return to the web page with the list of bat alpha hemoglobin sequences (clicking twice on the Back button
on the web browser will get you there). Identify another sequence for a bat α-hemoglobin and repeat the
process of copying the FASTA-formatted amino acid sequence into your Word document.
9. When you have saved two α-hemoglobin sequences from two bat species, repeat steps 3-8 to get 2
sequences from bird species and 2 sequences from mammalian species. It doesn’t matter which species you
choose, as long as 2 are from birds and 2 are from mammals. You might want to choose species that you
think are related to bats. If you are collaborating with a partner searching for β-hemoglobin sequences, your
partner should search for the same species that you have chosen. (Note: make sure that all of your sequences
are comparable. In other words, they must all be the same subunit, for example all HBA-1 or alpha
subunit 1, and all approximately the same length.)
=> IMPORTANT: Be aware that if you are limiting your search for bird α-hemoglobin sequences with
the keyword “bird,” the search will only locate protein entries where the word “bird” appears. If the
entry was archived under other descriptions such as “hawk” or “eagle” or “penguin,” you will not find
entries using the keyword “bird.”
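The keyword problem described above can be avoided by restricting a search to a taxonomic group instead of a word. The sketch below only builds such a search URL and does not contact any server. The endpoint `https://rest.uniprot.org/uniprotkb/search` and the query fields `gene` and `taxonomy_id` are assumptions about the current UniProt REST interface and should be checked against the UniProt help pages. The function name `build_query_url` is hypothetical. The numbers are NCBI taxonomy identifiers for Chiroptera (bats), Aves (birds), and Mammalia.

```python
from urllib.parse import urlencode

# Assumed UniProt REST endpoint; verify against the UniProt documentation before use.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"

# NCBI taxonomy identifiers for the three groups compared in this lab.
TAXA = {"bats": 9397, "birds": 8782, "mammals": 40674}


def build_query_url(gene: str, taxon_id: int, size: int = 25) -> str:
    """Build a search URL for one gene restricted to one taxonomic group."""
    params = {
        "query": f"gene:{gene} AND taxonomy_id:{taxon_id}",  # assumed query field names
        "format": "fasta",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


for group, taxon_id in TAXA.items():
    print(group, build_query_url("HBA", taxon_id))
```

Because the restriction is by taxonomy rather than by text, an entry archived under “hawk” or “penguin” would still be matched by the bird query, and “wombat” would not be matched by the bat query.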
10. When you have saved six α-hemoglobin sequences to your Word document (two from bats, two from
birds, and two from mammals), go to http://clustalw.genome.ad.jp . CLUSTALW is a computer
program that aligns many sequences at a time, so that you can look for sequence similarities and
see the regions where the sequences align.
11. Copy all of your sequences from the Word document and paste them into the entry box in CLUSTALW.
(Make sure that protein sequence is selected.) Note that the sequence descriptions preceded by
the “>” mark will be copied in with the protein sequences. This will not be a problem with your search.
Without changing any of the default settings, click on the blue Execute Multiple
Alignment bar.
12. The next page will show the alignment of amino acid sequences for the 6 proteins that you have retrieved
from the SWISSPROT database, using the single-letter designations for amino acids. An asterisk will
appear along the bottom row of the alignment at each position where the same amino acid is
found in all 6 proteins. These amino acids are said to be highly conserved, since they haven’t changed
since these species diverged from a common ancestor.
a. How many of the amino acids are found to be the same in all of the 6 α-hemoglobin sequences in
your alignment?
b. What percentage of all the α-hemoglobin amino acids are conserved in all 6 proteins?
c. Are there any specific regions of the α-hemoglobin sequences that are especially conserved?
Is one end of the molecule more conserved than the other? Describe your observations.
d. Are there any amino acids that appear more frequently in conserved regions of the protein than in
the nonconserved regions? If so, which amino acids are they? Go to the table at the end of this
Lab Exercise to decode the single-letter designation for amino acids.
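The counting asked for in questions a and b can be sketched in a few lines. This is an illustration only: the function name `conserved_columns` is hypothetical, and the three short sequences are toy fragments, not real hemoglobin data. A column counts as conserved when every sequence has the same residue there and none has a gap (“-”), which corresponds to the asterisk positions in the alignment report.

```python
def conserved_columns(aligned):
    """Return indices of alignment columns where every sequence has the same residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i for i in range(length)
        if aligned[0][i] != "-" and all(seq[i] == aligned[0][i] for seq in aligned)
    ]


# Toy alignment (hypothetical fragments, '-' marks a gap); not real hemoglobin data.
alignment = [
    "VLSPADKTNV",
    "VLSAADKTNV",
    "VLSPADK-NI",
]
conserved = conserved_columns(alignment)
percent = 100 * len(conserved) / len(alignment[0])
print(len(conserved), f"{percent:.1f}%")  # 7 conserved columns out of 10, i.e. 70.0%
```

Here the percentage is taken over the alignment length, gaps included; dividing by the length of one ungapped sequence instead is an equally defensible choice, so state which one you used.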
13. At the top of your CLUSTALW report, you will find the percentage of amino acids that are identical
for each pair of sequences, compared two at a time. For example, if your report says
“Sequences (1:2) Aligned. Score: 87.2”, this means that when the first two sequences you submitted
were aligned, 87.2% of the amino acids were identical in both sequences. Transfer these percentages into a
table in which the species whose sequences you have aligned are the headers for both the columns and
the rows. Your table should look similar to this:
Table 1: Percent identity in amino acid alignment

SPECIES             (bat #1)  (bat #2)  (bird #1)  (bird #2)  (mammal #1)  (mammal #2)
(bat #1 name)       100
(bat #2 name)       87.2      100
(bird #1 name)                          100
(bird #2 name)                                     100
(mammal #1 name)                                              100
(mammal #2 name)                                                           100
Notice that you need not fill out both halves of this table, since the information is redundant.
From this table, can you see whether the α-hemoglobin sequences are more similar for bats and birds,
compared with bats and mammals? What does this suggest about the evolutionary relatedness of these
species? Which species diverged from each other the most recently and have the most recent common
ancestor? Which species have been divergent from each other the longest and have the most ancient common
ancestor? From the information in this table, you should be able to predict whether bats are more closely
related to birds or to mammals.
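The half-filled table described in this step can also be computed directly from an alignment. The sketch below uses toy fragments with neutral labels, and the function names `percent_identity` and `identity_table` are hypothetical. It counts identical residues over the columns where neither sequence has a gap; the scores that CLUSTALW reports may be calculated somewhat differently, so treat the numbers as an approximation of the report rather than a reproduction of it.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of compared columns (neither sequence gapped) with identical residues."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100 * matches / len(pairs)


def identity_table(aligned):
    """One entry per unordered pair, i.e. one half of the square table."""
    return {
        (n1, n2): round(percent_identity(aligned[n1], aligned[n2]), 1)
        for n1, n2 in combinations(aligned, 2)
    }


# Toy alignment with neutral labels; not real hemoglobin data.
aligned = {
    "seq1": "VLSPADKTNV",
    "seq2": "VLSAADKTNV",
    "seq3": "VLSPADK-NI",
}
table = identity_table(aligned)
for (n1, n2), score in table.items():
    print(f"{n1} vs {n2}: {score}")
```

Only one entry per pair is stored because identity is symmetric, which is the same reason only one half of Table 1 needs to be filled in.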
14. A phylogenetic tree can present the relatedness of species from sequence similarity data such as your Table
1. These trees link species that are more closely related in branches, and the length of the branches represents
their evolutionary distance. You can draw a phylogenetic tree from your amino acid alignment report by pairing
species that have the most sequence similarities to make short branches. Species that have fewer sequence
similarities will branch from each other farther apart on the tree.
The CLUSTALW page on which your report appears will automatically draw a phylogenetic tree
for you. At the bottom of the page, click on the drop-down select tree menu and choose one of the rooted
phylogenetic tree options (check out the unrooted option too). Print out or copy the tree that appears on the
screen.
Does the information on this tree agree with your analysis above of the Percent identity in amino acid
alignment for α-hemoglobins table? Explain.
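The idea of pairing the most similar species first can be sketched as a simple clustering procedure. The example below uses UPGMA (repeatedly merge the two closest clusters and average their distances to the rest). UPGMA is swapped in here because it is the simplest method to show; the CLUSTALW server may build its tree with a different method, such as neighbor joining, so the result need not match the server's tree. The function name `upgma`, the labels A to D, and the distances are hypothetical. A distance could be taken, for example, as 100 minus the percent identity. The output is a Newick string describing topology only, without branch lengths.

```python
def upgma(names, dist):
    """Cluster taxa by average distance; return a Newick string (topology only)."""
    sizes = {n: 1 for n in names}                      # cluster label -> number of leaves
    d = {frozenset(k): v for k, v in dist.items()}     # unordered pair -> distance
    while len(sizes) > 1:
        # Pick the closest pair of clusters (ties broken alphabetically).
        a, b = min(
            (tuple(sorted(pair)) for pair in d),
            key=lambda p: (d[frozenset(p)], p),
        )
        merged = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        for c in sizes:
            if c in (a, b):
                continue
            # Size-weighted average distance from the merged cluster to cluster c.
            d[frozenset((merged, c))] = (
                sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
            ) / new_size
        # Drop every distance that involves the two clusters just merged.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes)) + ";"


# Hypothetical distances between four taxa (for example, 100 minus percent identity).
distances = {
    ("A", "B"): 10, ("A", "C"): 16, ("B", "C"): 18,
    ("A", "D"): 40, ("B", "D"): 42, ("C", "D"): 41,
}
print(upgma(["A", "B", "C", "D"], distances))  # (((A,B),C),D);
```

In the result, A and B are paired first because they have the smallest distance, C joins them next, and D, the most distant taxon, branches off closest to the root.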
15. One way to evaluate the validity of the phylogenetic tree that you drew for bats, birds, and mammals is to
compare it with trees constructed from sequences of other proteins. Compare your tree with a tree
constructed by your partner searching for the β-hemoglobin sequences. Does your tree agree with theirs?
Are the relative lengths of the branches the same?
16. Repeat the comparisons that you made (steps 1-15 above) with other species such as the following:
a. Compare whales to mammals and fish.
b. Compare reptiles to birds and mammals.
Print out or draw the phylogenetic tree for each of these and write a brief conclusion about your findings.
Systematics and molecular phylogenetics lab helpful hints
“Accession code” = entry
Scroll down to “Sequence” and click on FASTA.
The file will look similar to this:
>sp|P11753|HBA_CYNSP Hemoglobin subunit alpha OS=Cynopterus sphinx GN=HBA PE=1 SV=1
VLSPADKTNVKAAWDKVGGNAGEYGAEALERMFLSFPTTKTYFPHFDLAHGSPQVKGHGK
KVGDALTNAVSHIDDLPGALSALSDLHAYKLRVDPVNFKLLSHCLLVTLANHLPSDFTPA
VHASLDKFLASVSTVLTSKYR
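A FASTA file like the one above has a simple structure: a header line starting with “>” followed by one or more lines of sequence. The sketch below reads that structure and joins the sequence lines. The function name `parse_fasta` is hypothetical, and the way the entry name is pulled out of the header assumes the `sp|accession|entry name` layout seen in this example.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header line (without the leading '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


example = """>sp|P11753|HBA_CYNSP Hemoglobin subunit alpha OS=Cynopterus sphinx GN=HBA PE=1 SV=1
VLSPADKTNVKAAWDKVGGNAGEYGAEALERMFLSFPTTKTYFPHFDLAHGSPQVKGHGK
KVGDALTNAVSHIDDLPGALSALSDLHAYKLRVDPVNFKLLSHCLLVTLANHLPSDFTPA
VHASLDKFLASVSTVLTSKYR
"""

records = parse_fasta(example.splitlines())
for header, sequence in records.items():
    entry_name = header.split("|")[2].split()[0]  # third '|' field, first word
    print(entry_name, len(sequence))              # HBA_CYNSP 141
```

Checking the sequence lengths this way is a quick test of the requirement in step 9 that all sequences be approximately the same length.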