Using a Genome browser to observe evolutionary patterns

Module 11. Genome annotation: Using a genome browser to observe evolutionary patterns
Background
Species evolve over time. Evolution is the consequence of the interactions of 1) the potential for a species
to increase its numbers, 2) the genetic variability of offspring due to mutation and recombination of genes,
3) a finite supply of the resources required for life, and 4) the ensuing selection by the environment of
those offspring better able to survive and leave offspring. The great diversity of organisms is the result of
more than 3.5 billion years of evolution that has filled every available niche with life forms. Natural
selection and its evolutionary consequences provide a scientific explanation for the fossil record of ancient
life forms as well as for the striking molecular similarities observed among the diverse species of living
organisms. The millions of different species of plants, animals, and microorganisms that live on earth
today are related by descent from common ancestors. Biological classifications are based on how
organisms are related. Organisms are classified into a hierarchy of groups and subgroups based on
similarities which reflect their evolutionary relationships.
National Science Education Standards, p. 185
Once a novel whole genome has been sequenced, the research community can use the sequence and
accompanying information more easily through genome and table browsers, such as those at
UCSC. The Generic Model Organism Database (GMOD) has been developed for researchers to post
genomes and plot information. We will explore genomes at the UCSC genome browser, which contains
many finished genomes and is supported by a large group of dedicated researchers. Researchers have
aligned whole genome sequences from many different species. Now that you’ve gained
some experience in understanding the fine structure of a gene in the previous module, let’s see what the
same gene, beta-globin, looks like when compared among many different species. Do you think it is
possible for mutations at a single gene to reveal phylogenetic relationships among vertebrates?
Goal
To give a basic understanding of genes and their functions.
To navigate around a genome, using a genome browser.
To give an understanding of how DNA comparisons reflect evolutionary history and gene function.
V&C Competencies
1) Ability to apply the process of science: Observational strategies, Hypothesis testing
2) Ability to use quantitative reasoning: Developing and interpreting graphs, Applying statistical
methods to diverse data
4) Understand the interdisciplinary nature of science: Chemistry of molecules and biological systems
Protocols
1. Go to the UCSC genome browser web page at genome.ucsc.edu, select Genomes, and maximize
the window.
2. Select the clade “Mammal” and the genome “Human,” clear all text from “position or search term,” and
enter the abbreviation for human beta-globin, “HBB,” as a search term. Wait a moment and you
will get a list of search results. Select HBB: Homo sapiens hemoglobin beta.
3. You will now see a close-up view of the beta-globin gene in humans. Some sections on your
screen may look different from the figure below, but the top should be similar. Find the browser
navigation tools, the exact location of the gene on chromosome 11, the sketch of the gene, and
the miscellaneous information tracks below the gene sketch.
[Figure: annotated view of the genome browser. Labels: Navigation tools; Exact location shown; Location on chromosome; View of gene; Miscellaneous information tracks]
4. The beta-globin gene is in the reverse orientation in this view of the chromosome (arrows on the
sketch point to the left). Before looking deeper, scroll down and reverse the orientation of the
gene so that it matches the orientation of the gene from the hand-annotation exercise you
performed earlier. (Worksheet 2A, Question 4)
5. Scroll up to the picture and see if you can reconcile the genome browser’s gene sketch with your
hand sketch of this gene. How are UTRs, coding sequences, and introns depicted in the
browser? Roughly redraw the sketch below, and label the major pieces.
6. Scroll down to the Comparative Genomics track controls and adjust the settings so that
“conservation” is set to “full.” This will give your browser the most expanded view of the
alignment among many different species, which allows you to see how similar (i.e., conserved)
each nucleotide is among many different vertebrates. Then hit “refresh” and scroll back up to
view the changes.
7. Under “Multiz alignment of 46 (your number might differ) species” you will see vertical bars
corresponding to each nucleotide location in the genome. The taller the bars, the more
conserved the DNA sequence is at that location, based on pairwise comparison of the human
sequence versus the species identified on the left of the screen.
8. To complete the lab, continue from this point to answer the questions on the “Evolution
Worksheet” below.
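The bar heights described in step 7 can be mimicked offline with a few lines of Python. This is only a minimal sketch: it assumes that simple percent identity per window is an acceptable stand-in for the browser's own scoring, which it does not reproduce. The function name `window_identity`, the window size, and the two aligned fragments are made up for illustration and are not real HBB data.

```python
def window_identity(human, other, window=5):
    """Percent identity of `other` to `human` in consecutive windows.

    Both inputs are rows of the same alignment, so they must have equal
    length; '-' marks a gap. Columns with a gap in either row are skipped.
    """
    if len(human) != len(other):
        raise ValueError("aligned sequences must have the same length")
    scores = []
    for start in range(0, len(human), window):
        pairs = [
            (h, o)
            for h, o in zip(human[start:start + window],
                            other[start:start + window])
            if h != "-" and o != "-"
        ]
        if not pairs:
            scores.append(None)  # nothing to compare in this window
            continue
        matches = sum(h.upper() == o.upper() for h, o in pairs)
        scores.append(100.0 * matches / len(pairs))
    return scores


# Made-up aligned fragments, for illustration only (not real HBB data).
human = "ATGGTGCACCTGACT"
other = "ATGGTGCATCT-ACA"

for i, score in enumerate(window_identity(human, other)):
    # Draw a crude text "bar": one '#' per 10 percentage points.
    bar = "" if score is None else "#" * round(score / 10)
    print(f"window {i}: {score} {bar}")
```

Longer bars correspond to windows where the two rows agree more often, which is the same visual cue the protocol asks you to read from the browser.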
Assessment
Worksheet: Annotation
Name: ________________________
1. Which species on your screen looks most similar to the human sequence? Does that make
phylogenetic sense? Explain.
2. Does it look like some regions of the gene are more conserved than others?
a. What evidence supports your answer?
b. Which regions of the gene appear more conserved?
c. Why might that be the case?
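One way to back up an answer to question 2 with numbers is to compare percent identity inside and outside the coding sequence. The sketch below assumes you have copied two aligned rows out of the browser and know which alignment columns belong to which region. The name `region_identity`, the region boundaries, and the sequences are hypothetical placeholders, not real HBB data.

```python
def region_identity(human, other, regions):
    """Percent identity within named regions of a pairwise alignment.

    `regions` maps a label to a (start, end) pair of 0-based,
    end-exclusive alignment columns. Gap columns ('-') are ignored.
    """
    result = {}
    for label, (start, end) in regions.items():
        compared = 0
        matches = 0
        for h, o in zip(human[start:end], other[start:end]):
            if h == "-" or o == "-":
                continue
            compared += 1
            if h.upper() == o.upper():
                matches += 1
        result[label] = 100.0 * matches / compared if compared else None
    return result


# Made-up aligned rows and made-up region boundaries (illustration only).
human = "ATGGTGCACC" + "GTAAGTCTAT"
other = "ATGGTCCACC" + "GTGAACCAAT"
regions = {"exon": (0, 10), "intron": (10, 20)}

print(region_identity(human, other, regions))
```

With these toy rows the exon columns agree more often than the intron columns, which is the kind of contrast parts (a) and (b) ask you to look for in the real tracks.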
3. Zoom in all the way to the “base level” at chromosome 11:5,247,991-5,248,191. You can do this
by entering “chr11:5,247,991-5,248,191” in the search window. Alternatively, you can center the
view on the first intron/exon boundary by “double-click-drag” and zoom to base view. You should
be looking at the first exon/intron boundary in forward orientation. You only need to see
the first boundary.
a. Does this intron/exon border follow the GT/AG “rule”? Explain.
b. You should see nine species in the browser viewer, with zebrafish (Danio rerio) on the
bottom. If the alignments are hidden, click on the text “Multiz Alignment of 46
vertebrates” to expand. What nucleotides (A, G, C, T) do you observe in each of the first four
positions of the intron over all nine species? List them in the table below. For each
position, what percentage of the nine species are identical (to human)? Ignore gaps in your
calculations.
Position in intron           | 1 | 2 | 3 | 4
Nucleotides observed         |   |   |   |
% identity among all species |   |   |   |
c. What could explain the variation in conservation among the four positions that you see in
(b)?
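The checks in question 3 can also be sketched in Python, which is handy for verifying a hand count. This is an illustrative sketch under stated assumptions: the helper names (`reverse_complement`, `follows_gt_ag`, `position_identity`) are invented here, the species rows are made-up placeholders rather than real browser output, and the human row is assumed to contain no gaps. The reverse-complement helper matters because, as noted in the protocol, the gene is drawn in reverse orientation by default.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the sequence as read from the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def follows_gt_ag(intron):
    """True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    return intron.startswith("GT") and intron.endswith("AG")


def position_identity(rows, reference="Human"):
    """Per column: (nucleotides observed, % of non-gap rows matching human).

    `rows` maps a species name to its aligned sequence. Gaps ('-') are
    ignored. The reference row is assumed to have no gaps.
    """
    ref = rows[reference].upper()
    summary = []
    for i, ref_base in enumerate(ref):
        column = [seq[i].upper() for seq in rows.values() if seq[i] != "-"]
        observed = sorted(set(column))
        percent = 100.0 * column.count(ref_base) / len(column)
        summary.append((observed, percent))
    return summary


# A made-up intron as it would appear in the default (reverse) view.
reverse_view = "CTAAAAAC"
print(follows_gt_ag(reverse_complement(reverse_view)))

# Made-up first four intron positions for four species (illustration only).
rows = {
    "Human": "GTTG",
    "Mouse": "GTGA",
    "Chicken": "GTAA",
    "Zebrafish": "GT-A",
}
for position, (observed, percent) in enumerate(position_identity(rows), 1):
    print(position, "/".join(observed), round(percent, 1))
```

The human row is counted among the species, following the wording of part (b); drop it from `rows` before tallying if your instructor wants it excluded.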
4. Write down the amino acid sequence for the six amino acids right before the intron begins.
a. Fill in the box below to calculate the fraction of the 6 sites at which all species are identical, and
then repeat it just for mammals. Fill in the amino acids found at each position and the percent
identity among the relevant species. Also calculate the percentage of the 6 sites that are
biochemically “similar” among all species, and among only mammals.
All species:
Position                              | 1 | 2 | 3 | 4 | 5 | 6 | % of the 6 sites that are identical among all spp, or similar among all spp
Amino acids observed over all species |   |   |   |   |   |   |
Identical? (Y/N)                      |   |   |   |   |   |   |
Similar? (Y/N)                        |   |   |   |   |   |   |

Mammals only:
Position                              | 1 | 2 | 3 | 4 | 5 | 6 | % of the 6 sites that are identical among mammals or similar among mammals
Amino acids observed over mammals     |   |   |   |   |   |   |
Identical? (Y/N)                      |   |   |   |   |   |   |
Similar? (Y/N)                        |   |   |   |   |   |   |

b. How do you explain the patterns detected above?
c. Do you think the patterns you’ve been describing are unique to beta-globin, or general?
Support your idea by going to the HBD (Homo sapiens hemoglobin delta) gene and repeating the
amino acid analysis above.
All species:
Position                              | 1 | 2 | 3 | 4 | 5 | 6 | % of the 6 sites that are identical among all spp, or similar among all spp
Amino acids observed over all species |   |   |   |   |   |   |
Identical? (Y/N)                      |   |   |   |   |   |   |
Similar? (Y/N)                        |   |   |   |   |   |   |

Mammals only:
Position                              | 1 | 2 | 3 | 4 | 5 | 6 | % of the 6 sites that are identical among mammals or similar among mammals
Amino acids observed over mammals     |   |   |   |   |   |   |
Identical? (Y/N)                      |   |   |   |   |   |   |
Similar? (Y/N)                        |   |   |   |   |   |   |
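The tallies in question 4 are repetitive enough that a short script can help check them, for HBB and again for HBD. The sketch below is not a definitive method. The grouping of amino acids into biochemically "similar" classes is one common textbook scheme chosen here as an assumption, so use whatever grouping your course defines. The names `SIMILARITY_GROUPS`, `site_summary`, and `percent_sites` are invented for this example, and the peptides are made-up placeholders rather than sequences read from the browser.

```python
# ASSUMPTION: one common textbook grouping of amino acids by side chain.
SIMILARITY_GROUPS = {
    "nonpolar": "GAVLIMP",
    "aromatic": "FWY",
    "polar": "STCNQ",
    "positive": "KRH",
    "negative": "DE",
}
GROUP_OF = {aa: name for name, members in SIMILARITY_GROUPS.items()
            for aa in members}


def site_summary(rows):
    """For each aligned site: (amino acids observed, identical?, similar?).

    `rows` maps a species name to an aligned peptide of equal length,
    written with one-letter codes. Gaps ('-') are ignored. Codes outside
    the 20 standard amino acids raise KeyError.
    """
    length = len(next(iter(rows.values())))
    sites = []
    for i in range(length):
        observed = sorted({seq[i].upper() for seq in rows.values()
                           if seq[i] != "-"})
        identical = len(observed) == 1
        similar = len({GROUP_OF[aa] for aa in observed}) == 1
        sites.append((observed, identical, similar))
    return sites


def percent_sites(sites):
    """Return (% of sites identical, % of sites similar)."""
    n = len(sites)
    identical = 100.0 * sum(1 for s in sites if s[1]) / n
    similar = 100.0 * sum(1 for s in sites if s[2]) / n
    return identical, similar


# Made-up six-residue peptides (illustration only, not browser output).
all_species = {
    "Human": "EALGRL",
    "Mouse": "EALGRL",
    "Chicken": "EALARL",
    "Zebrafish": "DALSKL",
}
mammals = {name: all_species[name] for name in ("Human", "Mouse")}

for label, rows in (("all species", all_species), ("mammals", mammals)):
    identical, similar = percent_sites(site_summary(rows))
    print(f"{label}: {identical:.1f}% identical, {similar:.1f}% similar")
```

Under this grouping, a site counts as similar whenever every observed residue falls in the same class, so identical sites are always similar as well.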