Genomics (BIO 294) Laboratory 1, Spring 2011
GOAL: To become familiar with the Entrez Sequence Viewer and collect some simple statistics
on the Haemophilus influenzae genome.
URL: http://www.ncbi.nlm.nih.gov/sites/genome
PREPARATION: None
ASSIGNMENT: At the end of class, email me the completed laboratory report form.
Entrez Genome is the central database of genome sequences at the US National Center for
Biotechnology Information (NCBI). In this laboratory, we will explore the genome of the
bacterium Haemophilus influenzae as a way to gain some familiarity with the Entrez Sequence
Viewer and to begin thinking about genomes and the information in them.
1. From the Entrez Genome home page, click Prokaryota. This takes you to a table with
over 1300 genomes of bacteria and archaea, listed in alphabetical order. The columns in
the table detail a variety of information about each genome, including Size (in
megabases) and the date when each sequence was Released.
2. We will be exploring the first genome of a cellular organism to be completely
sequenced, that of Haemophilus influenzae. Scroll down until you find the Haemophilus
influenzae genome with the earliest release date. On your report sheet, note when its
sequence was released. Now click on the NC ID for that genome (in the eleventh column
of the table).
3. This takes you to a page for that genome, with some summary statistics on the genome.
On your report, note the length of the genome in nucleotides, the number of genes, the
number of protein-coding genes, and what percentage of the genome is non-coding DNA.
What do you think some of the non-protein-coding genes might be?
4. At the bottom of the table, on the right, is a graphical representation of the entire circular
chromosome, and a map of the first 12,000+ bases of the chromosome, starting from
nucleotide number 1. What feature do you think is found at the first nucleotide of a
bacterial genome?
5. Below this is a link that reads "click here for Sequence Viewer presentation." Click
that link.
6. Once the Sequence Viewer loads, you should see a top panel with an overview window
of the entire chromosome, and a bottom panel showing the first 12,000+ bases. At the left
side of the top overview panel is a slider that defines the location of the lower window on
the chromosome. At the beginning of the laboratory, I will have assigned each group a
location on the chromosome to begin their analysis. Click and drag on the top, light-blue
part of the slider to position it over the location you have been assigned. Try to position
the left border of the slider as close as possible to your location, but it doesn’t have to be
exact to the single base. Note the location of the first base in the lower window, once you
have found your assigned location.
7. In the lower panel of the Sequence Viewer, genes are displayed as parallel green and red
bars. The green bar represents the RNA sequence of the coding segment of that gene, and
the red bar represents the polypeptide coded for by that coding segment. The white
arrows within the bars indicate the direction of transcription and translation. If you mouse
over the green bar (without clicking), an info box will pop up showing (among other
things) the number of the bases where the coding segment starts and stops, and the length
of the coding segment.
8. Now fill in the table on the second page of your report, with one line for each gene.
Start with the first gene that begins after your assigned location (in other words, skip
any gene that is only partially within the Sequence Viewer on the left side) and
continue for 20,000 bases. There are two ways to pan from left to right in the
Sequence Viewer: (1) click repeatedly on the right arrow by the zoom tool on the left side
of the Sequence Viewer, or (2) click and hold within the Sequence Viewer, and drag to the
left to pan to the right (as you would in Google Maps).
9. Once you have filled in as many lines as there are genes in your 20,000-base segment,
complete the calculations on the last line of the table.
10. Move your mouse to the upper-left corner of the data table and click on the little box that
appears with four tiny arrows in it. This should select the whole table. Copy and paste
this into an email to me (jbening@sbu.edu), and send it so I will have your data to
combine with data from other groups.
11. Now click on the Sequence button at the top left of the lower Sequence Viewer window.
This will bring up a Sequence View (positive strand) pop-up window. The numbers at the
left of each line give the base number of the first base on that line. Expand the Sequence
View window so there are 100 bases on each line.
12. At the top right of the Sequence View window are three buttons: up arrow, maximize,
and close (X). Click on the up arrow to minimize the Sequence View window. It should
now be just a grey bar, and you should be able to see the lower panel of the Sequence
Viewer that was behind it. There should also be a black bar at the top of the lower
panel, indicating the part of the chromosome displayed in the Sequence View. Does the
Sequence View window begin between two genes or within a gene?
13. Click the down arrow at the right of the minimized Sequence View window to bring that
window back up. Within that window, coding segments are shaded in pink, and every
other codon is underlined in blue. The blue letters underneath each line of nucleotide
sequence indicate which amino acid is coded for by that codon, using the single-letter
abbreviations for each amino acid.
14. Find the first gene beginning within the Sequence View window that is transcribed from
left to right. What is the first codon of the coding segment, and which amino acid is
coded for? Is this familiar? What is the last codon of the coding segment, and why is
there no amino acid coded for by it?
15. Now look at one of the genes that is transcribed from right to left. Does it start with
the same amino acid (at the bottom right end of the coding segment) as the genes that
are transcribed from left to right? What codon codes for it, and why is it different from
the codon in the genes that are transcribed from left to right?
16. For easier examination of genes that transcribe from right to left, you can hit the Flip
Strands button, which will show you the complementary strand of the DNA double helix,
reading it in the opposite direction. Those genes will then display as being transcribed
from left to right, and the nucleotide numbers will be negative.
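The relationship between the two strands described in steps 15 and 16 can be sketched in a
few lines of Python. This is a minimal illustration and not part of the Sequence Viewer:
the function name reverse_complement and the example sequence are hypothetical, and the
code assumes an uppercase sequence containing only A, C, G, and T.

```python
# Watson-Crick pairing for uppercase DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


# Arbitrary example sequence (not taken from the H. influenzae genome).
positive_strand = "GATTACA"
print(reverse_complement(positive_strand))  # TGTAATC
```

Applying the function to the positive-strand sequence of a right-to-left gene gives the
sequence as it is read in the direction of transcription, which corresponds to what the
Flip Strands button displays.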
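The arithmetic behind steps 3 and 9 can also be checked with a short script. The sketch
below is illustrative only: the gene coordinates are made-up placeholders rather than real
Haemophilus influenzae data, and the function name summarize_segment and the particular
statistics it reports (gene count, mean coding length, percent coding) are assumptions
about what the report table asks for.

```python
def summarize_segment(genes, segment_length=20_000):
    """Summarize coding segments given as (start, stop) base positions.

    Positions are 1-based and inclusive, as in the Sequence Viewer info box.
    Assumes the coding segments do not overlap one another.
    """
    lengths = [abs(stop - start) + 1 for start, stop in genes]
    total_coding = sum(lengths)
    return {
        "gene_count": len(genes),
        "mean_length": total_coding / len(genes) if genes else 0.0,
        "percent_coding": 100.0 * total_coding / segment_length,
        "percent_noncoding": 100.0 * (segment_length - total_coding) / segment_length,
    }


# Hypothetical example values, NOT real genome data.
example_genes = [(101, 1000), (1201, 2400), (2601, 4100)]
stats = summarize_segment(example_genes, segment_length=5_000)
print(stats)
```

For your own data, replace example_genes with the start and stop positions from your
report table and leave segment_length at 20,000.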