Genomics (BIO 294) Laboratory 1, Spring 2011

GOAL: To become familiar with the Entrez Sequence Viewer and collect some simple statistics on the Haemophilus influenzae genome.
URL: http://www.ncbi.nlm.nih.gov/sites/genome
PREPARATION: None
ASSIGNMENT: At the end of class, email me the completed laboratory report form.

Entrez Genome is the US National Center for Biotechnology Information (NCBI) central database listing for genome sequences. In this laboratory, we will explore the genome of the bacterium Haemophilus influenzae as a way to gain some familiarity with the Entrez Sequence Viewer and to begin thinking about genomes and the information in them.

1. From the Entrez Genome home page, click Prokaryota. This takes you to a table with over 1300 genomes of bacteria and archaea, listed in alphabetical order. The columns in the table give a variety of information about each genome, including Size (in megabases) and the date when each sequence was Released.

2. We will be exploring the first genome of a cellular organism to be completely sequenced: that of Haemophilus influenzae. Scroll down until you find the Haemophilus influenzae genome with the earliest release date. On your report sheet, note when its sequence was released. Now click on the NC ID for that genome (in the eleventh column of the table).

3. This takes you to a page for that genome, with some summary statistics. On your report, note the length of the genome in nucleotides, the number of genes, the number of protein-coding genes, and what percentage of the genome is non-coding DNA. What do you think some of the non-protein-coding genes may be?

4. At the bottom of the table, on the right, is a graphical representation of the entire circular chromosome, and a map of the first 12,000+ bases of the chromosome, starting from nucleotide number 1. What feature do you think is found at the first nucleotide of a bacterial genome?

5. Below this, it says click here for Sequence Viewer presentation.
Click there.

6. Once the Sequence Viewer loads, you should see a top panel with an overview window of the entire chromosome, and a bottom panel showing the first 12,000+ bases. At the left side of the top overview panel is a slider that defines the location of the lower window on the chromosome. At the beginning of the laboratory, I will have assigned each group a location on the chromosome to begin their analysis. Click and drag on the top, light-blue part of the slider to position it over the location you have been assigned. Try to position the left border of the slider as close as possible to your location, but it doesn’t have to be exact to the single base. Once you have found your assigned location, note the location of the first base in the lower window.

7. In the lower panel of the Sequence Viewer, genes are displayed as parallel green and red bars. The green bar represents the RNA sequence of the coding segment of that gene, and the red bar represents the polypeptide coded for by that coding segment. The white arrows within the bars indicate the direction of transcription and translation. If you mouse over the green bar (without clicking), an info box will pop up showing (among other things) the base numbers where the coding segment starts and stops, and the length of the coding segment.

8. Now fill in the table on the second page of your report, with one line for each gene, starting with the first gene that begins after your assigned location (in other words, skipping any genes that are only partially within the Sequence Viewer on the left side) and continuing for 20,000 bases. There are two ways to pan from left to right in the Sequence Viewer: (1) click repeatedly on the right arrow by the zoom tool on the left side of the Sequence Viewer, or (2) click and hold within the Sequence Viewer, and drag to the left to pan to the right (as you would within Google Maps).

9.
Once you have filled in as many lines as there are genes in your 20,000-base segment, complete the calculations on the last line of the table.

10. Move your mouse to the upper-left corner of the data table and click on the little box that appears with four tiny arrows in it. This should select the whole table. Copy and paste the table into an email to me (jbening@sbu.edu), and send it so I will have your data to combine with data from other groups.

11. Now click on the Sequence button at the top left of the lower Sequence Viewer window. This will bring up a Sequence View (positive strand) pop-up window. The numbers at the left of each line give the base number of the first base on that line. Expand the Sequence View window so there are 100 bases on each line.

12. At the top right of the Sequence View window are three buttons: up arrow, maximize, and close (X). Click on the up arrow to minimize the Sequence View window. It should now be just a grey bar, and you should be able to see the lower panel of the Sequence Viewer that was behind it. There should now be a black bar at the top of the lower panel, indicating the part of the chromosome displayed in the Sequence View. Does the Sequence View window begin between two genes or within a gene?

13. Click the down arrow at the right of the minimized Sequence View window to bring that window back up. Within that window, coding segments are shaded in pink, and every other codon is underlined in blue. The blue letters underneath each line of nucleotide sequence indicate which amino acid is coded for by that codon, using the single-letter abbreviation for each amino acid.

14. Find the first gene beginning within the Sequence View window that is transcribed from left to right. What is the first codon of the coding segment, and which amino acid is coded for? Is this familiar? What is the last codon of the coding segment, and why is there no amino acid coded for by it?

15.
Now look at one of the genes that is transcribed from right to left. Does it start with the same amino acid (at the bottom right end of the coding segment) as the genes that are transcribed from left to right? What codon codes for it, and why is it different from the codon in the genes that are transcribed from left to right?

16. For easier examination of genes that are transcribed from right to left, you can hit the Flip Strands button, which will show you the complementary strand of the DNA double helix, read in the opposite direction. Those genes will then display as being transcribed from left to right, and the nucleotide numbers will be negative.
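
What Flip Strands does to the displayed sequence in step 16 can be sketched in a few lines of Python. This is only an illustration and says nothing about how the Sequence Viewer itself is implemented. The 12-base sequence is invented (it is not taken from the Haemophilus influenzae genome), and the function names reverse_complement and codons are made up for this example.

```python
# Toy illustration of "Flip Strands": show the complementary strand,
# read in the opposite direction (the reverse complement).
# ASSUMPTION: the sequence below is invented for the example and is
# not part of the Haemophilus influenzae genome.

# Translation table pairing each base with its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read in the opposite direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq):
    """Split a sequence into consecutive three-base groups, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


positive_strand = "GATTACAGGCTT"            # hypothetical bases as first displayed
flipped = reverse_complement(positive_strand)

print("positive strand:", positive_strand, codons(positive_strand))
print("flipped strand: ", flipped, codons(flipped))

# Flipping twice returns the original sequence.
print(reverse_complement(flipped) == positive_strand)
```

To check your own reading of a right-to-left gene, replace the toy string with a stretch of bases copied from the Sequence View window and compare the flipped output with what the viewer shows after you click Flip Strands.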