Bio 2 Spring, 2009
Genes and Disease Projects
Bioinformatics II: Examining Gene Structures

Assignment due next activity: Complete diagrams that show:

3. Description of the Normal Gene and Its Product
   a. Gene location:
      i. Chromosome number and location on chromosome
   b. Gene structure:
      i. Diagram of exon/intron sizes and positions
      ii. Promoter and regulatory sequence information (optional)
   c. Transcript structure:
      i. Total length
      ii. Length of 5’ and 3’ untranslated regions (UTR)
   d. Translation product:
      i. Number of amino acids
      ii. Mature polypeptide size
      iii. Modifications (glycosylation, phosphorylation, etc.)
   If possible, include “e” also:
   e. Mature protein structure:
      i. How many subunits comprise the mature protein?
      ii. How are the subunits organized?
      iii. Is the protein structure known? If so, the structure can be provided.

Last week we walked through the steps for retrieving mRNA and protein sequences and began to define the features of those sequences listed above. These features are also listed in the handout Genes and Disease IV, which describes what you are to include on your poster. At the end of that exercise you should have learned how to:

1. Determine the length and structure of the mRNA encoded by the normal gene that is affected in the disease you are studying.
2. Determine the primary structure of the polypeptide encoded by the gene you are studying, including its length (total number of amino acids).

This week we will determine the structure of the gene itself, to learn more about the gene you are studying and possibly more about the structure and function of the polypeptide it encodes.

Let’s continue with the retinoblastoma example.

Access NCBI (http://www.ncbi.nlm.nih.gov/) and go to the RB1 page that we accessed last time: enter retinoblastoma in the nucleotide search window, then click on RB1 in humans.

   a. The first window shows us the gene structure.
      Now that we’ve looked a bit more at transcription, you should understand the diagram below better.
   b. Move your cursor over the links around the map as we did last time. Note that NC_000013.9 leads to gene information. This will show you the length of the gene itself. The RB1 gene is huge!
   c. To uncover more information about the protein, click on the protein links. You will find links to the protein (i.e., polypeptide) sequence and other information such as conserved domains. Follow the conserved domains link and see what you can learn from it.
   d. At the polypeptide sequence page you will find information about the polypeptide itself, such as whether the mature protein differs from the original translation product. We’ll talk more about how this occurs in class.
   e. To determine the predicted size of the polypeptide, we can use any one of many bioinformatics sites. For example, we can use the ExPASy site: http://ca.expasy.org/tools/#proteome.
      i. This is the portion of their bioinformatics site with tools for studying proteins. The term proteomics refers to the study of proteins on a genomic scale. We can use these tools to study your protein.
      ii. Click on “Primary Structure Analysis”.
      iii. Click on “Compute pI/MW”.
      iv. Copy and paste your sequence (in FASTA format) into the box.
      v. Submit the sequence. The results will be returned quickly.
   f. Now, each of you should begin the search for your gene and more.
      i. Ask questions as you go!
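
If you would like to cross-check the size reported in step e, the short Python sketch below estimates the number of amino acids and the molecular weight of a polypeptide directly from a FASTA sequence. It is only a rough illustration and not necessarily the exact method the web tool uses: it adds up standard average residue masses plus one water molecule, and it ignores any modifications or processing of the mature protein. The function names and the example peptide are made up for this illustration; paste in your own sequence instead.

```python
# Average residue masses in daltons (each amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def read_fasta(text):
    """Return the sequence from a single-record FASTA string (header line skipped)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return "".join(l for l in lines if l and not l.startswith(">")).upper()


def molecular_weight(seq):
    """Estimate the average molecular weight (in daltons) of an unmodified polypeptide."""
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        # Ambiguity codes such as X or B have no single mass, so stop here.
        raise ValueError("Unrecognized residue codes: " + ", ".join(unknown))
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Hypothetical 20-residue peptide, used only to show the calls.
example = """>hypothetical_peptide made-up sequence for illustration only
MKTAYIAKQR
QISFVKSHFS
"""

seq = read_fasta(example)
print(len(seq), "amino acids")
print(round(molecular_weight(seq) / 1000, 2), "kDa")
```

Expect small differences from the web result due to rounding of the mass values. A large difference usually means that the sequence was pasted incorrectly or that the mature protein differs from the original translation product, as discussed in step d.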