NAME ____________________________________________

GENETICS (BIO 306)

RETRIEVING INFORMATION FROM BLAST SEARCH RESULTS

Supply the following information about your fly genes. You may submit a separate document for each gene, or include both in one document.

1. Write the name of your fly gene.

2. In your own words, describe the putative normal function of the gene (i.e., of the wild-type allele). Include a properly formatted reference for the journal article that describes this function (see links on the assignment website for proper formatting guidelines).

3. Conduct a nucleotide BLAST search using your wild-type gene sequence. Exclude Drosophila (taxid:7215).

   [Note: If your initial search returns an error message of “No significant similarity found,” repeat the search using discontiguous megablast rather than megablast as your search algorithm. Some genes (e.g., white) require that you increase the number of results (try 500) in order to get a useful result.]

   Write the scientific and common names of the top non-Drosophila hit obtained from the BLAST search.

   [Note: Sometimes the common name will not be readily available. If not, click on the link from the list of hits, and once the locus information comes up, click on the link next to the word "ORGANISM". This will take you to the taxonomy of the critter, and the common name will be listed.]

4. Write the query cover %, E-value, and % identity for the top non-Drosophila hit.

5. Using the Graphic Summary, briefly describe the pattern of overlap between the query sequence and the top non-Drosophila hit. Does the entire hit show strong similarity to the query sequence? If not, which parts of the hit do (provide approximate ranges of nucleotides)?

6. Run a second BLAST search.
   In the “Choose Search Set” window, under “Organism,” include the following genetic model organisms (you can just copy-paste the information in the left column below to the Organism window, using the “+” button to add lines, one per organism):

      Escherichia coli (taxid:562)            [bacterium]
      Saccharomyces cerevisiae (taxid:4932)   [yeast]
      Caenorhabditis elegans (taxid:6239)     [nematode]
      Arabidopsis thaliana (taxid:3702)       [rock cress (plant)]
      Danio rerio (taxid:7955)                [zebrafish]
      Mus musculus (taxid:10090)              [mouse]
      Homo sapiens (taxid:9606)               [human]

   [Note: You may need to use the blastn algorithm rather than megablast or discontiguous megablast in order to obtain any hits.]

   In the table below, provide the percent identity, percent coverage, and E-value for each model organism.

      Organism                    Percent coverage   Percent identity   E-value
      Escherichia coli            ________________   ________________   ________
      Saccharomyces cerevisiae    ________________   ________________   ________
      Caenorhabditis elegans      ________________   ________________   ________
      Arabidopsis thaliana        ________________   ________________   ________
      Danio rerio                 ________________   ________________   ________
      Mus musculus                ________________   ________________   ________
      Homo sapiens                ________________   ________________   ________

   What organism yields the top hit to your Drosophila gene? Is this result what you expected? Why or why not?

   What is the identity (title/description) of the top-hit gene? Does the top hit appear to have a biologically similar function to your Drosophila gene? Why or why not?

7. Now, obtain the amino acid translation of your gene sequence from the GenBank record for your sequence, and rerun the BLAST search with the same model organisms as above, but this time use a protein BLAST (blastp) search. Do the results (e.g., top hit organism, % coverage, % identity, and E-value) differ significantly from the results of the nucleotide BLAST search? Does your conclusion change about whether the top hit appears to have a biologically similar function to your Drosophila gene? Why do you think this might be the case?

Submit your answers to the appropriate dropbox as a Word file (.doc or .docx).
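
[Optional note: The organism settings in steps 3 and 6 can also be written as Entrez query strings, which may be useful if you ever script a BLAST search instead of using the web form. The Python sketch below is only an illustration and is not required for the assignment. It assumes the `txidNNNN[ORGN]` filter syntax accepted by the BLAST "Entrez Query" field; the function names `include_filter` and `exclude_filter` are hypothetical helpers, not part of any library. The taxonomy IDs are the ones listed in this worksheet.]

```python
"""Build Entrez organism filters for the two searches in this worksheet."""

# Model organisms from step 6, keyed by NCBI taxonomy ID (copied from the worksheet).
MODEL_ORGANISMS = {
    562: "Escherichia coli",
    4932: "Saccharomyces cerevisiae",
    6239: "Caenorhabditis elegans",
    3702: "Arabidopsis thaliana",
    7955: "Danio rerio",
    10090: "Mus musculus",
    9606: "Homo sapiens",
}

# Drosophila taxonomy ID from step 3.
DROSOPHILA_TAXID = 7215


def include_filter(taxids):
    """Return a filter that keeps only hits from the given taxonomy IDs."""
    return " OR ".join(f"txid{taxid}[ORGN]" for taxid in taxids)


def exclude_filter(taxid):
    """Return a filter that drops every hit from the given taxonomy ID.

    Assumption: the 'all[filter] NOT ...' form is accepted as an Entrez query.
    """
    return f"all[filter] NOT txid{taxid}[ORGN]"


if __name__ == "__main__":
    # Step 3: everything except Drosophila.
    print(exclude_filter(DROSOPHILA_TAXID))
    # Step 6: only the seven model organisms.
    print(include_filter(MODEL_ORGANISMS))
```

If you use Biopython, a string like these could be supplied as the `entrez_query` argument of `NCBIWWW.qblast`; the sketch stops short of running a search, so check the filter against the web form results before relying on it.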