Functional Genomics Bioinformatics Instructions and Worksheet

Save this document to your desktop and complete it on your computer!

Complete this worksheet in MS Word. If you have it in print, open it at http://www.dnai.org/media/bioinformatics/ccli/yfptagging/functionalgenomics_ws.doc. If you opened this document in an Internet browser, click File, click Save as, and save it to a directory on your C- or A-drive. Then close the browser, open the document in MS Word, and follow the instructions to answer the questions. In doing so, you will familiarize yourself with bioinformatics routines such as locating and extracting information and sequences for genes and proteins from databases.

Selecting a sequence for gene tagging

Fit this window into the upper left quadrant of your computer screen.
Fit the FTFLP site at Stanford into the upper right quadrant of your screen.
At the FTFLP web site, click Target Selection.
Scroll down to Table 2.
Find Category 2 and open the tab-delimited list to access a listing of 4000 short-listed A. thaliana genes of unknown function available for tagging.
Study the information for the genes and select one for further analysis.
Record the information provided for this locus:
o _________________________________________________________________
o _________________________________________________________________
o _________________________________________________________________
o _________________________________________________________________
Contact CSHL for further instructions on how to order sequences and how to avoid using sequences that are already being tagged by other labs. (The genes tagged during the course are At1g08380 and At1g08480. Gene At3g16240 has been tagged previously and is being used as a control. Primer sequences can be found here.)
To learn how to find more information about a gene, go to the next slide.

Locate a gene in the A. thaliana genome

Fit this window into the upper left quadrant of your computer screen.
Highlight and copy the Locus name (AT#) for a gene (e.g. At3g16240 for the lab control).
Fit NCBI’s Map Viewer into the lower right quadrant of your screen.
In Map Viewer, click A. thaliana.
Paste the Locus name into the Search for window and click Find.
The sequence locus is flagged with a red tag. To view the chromosome, click the number beneath.
Zoom in and determine what genes are located in the same region.
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
_________________________________________________________________
Increase the resolution until you can identify the structure of the gene.
Click sv to view the gene sequence. Discern introns and exons.

Find sequence information for a gene

Fit this window into the upper left quadrant of your computer screen.
Highlight and copy the Locus name (AT#) for a gene.
Fit the TAIR site into the lower left quadrant of your screen.
Under Advanced Search, click Genes.
Paste the Locus name into the Search by name window and click submit query.
Click the link underneath Locus. You can access the sequence information for the coding sequence (CDS), the genomic DNA, and the deduced protein sequence.
To examine the protein, go to the next slide.

Find information about the protein product of a gene

Fit this window into the upper left quadrant of your computer screen.
Highlight and copy the Locus name (AT#) for a gene.
Fit the TAIR site into the lower left quadrant of your screen.
On the TAIR site, find Advanced Search and click Proteins.
Paste the Locus name into the Search by name window and click submit query.
Click the link underneath Name. The result page contains links to SwissProt, GenPept, and NCBI BLink (3D-structures).
To analyze the protein domains, highlight the amino acid sequence and analyze it at the SMART website.
Also follow the link to The Arabidopsis Subcellular Proteomics Database.

Find expression data for a tagged gene

Tagged genes are being transferred into A. thaliana plants using Agrobacterium tumefaciens. Transformed plantlets are assayed for gene expression.
To view the results, click Browse Images on the FTFLP website.
Highlight All_localization and click Go.
Click a Gene Name for the sequence of the tagged locus.
Click the small green camera icon for a fluorescence microscopy image of a transformant; an Image-Description explains what you see.
Match the results (columns) for a tagged gene and associate them with the biology of the protein and/or the organelle in which it is localized.

Find the gene model for a tagged gene

Tagged genes are being transferred into A. thaliana plants via Agrobacterium-mediated transformation. Transformed plants are assayed for gene expression.
To view the results, click Browse Images on the FTFLP website.
Highlight All_localization and click Go.
Click a Gene Name for the sequence of the tagged locus.
What do black, blue, and red letters denote? Can you discern any pattern that could be used to detect splice sites (exon/intron borders)?
_______________________________________________________________

Search databases for proteins that confer the ability to alter light by searching for proteins with amino acid sequences similar to that of GFP.

Highlight and copy this amino acid sequence.
Open the NCBI Internet site, click BLAST, find Protein, and click Protein-protein BLAST (blastp).
Paste the sequence into the window and click BLAST.
Record the request ID ________________________________ .
Click Format!
The E Value is the most meaningful indicator for the quality of a hit; the lower the E Value, the better the hit. Usually, E Values of less than 0.1 indicate meaningful hits.
(For further explanations, click the link to Blast FAQ in the upper part of the NCBI Blast result page.)
Examine the titles of some of the search hits and the related GenBank database entries using the links labeled gi, and determine what proteins the search yielded.
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________

Search databases for proteins that confer the ability to alter light by searching for proteins with domains similar to those in GFP.

Highlight and copy this amino acid sequence.
Open the NCBI Internet site, click BLAST, find Protein, and click Search by domain architecture (cdart).
Paste the sequence into the window and click Search.
Click the red bar labeled GFP.
Click "of the most diverse members" and change it to "select from list".
Search the column titled Definition for proteins that alter the color of light to something other than green. What other colors are listed?
Click on the PDB-ID/gi entry for a representative of each of the different types of fluorescent proteins (FP) and try to view the structures by clicking View 3D Structure.
How do the structures differ for fluorescent proteins that emit light of different colors?
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________

Search databases for proteins that confer the ability to alter light by searching for proteins with structures similar to GFP.

Open the NCBI Internet site.
Click Entrez and change it to Structure.
Search Structure for 1C4F, then click 1C4F.
Click the pink bar labeled Chain A.
Click Medium redundancy and change it to All sequences.
Change Graphics to Table.
Click List.
Check one representative of each: cyan, red, green, and yellow fluorescent protein, then click View 3D Structure.
Click Style, click Coloring Shortcuts, and select Aligned.
Are the capabilities of the proteins to differentially alter light reflected in structural differences?
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
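
Optional aside on the blastp step earlier in this worksheet: the E Value rule of thumb given there (hits with E Values below 0.1 are usually meaningful) can also be applied to a saved hit list instead of reading the result page by eye. The following is only a minimal sketch. It assumes the hits were saved in BLAST's standard 12-column tab-delimited layout, in which the subject ID is column 2 and the E Value is column 11. The function name meaningful_hits and the three sample rows (hitA, hitB, hitC) are hypothetical and are not real search results.

```python
import csv
import io

# Rule-of-thumb cutoff quoted in the worksheet; adjust as needed.
E_VALUE_CUTOFF = 0.1


def meaningful_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, e_value) pairs with e_value below cutoff, best hit first.

    Assumes the standard 12-column tab-delimited BLAST layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines that start with '#'.
        if not row or row[0].startswith("#"):
            continue
        subject_id = row[1]
        e_value = float(row[10])
        if e_value < cutoff:
            hits.append((subject_id, e_value))
    # The lower the E Value, the better the hit, so sort ascending.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical rows for illustration only (not real BLAST output).
SAMPLE = "\n".join([
    "query\thitA\t98.0\t238\t4\t0\t1\t238\t1\t238\t1e-170\t480",
    "query\thitB\t35.0\t200\t120\t5\t10\t209\t12\t211\t2e-20\t95.0",
    "query\thitC\t28.0\t50\t36\t0\t100\t149\t30\t79\t3.5\t28.0",
])

if __name__ == "__main__":
    for subject_id, e_value in meaningful_hits(SAMPLE):
        print(subject_id, e_value)
```

With the sample rows, hitC (E Value 3.5) is dropped and the two remaining hits are listed with the lowest E Value first. The cutoff is passed as a parameter so that a stricter threshold can be tried without editing the function.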