Biology 3492    Spring 2008    Laboratory 1-2
----------------------------------------------------------------
The Tetrahymena genome database (TGD): getting your gene’s sequence
----------------------------------------------------------------

This semester we will be working together to study the membrane skeleton of the ciliated protozoan Tetrahymena thermophila. The membrane skeleton is a protein structure just beneath the plasma membrane and is likely essential for maintaining cell shape and regulating cell functions such as exocytosis. As a starting point, we will use a proteomic analysis performed by Prof. Jerry Honts (Drake University), which identified about two dozen protein components of the membrane skeleton (see tables on following pages). We will be his collaborators, studying these proteins during the course. We will use cell and molecular biology techniques to demonstrate that the proteins identified by Dr. Honts really are part of the membrane skeleton, also called the “epiplasm”. You will each be assigned one gene to characterize during the course.

We will use the Tetrahymena genome database (www.ciliate.org) to find our genes’ sequences, download them to files for use throughout the course, and design oligonucleotide primers to amplify and clone the genes from Tetrahymena genomic DNA. We will also design PCR primers for each gene to perform RT-PCR to determine whether and when these genes are expressed. In this first week of class, we will retrieve our gene sequences and design the primers for this work, so each of you will design primers for your assigned gene for both cloning and expression analysis. We will use the Primer3 program (http://frodo.wi.mit.edu/), which selects well-matched primer pairs for PCR. We will also become familiar with the program Gene Construction Kit (GCK), which we will use to analyze and display our genes’ sequences.
A graphic representation of a DNA sequence is an important part of designing experiments in molecular biology. You need to be familiar with the overall structure, layout, and features of the DNA of interest to plan future studies.

Steps:

1. Getting your gene sequence.
   Go to TGD and download the sequence of your gene of interest.
   -- Download your:
      1) coding sequence
      2) ORF translation
      3) a region of at least 5 kbp spanning your gene sequence (larger if this does not include your whole gene)
   -- Paste each sequence into a Word file (save your work)
   -- Paste your genomic sequence (3) into a GCK file
   MAKE SURE TO SAVE YOUR WORK

2. Annotating your gene sequence.
   a. Color your coding region BLUE.
   b. Find your introns. Compare your genomic sequence to your coding sequence using the EMBOSS alignment website (http://www.ebi.ac.uk/emboss/align/). The introns will be the gaps in the coding sequence.
   c. Go to the TGD genome browser and look for available cDNA sequences. Download these and use them to confirm the predicted introns. Color predicted introns RED and confirmed introns GREEN.
   d. Mark the following restriction sites on your sequence: ApaI, BamHI, BglII, BsrGI, EcoRI, EcoRV, HincII, HindIII, NotI, PstI, SacI, SalI, ScaI, XbaI, XhoI

3. Designing oligos to clone your gene sequence.
   To get started, you first need to PCR amplify your desired coding sequence and clone it into an entry vector such as pENTR-D. When using the pENTR-D-TOPO cloning kit, 4 nucleotides (CACC) must be added to the primer in frame with the ATG start codon, together with the six nucleotides upstream of the ATG start codon, which are included to ensure good translation of the mRNA. This will allow directional cloning into the pENTR-D vector.

   your favorite gene (yfg): PCR amplify and clone into pENTR-D-TOPO

                     yfg coding sequence
   CACC XXX XXX ATG........................................GATATC
        -6      +1

   a. Paste your coding sequence into the Primer3 dialog box. Set the size of your PCR product to fall between the length of your coding sequence minus 50 bases and the full length of your coding sequence.
   b. Adjust parameters as needed to get appropriate oligos. The upstream oligo must start at -6 and have CACC added to its 5’ end. The downstream oligo must start at the last base of the last codon and be designed antisense to the gene. GATATC should be added to its 5’ end.
   c. To design primers for expression analysis, locate an intron in your gene if possible and design the primers to span it. Paste this region of your sequence into the Primer3 dialog box. Set the size range of your product to between 170 and 250 bp (not counting the intron). Select primers.
   d. Copy all primer sequences into your Word file and email it to Prof. Chalker (dchalker@wustl.edu).
   MAKE SURE TO SAVE YOUR WORK

Accessing the Bio3492 folder on the NSLC server (used to turn in computer files):
   In ‘Finder’ select ‘GO’: ‘Connect to Server’
   Select ‘Local’ folder: select ‘NSLC Server’: ‘Connect’: log on as ‘biol3492’, password: ‘epiplasm’
   Create a folder of your files on your computer’s desktop; before ending, be sure to click and drag your folder into the Bio3492 folder on the server for future use.
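The tail rules in step 3 (CACC on the upstream oligo starting at -6, GATATC on the antisense downstream oligo) can be sketched in a few lines of Python as a way to double-check the oligos you order. This is only an illustration under stated assumptions: the function names, the `anneal_len` parameter, and the toy sequence are hypothetical and not part of Primer3 or GCK, and the actual annealing length should come from Primer3. It also assumes the annealing regions do not cross an intron in the genomic DNA.

```python
# Minimal sketch: add the cloning tails described in step 3 to a primer pair.
# All names here are hypothetical helpers, not Primer3 or GCK functions.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def cloning_primers(upstream6, cds, anneal_len=20):
    """Build the upstream and downstream cloning oligos.

    upstream6  -- the six genomic nucleotides immediately 5' of the ATG (-6 to -1)
    cds        -- coding sequence, starting at ATG and ending at the last codon
    anneal_len -- assumed length of the gene-specific part of each oligo
    """
    if len(upstream6) != 6:
        raise ValueError("upstream6 must be exactly six nucleotides")
    if not cds.upper().startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")

    # Upstream oligo: CACC + the six bases before the ATG + start of the gene.
    forward = "CACC" + upstream6.upper() + cds.upper()[: anneal_len - 6]

    # Downstream oligo: antisense, starting at the last base of the last
    # codon, with GATATC added to its 5' end.
    reverse = "GATATC" + reverse_complement(cds[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Toy sequence for illustration only; not a real Tetrahymena gene.
    toy_cds = "ATGAAACCCGGGTTTAAACCCGGGTTTTGA"
    fwd, rev = cloning_primers("TTTAAA", toy_cds, anneal_len=12)
    print("upstream oligo:  ", fwd)
    print("downstream oligo:", rev)
```

Comparing the printed oligos against the ones you assembled by hand is a quick way to catch a tail that was left off or a downstream oligo that was not reverse-complemented.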
Some GCK function commands you will need:

Make codon table for Tetrahymena
   Open file; Construct: features: edit codon table: new codon table; change the TAA and TAG stop codons to Gln (glutamine); name it ‘Tetrahymena’; ‘OK’

Select codon table for Tetrahymena
   Open file; Construct: features: select codon table: select ‘Tetrahymena’; ‘OK’

To show numbered coordinates of the sequence
   Open file; Construct: Display: Show Positions

To group sequence for friendly display
   Select sequence; Format: Grouping: (select ‘by threes’ for protein sequence, ‘by tens’ for others)

To color a block of sequence
   Select sequence; Format: color: (select color)

To designate a region
   Select sequence; Construct: features: Make region; type in a name, designate as protein if desired

To change a region’s arrow display to a line or a different arrow
   Select region; Format: lines: (select line type)

To mark restriction sites
   Deselect all sequence; Construct: features: mark sites; in the popup window, select enzyme names one at a time and ‘add’ to list; click ‘OK’

To mark a particular position (e.g. to indicate an oligo position)
   Place the cursor at the site; Construct: features: mark location

To insert a particular sequence into an existing file (e.g. to indicate an oligo position)
   Select the sequence to be inserted and copy it
   Place the cursor at the site where the sequence should be inserted and paste
   Notes: If a region of sequence is highlighted in the target file, the highlighted sequence will be replaced during the paste
          A sequence can be inverted using ‘special paste’

To space restriction site markers
   Select sites in the graphic display; Format: Sitemarkers: automatic arrangement (or just drag individually)

To generate a list of restriction sites in your sequence
   Deselect any sequence; Construct: features: list sites; in the popup window, select enzyme names one at a time (or ‘add all’) and ‘add’ to list; click ‘OK’

Database and analysis tools links

Tetrahymena Genome Database
   www.ciliate.org
Eric Cole’s biology of Tetrahymena website
   http://www.stolaf.edu/people/colee/
Sequence analysis tools at EMBL-EBI
   http://www.ebi.ac.uk/Tools/sequence.html
EMBOSS pairwise alignments
   http://www.ebi.ac.uk/emboss/align/
ClustalW sites for multiple sequence alignments
   http://www.ebi.ac.uk/clustalw/
   http://clustalw.genome.jp/ [Easier to download trees]
NCBI Blast homepage
   http://www.ncbi.nlm.nih.gov/BLAST/
Pfam database of conserved protein motifs
   http://pfam.wustl.edu/
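The codon-table change described in the GCK commands (TAA and TAG read as Gln rather than stop) can also be checked outside GCK. The following is a minimal Python sketch, not a GCK feature: the table and function names are hypothetical, and the test sequence is a toy example. It builds the standard codon table, reassigns the two codons, and translates a sequence with each table so you can see why the ORF translation of a Tetrahymena gene looks truncated under the standard table.

```python
# Minimal sketch: translate a coding sequence with a Tetrahymena codon table
# in which TAA and TAG encode glutamine (Q); TGA remains the stop codon.
# Names are hypothetical; this is not part of GCK or TGD.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]

STANDARD_TABLE = dict(zip(CODONS, STANDARD_AA))

# Tetrahymena table: same as standard except TAA and TAG -> Gln.
TETRAHYMENA_TABLE = dict(STANDARD_TABLE)
TETRAHYMENA_TABLE["TAA"] = "Q"
TETRAHYMENA_TABLE["TAG"] = "Q"


def translate(cds, table):
    """Translate a coding sequence codon by codon, stopping at the first stop."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = table[cds[i : i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    toy = "ATGTAAGGTTAGTGA"  # toy sequence for illustration only
    print("standard table:   ", translate(toy, STANDARD_TABLE))
    print("Tetrahymena table:", translate(toy, TETRAHYMENA_TABLE))
```

If the translation of your downloaded coding sequence under the Tetrahymena table matches the ORF translation from TGD, the sequence and reading frame were copied correctly.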