Identifying Polymorphisms in the Steroid Biosynthesis Pathway of Soybean

The Steroid Biosynthesis pathway currently has 40 genes matching soybean EST sequences. This pathway is located at:
http://www.genome.jp/dbget-bin/www_bget?path:egma00100

YOUR ASSIGNED GENE IS EC# _________ (the example given is for EC 1.17.1.2)

STEP 1
Click on the hyperlink above to go to this pathway. Find your assigned gene in the pathway (look for green highlighted boxes) and click on it to view the sequences identified in soybean.

This representation of the steroid biosynthesis pathway is from the KEGG database. Green highlighted boxes are genes identified in soybean, while un-highlighted boxes are genes found in organisms other than soybean. All of the soybean sequences listed in this pathway come from transcribed genes (ESTs are sequenced from mRNA that has been copied into cDNA).

STEP 2
Look for “Other DBs” and select the “alignment” link (shown in red at the right).
Count the number of records: ______ (example: 17)

Transcribed genes are usually sequenced at random, so the number of records is a reasonable indicator of the level of transcription of a particular gene: the more records, the higher the transcription level. If you have only one sequence, the alignment option will not be available; go to STEP 5.

STEP 3
At the top of this page is a link to the “Contig Graphical Alignment View”. Follow this link to the next page.

STEP 4
Select the left-most (top) sequence in the Contig Graphical Alignment View and follow the link to NCBI.
Example Sequence ID: 58020548
Your Sequence ID: _______________

This alignment shows all of the sequences from soybean identified for this gene. By selecting the left-most sequence you get the mRNA with the most 5’ transcribed sequence.

STEP 5
Find the sequence length (circled in red) and subtract 100 from it below. Enter this value after the “50” below.
Example: Length of sequence 933 - 100 = 833     50, 833
Enter for YOUR EC #: ___________
Length of sequence ___ - 100 = ____     50, ______

You are selecting the region that will be amplified during PCR.
The "50" entered above marks the first 50 base pairs of the gene. You will use this information to design a forward primer located in the first 50 bp and a reverse primer located in the last 100 bp of your sequence. This gives you the greatest chance of detecting a size polymorphism using PCR.

STEP 6
Change the display type to FASTA (the display menu is circled in red; the page will change automatically). Copy the sequence in FASTA form by selecting the text on the page as shown.

This is the nucleotide sequence of your gene as transcribed. The corresponding (un-transcribed and un-modified) DNA located on a soybean chromosome will be the target of your PCR reaction.

STEP 7
Go to the “Primer 3” program and paste your sequence in the main window as shown:
http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi
Exclude the middle of the sequence by typing 50,833 (in this case the sequence is 933 bp long, so 50,833 is appropriate). Use the value you computed for your sequence in STEP 5. Select “PICK PRIMERS”.

STEP 8
If the program was able to design primers, simply copy and paste them into the report page at the end of this document. If the program could not design primers, select a smaller target (and thus a larger region for primer design) by increasing the left side exclude value and decreasing the right side exclude value in STEP 7 by 25 bp increments.

REPORT PAGE
Pathway: Steroid Biosynthesis
Enzyme EC number: ___________
Student's Name: ___________
Date: ___________
# of records: _____ (STEP 2)
Product Size: _____ (STEP 7)
Sequence source: ________________ (STEP 4)

Paste Primer 3 program data here:

Sequences to order: (GmaxEC# _ your initials _ L or R; L for left, R for right)
Example:
Name:      GmaxEC1.17.1.2_JS_L       GmaxEC1.17.1.2_JS_R
Sequence:  GATACTGCAACTGGCAACATTT    CATGGGTGCAAAGGTATGAA
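
The arithmetic in STEP 5, the fallback in STEP 8, and the naming pattern for the sequences to order can be checked with a short script. This is only a sketch for double-checking your worksheet entries: the function names (target_region, shrink_target, primer_names) are made up for illustration and are not part of Primer 3 or NCBI, and the script does not design primers.

```python
def target_region(seq_length, left=50, right_margin=100):
    """STEP 5: return the pair of values to type in, e.g. (50, 833) for 933 bp."""
    right = seq_length - right_margin
    if right <= left:
        raise ValueError("sequence is too short for these margins")
    return left, right


def shrink_target(left, right, step=25):
    """STEP 8 fallback: raise the left value and lower the right value by one step."""
    new_left, new_right = left + step, right - step
    if new_right <= new_left:
        raise ValueError("no target region left to shrink")
    return new_left, new_right


def primer_names(ec_number, initials):
    """Build the left and right primer names in the worksheet's pattern."""
    base = f"GmaxEC{ec_number}_{initials}"
    return f"{base}_L", f"{base}_R"


# Worked example from the worksheet: a 933 bp sequence for EC 1.17.1.2.
left, right = target_region(933)
print(f"{left},{right}")                 # 50,833
print(shrink_target(left, right))        # (75, 808)
print(primer_names("1.17.1.2", "JS"))
```

If Primer 3 still cannot design primers after one adjustment, call shrink_target again on the new pair to apply another 25 bp increment.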