Determining the sequence of your PCR product

The PCR product you amplified from the transformed E. coli is probably part of the GFP gene. Because you used primers specific to GFP, this PCR product is more than likely GFP. However, can you be 100% certain that this product is GFP? What would you need in order to be certain that this amplified DNA is from the GFP gene, and not just a random piece of DNA that happened to amplify with your GFP primers?

Your teacher will send the remainder of the PCR to a sequencing facility. Generally the results are back in one to two days. The sequencing facility will use only one of the primers. Why do they need only one of the primers?

Read and watch the following to learn how sequencing works:
http://www.nature.com/scitable/topicpage/the-order-of-nucleotides-in-a-gene-6525806
http://www.youtube.com/watch?v=AV35C36bBto

What is the purpose of the primer?
What is the purpose of the dideoxynucleotides?
What enzyme is used to carry out the reaction?
Why do you need a much greater amount of deoxynucleotides than dideoxynucleotides for the sequencing reaction?
What is used to “see” the different-sized pieces?

What is your sequence: using BLAST comparison

When your DNA sequence is returned to you, it will be in the form of a text sequence. How do you determine that this sequence is GFP?

BLAST your DNA sequence

Materials: An Internet connection

Make a bookmark in your web browser to the NCBI web site: http://www.ncbi.nih.gov

Open NCBI and click on BLAST (currently on the right side, 5th link down). Under Basic BLAST, click the first option, nucleotide blast. (Notice that you could BLAST proteins, look at the entire genomic sequence of many species, and explore many other cool things on NCBI.)

Open the text document with your DNA sequence and either copy the entire sequence or copy as much of the sequence as you can without copying any Ns. What are the Ns?
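If your sequence file is long, picking out the longest stretch without Ns by eye can be tedious. The copying step above can be sketched in a few lines of Python. This is only an illustration: the function name longest_n_free_stretch and the example read are made up for this handout, and the sketch assumes your text file holds only sequence letters (no title line).

```python
import re


def longest_n_free_stretch(raw: str) -> str:
    """Return the longest run of plain A/C/G/T letters in a sequence read."""
    # Remove spaces and line breaks, and make everything upper case.
    seq = "".join(raw.split()).upper()
    # Each unbroken run of A, C, G, or T is a candidate; N (or any other
    # letter) ends the run.
    runs = re.findall(r"[ACGT]+", seq)
    # Pick the longest run; return an empty string if there is none.
    return max(runs, key=len, default="")


# Hypothetical read, invented for illustration (not a real GFP sequence).
example_read = "NNNNACGTNACGTTAGCCATGGANNTTGNN"
print(longest_n_free_stretch(example_read))  # prints ACGTTAGCCATGGA
```

You do not need this code to complete the activity; copying the sequence by hand works just as well.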
Paste your sequence in the box labeled "Enter accession number or FASTA sequence". Make sure the database is set to the nucleotide collection. If you knew the species you could include it, but leave this blank for today. Click BLAST. This could take a while, depending on how many researchers are using BLAST at that moment.

1. How long is the sequence that was used to search the database? This sequence is called the "query" sequence because you used it to ask a question (or query) of the database.
2. What is the most likely identity of this sequence? Why have you chosen this sequence? If more than one sequence matches, look at the E values to determine the most likely match.
   E value: The Expectation value or Expect value represents the number of different alignments with scores equivalent to or better than S (the score of your alignment) that are expected to occur in a database search by chance. The lower the E value, the more significant the score and the alignment.
   Next, look at the % identity as well as the % query coverage.
3. Click on the accession number of the sequence you have chosen as the most likely identity of your sequence. The accession number is the unique identifier given to a DNA or protein sequence record by NCBI to allow for tracking different versions of that sequence record.
4. What kind of sequence is this? Look at the Definition.
5. What is the organism?
6. How many bases long is this sequence?
7. Are there any expressed genes present in this sequence? How do you know? Gene expression includes the processes of transcription (making RNA) and translation (making a protein). Determine whether either of these molecules is described in the sequence record.
8. If your sequence has expressed genes, what are they (name of the gene) and what are their functions (what is the function of the protein)?
9. Between what nucleotide numbers is the GFP gene?
10. Find your query sequence in the NCBI sequence. Between what nucleotide numbers do you find your sequenced gene?
11. Do you think your sequenced DNA is GFP?
12. Do you think you have a genetic mutation that could cause the DNA to not make a functioning GFP protein?
13. Look at the other genes in this plasmid and see whether they provide any reason why you do not have GFP expression in your bacteria.
14. Provide alternative explanations/hypotheses as to why you do not see GFP.
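The coordinate questions above (where the GFP gene sits, where your query sits, and how closely the two sequences agree) can also be explored with a short Python sketch. It is a simplification and not how BLAST itself works: it finds only exact matches, while BLAST tolerates mismatches and gaps. All function names, the toy sequences, and the gene coordinates below are invented for illustration; substitute the values from your own BLAST result and sequence record.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def locate(query: str, reference: str):
    """Find an exact match of query in reference on either strand.

    Returns (start, end, strand) using 1-based, inclusive positions like
    those shown in a sequence record, or None if there is no exact match.
    """
    query, reference = query.upper(), reference.upper()
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        i = reference.find(q)
        if i != -1:
            return i + 1, i + len(q), strand
    return None


def within_gene(hit, gene_start: int, gene_end: int) -> bool:
    """True if the matched region lies entirely inside the gene's range."""
    start, end, _strand = hit
    return gene_start <= start and end <= gene_end


def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of matching positions in two equal-length, ungapped strings."""
    if len(aligned_query) != len(aligned_subject) or not aligned_query:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(q == s for q, s in zip(aligned_query, aligned_subject))
    return 100.0 * matches / len(aligned_query)


# Toy data, invented for illustration (not real GFP or plasmid sequence).
reference = "TTTTACGTTAGCCATGGACCCC"
query = "ACGTTAGCCATGGA"
hit = locate(query, reference)
print(hit)                      # (5, 18, '+')
print(within_gene(hit, 3, 20))  # True; 3 and 20 are made-up gene coordinates
print(percent_identity("ACGT", "ACGA"))  # 75.0
```

A query that matches only the opposite strand is reported with strand "-", which is worth remembering when comparing your read against the record, since the facility sequenced from just one primer.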