EXPERIMENT SIX: DNA SEQUENCING

The objectives of this experiment are to:
(1) carry out a sequencing reaction using a pCR2.1-amplicon clone.
(2) electrophorese sequencing reactions to obtain the nucleotide sequence of this clone.
(3) assemble and analyze the sequencing data.
(4) use NCBI BLAST to determine the identity of the sequence and uncover sequence
similarities with other genes in the databases.
(5) create in silico plasmid maps based on sequence data
Experimental expectations:
DNA Sequencing
Materials and Equipment Required
1. Sequencing reaction
PCR thermocycler (ABI 2600) and tube tray
2 thin-walled PCR tubes
ice bath
2. Reaction clean-up
Heat block at 90 & 95°C
ice bath
Gel electrophoresis DNA sequencing apparatus
Reagents Required
1. Sequencing reaction
4 µL Big Dye Terminator buffer
2 µL Primer (use M13-Forward -20 & -40 and M13-Reverse)
2. Reaction clean-up
40 µL Ethanol/sodium acetate mix
125 µL 75% Ethanol
2 µL Loading buffer
Procedure
PCR Reaction (Cycle Sequencing) – for each primer/template combination to be used
1. Prepare a 10 µL PCR reaction as follows in a thin-walled PCR tube on ice:
a. 4.0 µL Big Dye Terminator buffer
b. 2.0 µL Primer (0.8 pmol/µL concentration; 1.6 pmol final amount)
c. 4.0 µL plasmid to be sequenced or pGEM control (2 controls are prepared per gel)
2. Place in a thermocycler set to the following program:
a. 96°C for 60 seconds
b. 96°C for 10 seconds
c. 50°C for 5 seconds
d. 60°C for 4 minutes
Repeat b-d for 24 additional cycles (25 cycles in total), then hold at 4°C until removed from the
machine. Reactions may be stored at –20°C until ready to use.
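The arithmetic behind the program and the primer amount can be checked with a short sketch. It assumes the hold times listed in step 2 and ignores the time the thermocycler spends ramping between temperatures, so the real run will take somewhat longer; the function names are illustrative only.

```python
# Hold times from the cycle-sequencing program in step 2, in seconds.
INITIAL_DENATURATION_S = 60          # a. 96 C for 60 seconds
CYCLE_STEPS_S = (10, 5, 4 * 60)      # b. 96 C, c. 50 C, d. 60 C
TOTAL_CYCLES = 1 + 24                # first pass through b-d plus 24 repeats


def program_hold_time_s():
    """Total programmed hold time; ramping between temperatures is ignored."""
    return INITIAL_DENATURATION_S + TOTAL_CYCLES * sum(CYCLE_STEPS_S)


def primer_amount_pmol(volume_ul=2.0, conc_pmol_per_ul=0.8):
    """Amount of primer delivered to one reaction (volume x concentration)."""
    return volume_ul * conc_pmol_per_ul


print(f"Programmed hold time: {program_hold_time_s() / 60:.2f} min")
print(f"Primer per reaction: {primer_amount_pmol():.1f} pmol")
```

The programmed holds alone come to a little under two hours, which is worth keeping in mind when planning the clean-up on the same day.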
Reaction Clean-up/Precipitation (allow 1.5-2 hours)
1. Add 40 µL of ethanol/sodium acetate mix* to each tube containing PCR products.
2. Close tubes and vortex briefly.
3. Incubate at room temperature for at least 15 minutes (do not exceed 24 hours).
4. Microfuge at 14,000 rpm for 20 minutes (use double carriers to prevent tubes from falling
through!).
Hint: Place tube hinge outward and use carrier tubes to hold the small tube. Look for the
pellet on the side of the tube underneath the hinge.
5. Immediately remove the supernatant (if there is a delay, re-centrifuge for 2 minutes).
6. Add 125 µL of 75% ethanol.
7. Close tube and vortex briefly. One may stop here to store tube at 4°C overnight.
8. Microfuge at 14,000 rpm for 5 minutes with hinge outward.
9. Remove supernatant completely, taking care not to dislodge or discard the pellet (where would
you expect the pellet to be?).
10. Dry at room temperature for 45-60 minutes with lid open.
11. Proceed to Resuspension Procedure or store the dried product at –20°C until ready to use.
*Ethanol/sodium acetate mix: 1.5 µL 3M sodium acetate, pH 4.6; 31.25 µL 95% ethanol; 7.25 µL sterile
deionized water
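When several reactions are cleaned up at once, the mix is usually prepared as one batch. The sketch below scales the per-tube recipe given in the footnote; the 10% overage for pipetting loss is an assumption, not part of the protocol, and can be set to zero.

```python
# Per-tube volumes (µL) from the ethanol/sodium acetate footnote.
PER_TUBE_UL = {
    "3 M sodium acetate, pH 4.6": 1.5,
    "95% ethanol": 31.25,
    "sterile deionized water": 7.25,
}


def scale_mix(n_tubes, overage=0.10):
    """Volumes (µL) for n_tubes reactions.

    `overage` is an assumed allowance for pipetting loss (0.10 = 10% extra).
    """
    factor = n_tubes * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_TUBE_UL.items()}


# The three components add up to the 40 µL added to each tube in step 1.
print(sum(PER_TUBE_UL.values()))
for name, volume in scale_mix(12).items():
    print(f"{volume:8.2f} µL  {name}")
```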
Resuspension
1. Resuspend dried pellet in 2 µL sequencing gel loading buffer.
Loading buffer is prepared using a ratio of 5 parts deionized formamide to 1 part 25 mM
EDTA, pH 8, with blue dextran (50 mg/ml).
2. Pulse vortex to mix (or mix with micropipette)
3. Heat samples at 95°C for 2 minutes to denature.
4. Immediately place on ice until ready to load gel.
Gel Loading (allow 30 minutes)
Load 2 µL onto gel comb as instructed.
Primers for sequencing reaction
These are standard primers found in most cloning vectors used in molecular biology. The sequences are provided
below:
M13 Forward (-20)
GTAAAACGACGGCCAG
M13 Forward (-40)
GTTTTCCCAGTCACGA
M13 Reverse
CAGGAAACAGCTATGAC
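A quick way to get a feel for these primers is to tabulate their length, GC content and a rough melting temperature. The sketch below uses the Wallace rule (2°C per A/T, 4°C per G/C), which is only a rule-of-thumb estimate for short oligonucleotides and is not taken from this protocol.

```python
# Primer sequences as listed above (5' to 3').
PRIMERS = {
    "M13 Forward (-20)": "GTAAAACGACGGCCAG",
    "M13 Forward (-40)": "GTTTTCCCAGTCACGA",
    "M13 Reverse": "CAGGAAACAGCTATGAC",
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature (°C): 2 per A/T plus 4 per G/C."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, {gc_percent:.1f}% GC, Tm ~{wallace_tm(seq)} °C")
```

All three estimates fall close to the 50°C annealing step used in the cycle-sequencing program.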
Computer Lab
Sequence Analysis and Database Search
Project Goals:
In this lab, you will be introduced to some of the software and web resources available to
the molecular biologist. The software package we will be using today is called Lasergene and is
produced by a company called DNASTAR (they have provided us with a free site license!). This
package will allow us to examine the raw data generated from the DNA sequencer from each of
our sequencing reactions. By comparing the sequences and examining the electropherograms,
we hope to build an accurate consensus sequence that may be used to search for our gene in the
database. The major goals for today’s lab are to (1) build an accurate consensus sequence using
a technique called multiple sequence alignment; (2) use the sequence to search for our gene in a
huge database of sequences held by the National Institutes of Health (NIH) using BLAST (Basic
Local Alignment Search Tool). Your instructor will lead you through this project as an
introduction to the software and the web resources available (be sure to take notes!). In the
next lab, you will be introduced to another DNA analysis program called DNA Strider and will be
responsible for completing a task in class.
At the end of the day, we hope to use these resources to find:
* the complete sequence (sequence data will not cover the entire region) corresponding to the
genomic copy as well as the cDNA
* what type of protein our DNA insert encodes and its function in yeast
Sequence Analysis and Database Search
DNA Sequence Alignment:
1) Open Lasergene software from alias found in Apple menu (or open from Mac
HD/Applications).
2) Select “Sequencing Project Management” button to open Seqman program (note name of
program in top right corner of screen).
3) Open new project and select “Add sequences” box
a. Go to BIO375 folder in Mac HD and open folder corresponding to this quarter
(e.g., Spring 2003) and open the “sequence files” folder
b. Select all files (Apple-a) and click “Add files” button (all should be placed in
software window).
c. Click “Done” and return to Seqman window.
4) Click “Assemble” button to perform multiple sequence alignment (window on right
should now have status report of sequences included in “contig”).
Note: “contig” is short for contiguous sequence. Several aligned sequences that differ in length and position can
be joined into one longer sequence. By comparing the sequences that make up the contig, you are able to
obtain a consensus sequence.
[Diagram: raw sequences aligned 5’ to 3’ beneath the consensus sequence]
5) Double click on contig of interest (normally the longest contig containing more than one
sequence) to view sequences as shown in above diagram
a. The arrow at the end of each sequence shows the direction of original sequence
input (alignment will reverse orientation to fit into agreement with other
sequences)
b. Examine the associated electropherogram by opening the > arrowhead at left of
each sequence
i. Enlarge the view by clicking the magnifying glass symbol on control panel
ii. Increase height by sliding the controller to the bottom on the same panel
c. Numbers at sequence name show length and start of sequence used in contig
(some ends may be removed due to poor quality from sequencer).
6) Determine good vs. bad data:
a. Examine peak strength and resolution differences in electropherogram
b. Lower-case letters, letters other than ATGC and gaps indicate potential problems
that require further examination (sequence discrepancy among raw data files)
A = adenine     R = G or A    S = G or C         H = A or C or T
C = cytosine    Y = T or C    W = A or T         V = G or C or A
T = thymine     K = G or T    B = G or T or C    N = any
G = guanine     M = A or C    D = G or A or T    (-) = gap
c. As you accumulate more sequence data for a region, fewer errors are identified because of
the larger sample size
7) Alter consensus sequence based upon observation of data in contig so as to remove
regions of low confidence (if you cannot determine the base visually, leave the symbol of
ambiguity).
8) Save results:
a. Consensus sequence:
* Go to Contig menu and “Save Consensus” command (save to your zip disk).
Save file as “consensus” in FASTA format (.fas) and Lasergene format (.seq).
b. Contig:
* Go to File menu and save assembly to zip disk.
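To make the idea of a consensus concrete, the sketch below calls a consensus from reads that have already been aligned, using a simple per-column majority vote and the ambiguity symbols from the table in step 6 when there is a tie. This is an illustration only: it is not the algorithm Seqman uses, it ignores peak quality, and it assumes the reads are padded to equal length with "-".

```python
from collections import Counter

# Ambiguity symbol for each set of tied bases (same codes as the table in step 6).
CODE_FOR = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus(aligned_reads):
    """Majority-vote consensus of equal-length, gap-padded reads."""
    called = []
    for column in zip(*aligned_reads):
        # Count only unambiguous base calls; gaps and other symbols are skipped.
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        if not counts:
            called.append("-")          # no read covers this position
            continue
        top = max(counts.values())
        winners = frozenset(b for b, n in counts.items() if n == top)
        called.append(CODE_FOR[winners])
    return "".join(called)


print(consensus(["ACGTAC", "ACGAAC", "ACGTAC"]))  # three reads outvote one miscall
print(consensus(["ACGT", "ACGA"]))                # a two-way tie gives an ambiguity symbol
```

With three reads the single discrepancy is outvoted, while two disagreeing reads leave an ambiguity symbol, which is why additional coverage reduces the number of positions that need manual inspection.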
BLAST search:
1) Open Netscape or Internet Explorer and go to http://www.ncbi.nlm.nih.gov/BLAST/
2) Since our sequence is DNA (not protein), perform the BLAST search at the nucleotide
level by following the “Standard nucleotide-nucleotide BLAST [blastn]” link.
3) Go to your saved FASTA format of the consensus sequence and open the file (you may
be required to select the application for this file; select Simple Text or a similar program).
4) Highlight and copy the DNA sequence only and paste this sequence into the “search”
window in your Netscape/Explorer window for the BLAST search.
5) Click “BLAST!” button to initiate the search.
6) You will be sent to a BLAST search queue - click “Format!” to see the results (there may
be a delay before the results are presented).
7) Once the search is done, a new window will appear presenting you with all of the “hits”
that your query brought up from the non-redundant nucleotide database.
8) View the score (should be over 200) and E-value (the number of matches this good expected
by chance in a database of this size - should be VERY small) for each match to determine if you
can identify the gene(s) that most closely match your sequence.
9) Examine the alignment for your top matches and follow one or more of the associated
links to obtain more information about the subject sequence (“sbjct”).
10) Note in your lab notebook the identity of this sequence.
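Step 4 asks for the DNA sequence only, without the description line that begins a FASTA file. The sketch below shows what that means for a single-record file; the file name in the usage note is hypothetical.

```python
def fasta_sequence(text):
    """Return the sequence of a single-record FASTA file as one string.

    The description line (starting with '>') and all line breaks are dropped.
    """
    lines = (line.strip() for line in text.splitlines())
    return "".join(line for line in lines if line and not line.startswith(">"))


example = ">consensus\nACGTTGCA\nGGATCC\n"
print(fasta_sequence(example))
```

For a saved file, the same function could be applied to its contents, for example `fasta_sequence(open("consensus.fas").read())`, where `consensus.fas` stands in for wherever you saved your consensus.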
Complete gene recovery:
1) Open a new window in Netscape and go to http://genome-www.stanford.edu/Saccharomyces/
2) Search the SGD by typing in the name of the gene you have identified by BLAST (three-letter
name and number).
3) Once you have found the page associated with your gene, examine this page on your own
and write in your notebook the following information:
a. are there introns in this gene and where are they located within the ORF?
b. what are the chromosomal coordinates of the gene?
c. is the gene found on the Watson or Crick strand of the chromosome?
d. what are the ORFs on either side of the gene (Watson or Crick strand)?
e. what is the molecular function of this gene product?
f. what is one of the cellular components listed for this gene product?
4) Go to the “Retrieve Sequences” section of this window (upper right side) and open the
window containing the “DNA+1kb up/downstream”.
5) Highlight and copy this sequence.
6) Start the program DNA Strider (MacHD/Applications OS9/DNA Strider) and open a new
window for DNA (File/New DNA or Apple-n).
7) Paste the sequence and save this file to your zip disk as “gene name+1kb”.
8) Return to your Netscape window and repeat this procedure to make a new file with the
“coding sequence”.
9) Save this file as “gene name-coding”.
Manipulation of insert sequence:
1) Open “consensus” file (FASTA format) using Simple Text and copy the sequence as
before.
2) Open new window in DNA Strider for “Degenerate DNA” using the File menu and paste
the sequence into this window using the Edit menu (or Apple-v).
3) Go to the “<->” menu and select the “5’ to 3’ DNA” option to convert the degenerate
DNA into a non-degenerate format - save this file to your disk under the name of
“consensus”.
4) Go to the “gene name-coding” file and copy the first 10 nucleotides, starting with ATG,
using the Edit menu (or Apple-c) (why not more than 10?).
5) Return to the “consensus” file, open Find menu/ “Find” command (or Apple-f) and paste
into this window the nucleotide sequence.
6) Be sure your cursor is at the start of the sequence, then click the “Find” button to search
for the start of your gene’s ORF.
Note: if you are unsuccessful, you may need to search the antiparallel orientation of the
“consensus” DNA (<-> menu / Antiparallel). If the sequence is not found, your
consensus sequence may not be complete. If this is the case, proceed to step 9.
7) Once this sequence is found, try to determine the junction between vector and insert for
both ends of your DNA comparing the “consensus” file with that of the vector sequence
(flanking the T/A cloning site) found in your lab manual.
8) Once this is found, open the “gene name+1kb” file and trim it down to resemble the size
of your insert information obtained from your consensus file.
9) Alternatively, use the primer sequence from the forward and reverse primers that were
used to amplify your insert to define your insert size in the “gene name+1kb”.
10) The process of finding your complete insert sequence will be the active-learning portion
of this experiment. As you are trying to complete this task, keep in mind that there may
be regions that are outside of the coding region in your insert.
11) Once you have determined the complete sequence corresponding to the insert in your
plasmid using the “gene name+1kb”, note the size of the fragment and save this file as
“insert-gDNA”. If your amplicon was from a cDNA template, you will have to use this
file to transform the “gene name-coding” file to generate the “insert-cDNA” (no intron).
12) Once this file is saved, you may leave the Computer Lab (you may need to return to the
computer lab if you were unable to complete the task during class).
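The two searches in this section (finding the start of the ORF in the consensus in either orientation, and defining the insert by its amplification primers) can be sketched as plain string searches. This is a simplified illustration, not what DNA Strider does internally: it requires exact matches, does not interpret ambiguity symbols, and the sequences in the example are made up.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Antiparallel strand; symbols other than A, C, G, T are left unchanged."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orf_start(consensus, coding, probe_len=10):
    """Look for the first probe_len bases of the ORF in both orientations.

    Returns (orientation, 1-based position) or None if the probe is absent.
    """
    probe = coding[:probe_len].upper()
    target = consensus.upper()
    pos = target.find(probe)
    if pos != -1:
        return "as saved", pos + 1
    pos = reverse_complement(target).find(probe)
    if pos != -1:
        return "antiparallel", pos + 1
    return None


def insert_from_primers(template, forward_primer, reverse_primer):
    """1-based (start, end, size) of the amplicon bounded by the two primers."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    # The reverse primer anneals to the top strand as its reverse complement.
    reverse_site = reverse_complement(reverse_primer.upper())
    stop = template.find(reverse_site, start) if start != -1 else -1
    if start == -1 or stop == -1:
        return None
    end = stop + len(reverse_site)
    return start + 1, end, end - start


# Made-up sequences for illustration only.
coding = "ATGGATTCTGAGG"
consensus = "ccgg" + coding + "ttaa"
print(find_orf_start(consensus, coding))
print(find_orf_start(reverse_complement(consensus), coding))
print(insert_from_primers("TTTTACGTACGGGGGGGGATGATGAAAA", "ACGTAC", "CATCAT"))
```

If `find_orf_start` returns None for both orientations, the consensus probably does not cover the start of the ORF, which is the situation the note under step 6 describes.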
Computer Lab
Computer Plasmid Map Project
Project Goals:
The purpose of this project is to expose you to computer software tools that are commonly
used for DNA analysis in molecular biology research labs. During this project, you will use a
program called DNA Strider to accomplish the following:
1) Create two versions of a recombinant plasmid by ligating the insert sequence into the
MCS of the plasmid pCR2.1-TOPO, in silico.
2) Use the sequence above to design a restriction mapping experiment to test the orientation
of the insert in the actual plasmid you constructed in the lab.
Project Outline
(this is to be done alone if there are enough computer stations)
Ligation in silico:
Preparation of the empty vector sequence:
1) Start the DNA Strider program.
2) Open the pCR2.1 sequence that is found in the BIO375 folder (same folder from which
you obtained the raw sequence data).
3) Find the site used in our T/A cloning procedure by searching for the upstream vector
sequence using the information provided in the pCR2.1 map. In the “Find” window
(apple-f or go to find menu), type in the sequence immediately upstream of the T/A site
(see lab manual) and click OK.
4) After you find this site, highlight the “TA” sequence that was used as the cloning site for
TOPO TA cloning. Capitalize this region to set it apart from the lower-case sequence of
the vector by going to the Edit menu and selecting “UPPER CASE” command.
5) Go to <-> menu (also known as Convert menu in older versions of the software) and
select “Circularize” command to make the DNA a circular entity rather than a linear
piece of DNA (you should see “circular” appear and replace “linear” in the top right
corner of your Strider window).
6) Go to file menu and save this file using “save as” command. Save the file as “empty
pCR2.1” on your Mac zip disk.
Preparation of the 2 possible orientations of the insert sequence:
1) Open sequence file named “insert-cDNA” or “insert-gDNA”, depending upon your
plasmid.
2) Write in the note section of this file (window under the sequence) the orientation of the
ORF (ATG->TAA or TAA->ATG) and save it to your zip disk under the name of
“ampliconOri1”.
3) Change the orientation of your insert by selecting the entire sequence (using Edit menu or
Apple-a) and going to the <-> menu to select the “Anti-Parallel” command. This will flip
the orientation of the DNA (resulting in the presentation of the reverse complementary
strand).
4) Alter the note section accordingly and save this file as “ampliconOri2” (using “Save
As…” command in File menu).
Construction of the 2 possible recombinant plasmids from T/A cloning in pCR2.1:
1) Find the T/A cloning site in the file “empty pCR2.1” that was offset from the rest of the
sequence by capitalization (see above) and place your cursor in the cloning site.
2) Go to the “ampliconOri1” file and select all (apple-a) /copy (apple-c) the sequence of
your amplicon.
3) Return to the pCR2.1 window and paste this sequence into the T/A site of the “empty
pCR2.1” file.
4) Make note of the positions of the insert “beginning” and “end” sites and the overall
plasmid size. Write this information in your lab notebook and tell the instructor so as to
be sure of the proper in silico ligation.
5) Save this new recombinant construct as “pCR2.1+amplicon1” (using “Save As…”
command in File menu).
6) Retrieve the original “empty pCR2.1” file and copy and paste the sequence from the
“ampliconOri2” file into the T/A site as described in steps 1-4 (if you were successful,
the position and length should not change from step 4). Save this file as
“pCR2.1+amplicon2”.
7) Now you have two recombinant pCR2.1 plasmids that differ only in the orientation of the
inserted amplicon sequence.
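The in silico ligation amounts to splicing the insert into the vector string at the cloning site, once in each orientation. The sketch below uses short made-up sequences, not the real pCR2.1 or amplicon sequences, and a hypothetical site position; it shows why the insert coordinates and plasmid size are the same for both constructs.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Equivalent of the Anti-Parallel command for an unambiguous sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def ligate(vector, insert, site_index):
    """Splice `insert` into `vector` so that it begins at 0-based `site_index`."""
    return vector[:site_index] + insert + vector[site_index:]


def insert_coordinates(site_index, insert_length):
    """1-based first and last positions of the insert in the recombinant."""
    return site_index + 1, site_index + insert_length


# Toy stand-ins (hypothetical): lower-case vector, upper-case insert.
vector = "aaaaccccggggtttt"
amplicon_ori1 = "ATGGCCTAA"                        # ORF reads ATG->TAA
amplicon_ori2 = reverse_complement(amplicon_ori1)  # ORF reads TAA->ATG
site = 8                                           # assumed cloning-site position

plasmid1 = ligate(vector, amplicon_ori1, site)
plasmid2 = ligate(vector, amplicon_ori2, site)
print(plasmid1, len(plasmid1), insert_coordinates(site, len(amplicon_ori1)))
print(plasmid2, len(plasmid2), insert_coordinates(site, len(amplicon_ori2)))
```

A string has two ends, so the circular nature of the plasmid is not represented here; it only matters once restriction fragments are calculated.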
Restriction Mapping:
Restriction analysis to determine the orientation of pCR2.1+amplicon
1) Open the “pCR2.1+amplicon1” file and go to Enzyme menu and select the "Restriction
Report” command to display the sequence and all of the potential restriction sites found
in this DNA sequence.
2) Scan through report menu which is broken into 3 parts: (1) sequence displaying all of the
restriction recognition sequences included in this program; (2) site usage list of all
enzymes and the frequency of each site within the sequence (note that “-” means there
were no sites found); (3) list of restriction enzymes (and recognition sequence) that cut
within the sequence ordered according to frequency.
3) Scroll through the report until you find the third section and examine the enzymes that
cut twice in the sequence.
4) Go to your lab notebook and make note of the start and end position of the insert
amplicon.
5) Scan through the list of two-cutter enzymes and make a list in your lab notebook of all
those enzymes where one of the sites falls within the insert amplicon and the other within
the vector sequence (why would this be necessary?).
6) Close the restriction report window and return to the Strider sequence window for
“pCR2.1+amplicon1”.
7) Go to Enz menu and scroll down to “Enzyme Chooser…” command to select one of the
enzymes on your list of potential twice cutters. Highlight one of your enzymes and click
“OK” button (more than one enzyme may be chosen by holding the shift key when you
select additional enzymes).
8) Return to Enz menu and select “Digestion..” command to cut the DNA with your selected
enzyme.
Note: selection of enzyme may also be done by holding the option key down while
you select the “Digestion..” command. After clicking the “OK” button, digestion
will take place using the highlighted enzyme.
9) Repeat this digestion with each of your selected twice-cutter enzymes and compare the
products from each reaction with those that would be generated using the
“pCR2.1+amplicon2” sequence.
10) Note in your lab notebook which of the possible enzymes would provide you with the
best data for determining the orientation of your insert (hint: you will want to use the
enzyme that gives you fragments that would be easily distinguishable between the two
orientations).
11) For your chosen enzyme (after you have shown your work to the instructor), repeat the
digest for each pCR2.1+amplicon orientation using this enzyme as described above.
12) Go to the Enz menu and select the “Graphic Map” command to view your plasmid with
the restriction sites (and position) indicated.
13) Print 2 copies of this map for each orientation (one copy for your lab notebook and one
copy for the instructor).
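The logic of the orientation test can be sketched as a digest of a circular sequence. The example uses RsaI (recognition sequence GTAC, cut between T and A) on made-up toy sequences, not the real pCR2.1 or amplicon; it handles only non-degenerate recognition sequences and is not a substitute for the Strider restriction report.

```python
import re


def cut_positions(seq, site, cut_offset):
    """Sorted 0-based cut positions of `site` on a circular sequence."""
    n = len(seq)
    s = seq.upper()
    extended = s + s[: len(site) - 1]      # lets a site span the origin
    starts = (m.start() for m in re.finditer(f"(?={site})", extended))
    return sorted({(start + cut_offset) % n for start in starts})


def circular_fragments(seq_len, cuts):
    """Fragment sizes, largest first; an uncut circle gives an empty list."""
    if not cuts:
        return []
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(seq_len - cuts[-1] + cuts[0])   # fragment spanning the origin
    return sorted(sizes, reverse=True)


def digest(plasmid, site="GTAC", cut_offset=2):
    """Fragment sizes from a digest; defaults correspond to RsaI (GT^AC)."""
    return circular_fragments(len(plasmid), cut_positions(plasmid, site, cut_offset))


# Toy sequences (hypothetical): one RsaI site in the vector and one placed
# off-centre in the insert.
vector = "GTAC" + "A" * 96
insert_ori1 = "C" * 10 + "GTAC" + "C" * 26
insert_ori2 = "G" * 26 + "GTAC" + "G" * 10     # reverse complement of ori1
site_index = 50

plasmid1 = vector[:site_index] + insert_ori1 + vector[site_index:]
plasmid2 = vector[:site_index] + insert_ori2 + vector[site_index:]
print(digest(vector))      # empty vector: a single linear fragment
print(digest(plasmid1))
print(digest(plasmid2))
```

The fragment sizes differ between the two orientations only because the site inside the insert is off-centre and the second site lies in the vector.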
Use the sequences generated in this lab to answer the following questions. This
assignment must be turned in to the instructor by the end of the lab period.
1. On the graphic printout, circle the enzyme sites that will be cut, write the expected size
of each product generated upon digestion and turn in a copy with your answers to these
questions. (4 pts)
2. What is the exact size (bp) of your new plasmids? (2 pts)
3. What is the exact size (bp) of the insert? (2 pts)
4. List the bp positions of the inserted sequence (only the actual insert sequence – not T/A
site) in your new plasmids (e.g., gene X is located in position 134bp to 1134bp in
construct Y). (2 pts)
5. If you cut both of your plasmids with RsaI in the lab, how many and what size fragments
will you expect to generate? (3 pts)
6. What size fragment(s) would you expect from an RsaI digest if your ligation were
unsuccessful (i.e., there is no ACT1 insert)? (3 pts)
7. What enzyme are you going to use to differentiate between the two possible orientations?
Explain the logic behind using this restriction enzyme (as opposed to other enzymes, like
EcoRI, etc) to determine the orientation and the expected results. (4 pts)