DNA Barcoding of Molecular Life Science Center Plant Species
We often walk past the plants that surround MLSC, but hardly notice them. Over the next few
lab periods DNA will be isolated from these plants, amplified by PCR, and sent to a company for
sequencing. The DNA sequences will then be used to place these plants into a phylogenetic
tree and identify which Family each plant belongs to. Recall from BIOL 211 that living things are
classified by the following levels: Kingdom, Phylum, Class, Order, Family, Genus, and Species.
Related organisms share similar traits, including DNA sequence, so DNA sequences can be
used to classify living things. Organisms with many similarities will be closely related while
those with more differences will be more distantly related. DNA sequences used for
classification are termed DNA Barcodes, since they are analogous to the UPC barcodes that are
scanned in the grocery store to identify a product.
Two specific regions of DNA are quite useful for classification: cytochrome c oxidase subunit I
(COI) and RuBisCo large subunit (rbcL). The COI gene resides on the mitochondrial genome
and is essential for electron transport during respiration. The rbcL gene is located on the
chloroplast genome and is essential for the uptake of CO2 in the first step of the Calvin cycle of
photosynthesis. Both mitochondrial and chloroplast genomes are highly abundant since there
are multiple genomes per organelle and multiple organelles per cell. This allows a researcher to
isolate DNA from small amounts of tissue and have enough template DNA for PCR. Also, both
COI and rbcL are highly conserved since they perform essential functions. Researchers can
thus use one pair of PCR primers for a wide range of species. Primers are designed from
regions with little to no variation, but there are more DNA substitutions in the regions between
the primers. This more variable region is amplified by PCR as shown in Figure 1, below. Fungi
and animal classification is done using the COI DNA barcode; however, rbcL is the DNA barcode
for plants since COI variation in plants is too low to distinguish different species.
[Figure 1 diagram: the rbcL for and rbcL rev primers flank a region with variable nucleotides
in different species. PCR: Denature, Anneal, Extend 30X in the presence of Taq polymerase &
deoxyNTPs.]
Figure 1. PCR amplification of the rbcL DNA barcode gene using chloroplast DNA as the template.
The two primers used to prime DNA synthesis are named rbcL for and rbcL rev. PCR (Polymerase
Chain Reaction) is used to amplify the region between the two primers. Primers are designed
from highly conserved regions and anneal to the chloroplast genome of most plant species. The
region between the primers is more variable, so PCR products amplified from different species
will have sequence differences in the region between the two primers. dNTPs indicates a
mixture of deoxyATP, deoxyCTP, deoxyGTP and deoxyTTP.
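As a rough illustration of how PCR picks out the region between two primers, the sketch below
performs a simple "in silico" PCR on a short made-up template. The primer and template
sequences are hypothetical toy values chosen for the example; they are not the actual rbcL
primers used in this lab. The function ideal_copies assumes perfect doubling in every cycle,
which real reactions only approximate.

```python
# Toy in-silico PCR. All sequences below are hypothetical, not the real rbcL primers.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def in_silico_pcr(template: str, forward_primer: str, reverse_primer: str):
    """Return the product spanning both primer sites, or None if a primer has no match.

    The template is the top strand, 5' to 3'. The forward primer matches the top
    strand directly; the reverse primer anneals to the top strand, so its reverse
    complement is what appears in the top-strand sequence.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site_start = template.find(reverse_site, start + len(forward_primer))
    if site_start == -1:
        return None
    return template[start:site_start + len(reverse_site)]


def ideal_copies(cycles: int, starting_copies: int = 1) -> int:
    """Copies after the given number of cycles, assuming perfect doubling each cycle."""
    return starting_copies * 2 ** cycles


FORWARD = "ATGTCACCAC"   # hypothetical forward primer
REVERSE = "CTGCGAATTC"   # hypothetical reverse primer
# flank + forward site + variable region + reverse site + flank
TEMPLATE = "TTGACC" + "ATGTCACCAC" + "AAGCTGGTGT" + "GAATTCGCAG" + "CCTA"

amplicon = in_silico_pcr(TEMPLATE, FORWARD, REVERSE)
print("Amplicon:", amplicon, "length:", len(amplicon))
print("Ideal copies per template after 30 cycles:", ideal_copies(30))
```

Only the sequence between (and including) the two primer sites is returned; the flanking
template on either side is not part of the product.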
The PCR products are then sequenced by GeneWiz, a company in San Diego, which makes the
DNA sequences available on their website. The sequences are then compared to a set of
reference rbcL sequences to determine closest similarity. The computational tools for
performing these comparisons are found on the Blue Line of the DNA Subway, an online
educational tool developed by the DNA Learning Center at Cold Spring Harbor, NY. The
general logic for DNA sequence comparison is shown in Figure 2.
[Figure 2 diagram. Sequences shown: ACGTGCTAGA, ACGTGCCAGA, ATGTGCCAGA, ACGTACCAGA,
GCGTACCAAA, ACGTACCAAA, ACGTCCCAAA.]
Figure 2. Evolutionary changes in DNA sequence. The ancestral DNA sequence is shown on the
left. Two independent nucleotide changes, marked by circles, occur that differentiate two lineages.
Recall that a change that defines a new lineage is termed a synapomorphy. Over time, new
nucleotide changes, marked by triangles, occur to further differentiate these lineages.
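The comparison logic can be sketched in a few lines of Python: count the positions at which
two aligned sequences differ, and treat a smaller count as a closer relationship. The sketch
below uses the seven 10-nucleotide sequences that appear in Figure 2. It is a minimal
illustration only; it assumes the sequences are already aligned and equal in length, and it
is not the method used by the DNA Subway tools.

```python
from itertools import combinations

# The seven sequences that appear in Figure 2.
SEQUENCES = [
    "ACGTGCTAGA",
    "ACGTGCCAGA",
    "ATGTGCCAGA",
    "ACGTACCAGA",
    "GCGTACCAAA",
    "ACGTACCAAA",
    "ACGTCCCAAA",
]


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Print every pair with its number of differences, most similar pairs first.
pairs = sorted(
    combinations(SEQUENCES, 2),
    key=lambda pair: count_differences(*pair),
)
for seq_a, seq_b in pairs:
    print(seq_a, seq_b, count_differences(seq_a, seq_b))
```

Pairs that differ at a single position are candidates for closest relatives, while pairs
with more differences are more distantly related.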
The International Barcode of Life (iBOL) campaign organizes barcoding efforts for projects that
focus on certain types of organisms (e.g. ants, sharks, mosquitoes) or certain types of
environments (e.g. marine, coral reef, polar). The iBOL website has lots of interesting
information (ibol.org); check out the About Us tab.
This lab will be divided into three sections.
1. Obtain a small leaf sample from one of the plants commonly found on campus and
isolate total DNA (nuclear, mitochondrial and chloroplast).
2. Amplify DNA by PCR using the rbcL primers, then run a gel to verify the presence of the PCR
product (580 bp) before sending it for sequencing. The gel can be viewed on BeachBoard.
3. Analyze sequence data using the DNA Subway Blue Line; discover which reference
barcode is most closely related to your plant species. Identify the family that your
species belongs to!
Concepts Reinforced in this Lab:
DNA Barcodes are small regions of organelle DNA that can identify an organism.
The gene encoding RuBisCo large subunit (rbcL) is the DNA barcode used for plants.
During the process of evolution, more DNA substitutions occur for more distantly related
species, while closely related species have a smaller number of DNA substitutions.
The polymerase chain reaction (PCR) amplifies DNA that is located between two primers. In
this case, chloroplast DNA is the template that is amplified by the rbcL forward and rbcL reverse
primers.
The PCR products are sequenced by dideoxy sequencing, which utilizes dideoxy nucleotides that
lack a 3’-OH group and serve as chain terminators.
DNA sequences can then be compared to reference DNA barcodes to place a DNA sequence
into a phylogenetic tree.
Experimental Procedures
Day 1: Genomic DNA Isolation
1. Eight different plant species surrounding the MLSC building will be used in this exercise.
A numbered sign has been posted next to each plant type. When instructed to do so by
your instructor, you will view the plants selected for this experiment and CHOOSE ONE
for tissue collection. Be sure to take one P1000 pipette tip, one 1.5ml microfuge tube
and a marker with you before you leave the lab.
2. Using the wide end of a P1000 pipette tip, punch a disc of plant tissue into a 1.5ml
microfuge tube. This can be most effectively done by “sandwiching” leaf tissue between
the tip end and the tube opening and then pushing the tip into the tube. Close the cap
on the microfuge tube and return to lab. The tip can be discarded in any trash
receptacle. Be sure to label your sample of plant tissue with the number on the sign and
your group name.
3. After returning to the lab, add 100μl of Nuclei Lysis Buffer to your tube.
4. Thoroughly grind the tissue in the tube using a clean blue plastic pestle for 1 minute.
5. Add an additional 500μl of Nuclei Lysis Buffer to the tube and then incubate the tube at
65°C for 15 minutes by placing it in any of the hot baths labeled “65°C” located
around the room. Be sure to wear the insulated gloves when adding and removing your
samples from the bath.
6. Add 3μl of RNase Solution to each tube, close the cap, invert several times to mix the
contents and incubate the tubes at 37°C for 15 minutes by placing them in any of the
baths labeled “37°C” located around the room.
7. Add 200μl of Protein Precipitation Solution to the tube, close the cap and then
vigorously shake the tube for 5 seconds to mix the contents.
8. Incubate the tube on ice for 5 minutes.
9. Place your tube across from another lab group’s tube to balance the centrifuge. Make
sure the tubes have their cap hinges pointing outwards in a microcentrifuge. Once all
groups’ tubes have been added, centrifuge the tubes at maximum speed for 4 minutes.
10. While the tubes are spinning, label 1 new microfuge tube. When the centrifugation is
complete, gently transfer 600μl of supernatant from your tube into the new tube. Do not
mix up the samples and be sure to use a new pipette tip. Also, do not disturb the pellet.
After the transfer, the tube with the pellet may be discarded into the room trash bins or
beakers labeled “used pipette tips”.
11. Add 600μl of isopropanol to the collected supernatant, close the caps, and invert each
tube several times to mix the contents. The isopropanol will cause the DNA to come out
of solution.
12. Place your tube across from another lab group’s tube with their cap hinges pointing
outwards in a microcentrifuge. Once all groups’ tubes have been added, centrifuge the
tubes at maximum speed for 1 minute. The DNA is now in the small pellet that is formed
at the bottom of the tube.
13. Carefully pour off (decant) the supernatant into one of the beakers labeled “Used
Isopropanol and Ethanol”. These beakers can be found near the sinks. Add 600μl of
70% ethanol to each tube, close the cap and gently invert the tubes three times to wash
the salt from the DNA pellets.
14. Place your tubes across from others with their cap hinges pointing outwards in a
microcentrifuge. Once all groups’ tubes have been added, centrifuge the tubes at
maximum speed for 1 minute.
15. Carefully decant the supernatant (into one of the beakers labeled “Used Isopropanol and
Ethanol”) and then use a fresh tip on a P100 to remove the residual 70% ethanol, being
very careful not to disturb the pellet, which may or may not be visible. The pellet will be
located on the side of the tube with the cap hinge so don’t let the pipette tip touch this
part of the tube.
16. Allow the pellet to air dry for 15 minutes. During this time leave the cap open and rest
the tube on its side.
17. Add 100μl of DNA Rehydration Solution to the DNA pellet, close the cap and allow the
tube to incubate overnight at 4°C. The next day your TA will move your samples to a
minus 20°C freezer for storage.
Day 2: Polymerase Chain Reaction and Gel Electrophoresis
PCR Setup
1. Allow your genomic sample to thaw and then mix it by gently tapping the tube with your
finger.
2. Spin your tube briefly in a microcentrifuge to pull the contents down to the bottom being
sure to use another group’s tube as a balance.
3. Obtain one PCR tube containing Ready-To-Go PCR “beads”. Label the tube with your
group and sample numbers.
4. Add 23μl of the Primer/Loading Dye Mix and allow the beads to dissolve for 1 minute.
5. Using a fresh tip, add 2μl of genomic sample to the PCR tube.
6. Place the tube in the thermal cycler for amplification. Following amplification, spin your
tube down briefly in the microcentrifuge.
Agarose Gel and Electrophoresis Buffer Preparation (one agarose gel per two pairs of students)
1. Make 350 mL of 1X TBE (Tris Base/Boric Acid/EDTA) from 10X TBE stock. Transfer 40
mL of the 1X TBE to a flask and keep the remaining 310 mL in the large graduated
cylinder.
2. To the flask containing the 40 mL of 1X TBE add the appropriate weight in grams of
agarose to make 2% agarose (in this 40 mL of 1X TBE solution). Check your
calculations with your lab instructor before continuing.
3. Microwave for 1 minute. CAUTION: Flask will be hot. Use proper gloves for removal.
Carefully swirl and view the molten agarose to ensure that it has fully melted.
4. Allow to cool to 65 ºC and then bring flask to lab instructor, who will dispense 2.0 μL of
ethidium bromide into your flask. SUSPECTED CARCINOGEN: WEAR GLOVES!
5. Pour your gel and add the combs to make the wells. Allow to solidify (about 20 minutes).
Check with your instructor.
6. Turn the gel (after comb removal) to the proper orientation in the electrophoresis unit
and add the remaining electrophoresis buffer to the apparatus.
7. Load your gel as follows:
Lane 1: 4μl of group 1 PCR-amplified plant sample
Lane 2: 4μl molecular weight standard
Lane 3: 4μl of group 2 PCR-amplified plant sample
8. Run gel at 120 V for 20 minutes.
9. Your laboratory instructor will assist you in interpreting your gel results. For each
sample you should have one band approximately 580 bp. For comparison, the
molecular weight standard has fragments of 100, 250, 400, 800 and 1,500 bp in length.
Your instructor will post an image of your gel on BeachBoard.
10. Provide the tube with your PCR product to your lab instructor for sequencing. Make sure
your tube is CLEARLY labelled with your lab group’s initials and plant number. If your
PCR was successful, two sequencing reactions will be set up, one with the rbcL for
primer and one with the rbcL rev primer. 10 μl of PCR product will be used in each
dideoxy sequencing reaction.
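The arithmetic behind the buffer and gel preparation steps above can be sketched in Python.
The sketch covers three things: diluting a concentrated stock (C1 x V1 = C2 x V2), weighing
agarose for a percent weight/volume gel (grams per 100 mL), and finding which bands of the
molecular weight standard should sit on either side of the expected PCR product. The function
names are made up for this illustration. Use it to check your own calculations, not to
replace them.

```python
def stock_volume_ml(final_volume_ml: float, final_conc_x: float, stock_conc_x: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 solved for V1."""
    return final_volume_ml * final_conc_x / stock_conc_x


def agarose_grams(percent_w_v: float, volume_ml: float) -> float:
    """Grams of agarose for a percent (w/v) gel; 1% w/v means 1 g per 100 mL."""
    return percent_w_v / 100 * volume_ml


def bracketing_bands(ladder_bp, size_bp):
    """Return the ladder bands just below and just above a fragment size."""
    smaller = max((band for band in ladder_bp if band <= size_bp), default=None)
    larger = min((band for band in ladder_bp if band >= size_bp), default=None)
    return smaller, larger


# Values taken from the procedure above.
tbe_stock_ml = stock_volume_ml(final_volume_ml=350, final_conc_x=1, stock_conc_x=10)
water_ml = 350 - tbe_stock_ml
print(f"10X TBE: {tbe_stock_ml} mL, water: {water_ml} mL")

print(f"Agarose for a 2% gel in 40 mL: {agarose_grams(2, 40)} g")

ladder = [100, 250, 400, 800, 1500]  # molecular weight standard, in bp
print("Bands on either side of the 580 bp product:", bracketing_bands(ladder, 580))
```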
Day 3: Interpreting Sequencing Results and Phylogenetic Tree Building
1. Set up an account with DNA Subway (dnasubway.iplantcollaborative.org). This should
be done the week before the data are analyzed since the setup takes 24 hours. Each lab group
should set up one account with a shared username and password.
2. Log in to DNA Subway, and click on the blue box towards the left of the screen labeled
Determine Sequence Relationships.
3. Under Barcoding, select rbcL, then select Import trace files from DNALC. The
sequences obtained by GeneWiz are automatically uploaded to the DNALC.
4. Search for the name provided by your TA, and click on the tracking number to get a list
of sequences.
5. Select your sequences. There should be an F and R for your DNA sample. Click on
Add Selected Files, and let the green bar go across the screen until files are uploaded.
6. Give a Project title that includes your group name and plant numbers. Adding a
description is optional. Click on Continue.
7. Now the blue line is apparent. There are three hubs: Assemble Sequences, Add
Sequences and Analyze Sequences. We will take the DNA subway down the Assemble
Sequences branch line first.
8. Click on Sequence Viewer to see your two DNA sequences. By moving the gray bar
below the sequences, you can see the entire sequence. At the beginning and end there
will be many N’s. N’s are read by the ABI machines when there are multiple strong
peaks in one location and a base cannot be called. There should not be many N’s in the
middle of your sequence.
9. You can view the electropherogram by clicking the View Trace icon next to the sequence
name. The four differently colored traces show the signal for the four bases: green = A, red
= T, blue = C and black = G. A black peak, by itself, is called a G by the sequencing
machine software.
10. Just below the called letters is the Quality score shown by vertical blue bars. This is
measured by the Phred Score. A Phred Score of 20 signifies that the probability that the
base was called incorrectly is 1 in 100. Phred scores above 20 are considered reliable,
and the horizontal blue line is 20 Phred units. Click on the View Trace icon for each
sequence; be sure to note if you get a “Low Quality Score Alert”. This sequence may
need to be removed from the analysis.
11. Close the Sequence Viewer box and click on Sequence Trimmer. This will trim the N’s
off of the ends of each sequence. When done, click again to see trimmed sequences.
Close Sequence Trimmer box.
12. Click on Pair Builder and the Pair Builder box opens up. For each DNA sample, there
should be an F and R. Click on the box to the right of the sequence for the F and R of
your DNA sequence. Once you check the two boxes, a window will pop up that asks
you to Pair them? Click yes.
13. For the R sequence, click on the blue F to the right of the sequence, and it will turn into a
red R. This means that the reverse complement of this sequence will be used in the
sequence alignment. This is done so each sequence is on the same strand.
14. If you had a low quality score alert on a certain sequence, do not include it in the pair
building. Just use the high quality sequence in your analysis below.
15. Click on SAVE to save your pair. Then click on Consensus Builder to view your
consensus DNA sequence based on your DNA sequences that were obtained.
16. The Consensus Editor will open when Consensus Builder is clicked. If your DNA
sequences are high quality, there will be few or no yellow mismatches. If there are many
mismatches, it will be necessary to remove a low quality
sequence from the analysis. In the Editor, you can change the name of your sequence
and give it the common name provided by your TA for the species you analyzed. Click
Edit Name, change name, and then Save.
17. Now you are ready to take the DNA Subway down the Add Sequences branch line. Skip
the first two stops and click on “Reference Data”.
18. Click on Common plants, then Add ref data, and close box.
19. Now you are ready to take the DNA Subway down the Analyze Sequences branch line.
Click on Select Data. Click Select all to add the User data and the Reference data set
(Common plants) to be used in the analysis. Click on Save Selections, which will close
the box. The Common plants are a wide variety of angiosperm (covered seeds) and
gymnosperm (naked seeds: ginkgo, pine) plants, plus fern (a seedless plant). The angiosperms contain
representatives from both monocotyledons (asparagus, wheat, corn and rice are
examples) and dicotyledons (magnolia, sunflower and broccoli are examples). If one of
your sequences was low quality and NOT used in pair building, then be sure to choose
the high quality sequence in the Select data window.
20. Click on MUSCLE to align the sequences. The light will flash yellow and red while the
program is running and then change to solid green with a white V when it is ready to be
viewed. Click on MUSCLE to open Alignment Viewer box. Your lab group’s sequence
will show up on the alignment with vertical bars that show nucleotide differences. More
closely related species will be near one another in the MUSCLE alignment. Close
Alignment Viewer box, and click on PHYLIP NJ.
21. View the output of PHYLIP NJ by clicking on the icon when a white V is visible on a green
background. A phylogenetic tree built by Neighbor Joining (NJ) will be visible.
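Two of the ideas in the steps above lend themselves to a short Python sketch: the Phred
quality score from step 10 and the reverse complement from step 13. A Phred score Q
corresponds to an error probability of 10^(-Q/10), which reproduces the 1 in 100 figure for
Q = 20. The function names are made up for this illustration, and the example read is a
hypothetical sequence, not real data from this lab.

```python
def phred_error_probability(q: float) -> float:
    """Probability that a base call is wrong, given its Phred score Q."""
    return 10 ** (-q / 10)


# N (base could not be called) stays N on the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for q in (10, 20, 30, 40):
    print(f"Phred {q}: error probability {phred_error_probability(q):.4f}")

reverse_read = "NNGGTCATCA"  # hypothetical R read, 5' to 3'
print("R read as sequenced:  ", reverse_read)
print("Reverse complement:   ", reverse_complement(reverse_read))
```

Once the R read has been reverse complemented, it describes the same strand as the F read,
so the two can be aligned position by position.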
Questions: Print and attach your final NJ tree and answer the following questions:
Technical Questions
1. During the genomic DNA isolation, Protein Precipitation Solution was added to the
tube, and then the tube was centrifuged and the supernatant was transferred to a
fresh tube. Explain what was happening to proteins during these steps and why
these steps are needed for purifying genomic DNA.
2. Which primers were used for PCR? Why did these primers work for PCR
amplification in all the plant species located outside of MLSC?
3. Why is ethidium bromide added to the agarose gel solution? While the gel was
running, was the DNA visible? What must be done to visualize the DNA
fragments in the gel?
4. What does an N in a DNA sequence signify?
5. Why did the reverse complement of the R sequence need to be used in the analysis?
Draw out a DNA sequence with 5’ and 3’ ends to answer this question.
Phylogenetic Tree Questions
1. Look at the tree as a whole. Do the gymnosperms, monocotyledons and
dicotyledons form different clades? Show on your tree these three groups (see point
19 above). In some cases, one representative from one group may be in an odd
place, likely due to the low number of sequences in this analysis. Why would it help to
have more species in the analysis, for instance many more monocotyledons?
2. The numbers on the NJ tree estimate the confidence that the relationships are correct.
100 means that if the analysis were done 100 times, the separation adjacent to the
number would always be made. If the value is 45, this means the separation would
be observed 45/100 times. Values must be greater than 50 to be considered high
confidence. Circle the numbers that separate gymnosperms from angiosperms and
monocotyledons from dicotyledons. Are these values above 50?
3. Look for the closest relatives to your DNA sequence. These may be in the same
family as your sequence or could be in a closely-related family. Using the common
name of the species that were used for DNA isolation and sequencing, search for the
correct family using the internet. The Common plants reference list only has a
subset of families, so you may not have a representative from your family in this
phylogenetic tree. Report which family is closest and the correct family for your
species. If you were studying this family in more detail, what types of reference
barcodes would provide a more meaningful tree?
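The idea of repeating an analysis 100 times, described in Phylogenetic Tree Question 2, can be
sketched in Python under the assumption that the numbers on the tree are bootstrap support
values. In a bootstrap, the columns of the alignment are resampled with replacement and the
grouping is rechecked on each resampled alignment. The sketch below is a simplified stand-in:
rather than rebuilding a full Neighbor Joining tree, it only checks whether two sequences are
each other's nearest neighbor. The plant labels are hypothetical, and the four sequences are
borrowed from Figure 2.

```python
import random


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def nearest_neighbor(name: str, alignment: dict) -> str:
    """Name of the sequence with the fewest differences from `name`.

    Ties go to whichever sequence comes first in the dictionary.
    """
    others = [other for other in alignment if other != name]
    return min(others, key=lambda other: count_differences(alignment[name], alignment[other]))


def bootstrap_support(alignment: dict, a: str, b: str, replicates: int = 100, seed: int = 0) -> float:
    """Percentage of resampled alignments in which a and b are mutual nearest neighbors."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    length = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        columns = [rng.randrange(length) for _ in range(length)]
        resampled = {
            name: "".join(seq[i] for i in columns) for name, seq in alignment.items()
        }
        if nearest_neighbor(a, resampled) == b and nearest_neighbor(b, resampled) == a:
            hits += 1
    return 100 * hits / replicates


# Hypothetical labels; sequences borrowed from Figure 2.
toy_alignment = {
    "plant_a": "ACGTACCAAA",
    "plant_b": "ACGTCCCAAA",
    "plant_c": "ACGTGCTAGA",
    "plant_d": "ATGTGCCAGA",
}
support = bootstrap_support(toy_alignment, "plant_a", "plant_b")
print(f"Support for grouping plant_a with plant_b: {support:.0f} out of 100")
```

With sequences this short, each resampled alignment contains very few informative columns,
which is one reason longer barcodes and more reference species tend to give better supported
trees.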