DNA Extraction Lab report - EdSpace

Jones 1
THE STUDY OF THE EVOLUTION OF MODERN HUMANS THROUGH THE
ISOLATION, AMPLIFICATION, AND EXAMINATION OF THE D-LOOP SEQUENCE
Debra Jones
Abstract:
During this experiment, the genomic DNA of an entire class population of 54 students
was isolated, and the D-loop sequence of the mitochondrial DNA was amplified by PCR and
separated by gel electrophoresis. These DNA sequences were then compared to sequences
from a chimpanzee, a Neanderthal, other humans in the class population, and humans from
around the world. The global sequences were obtained from the NCBI GenBank database using
the BLAST search tool. The average proportional divergence between humans and Neanderthals
was found to be 0.058508182, and the divergence time was calculated to be 761,811.2802 years.
The average proportional divergence between chimpanzees and humans was found to be 0.16562;
the divergence time for this pair was given as 5,000,000 years. The average proportional
divergence among modern humans (the class population) was found to be 0.015242915, and the
last common ancestor of modern humans was calculated to have lived 424,161.7184 years ago.
This divergence time matches the displacement model of human evolution.
Introduction:
Mitochondrial DNA (mtDNA) is a circular genome located in the mitochondria. It was chosen
for this experiment because it can be amplified quickly and to high levels. Hundreds of
mitochondria can be found in each cell of the body, and each mitochondrion has several copies
of its own genetic material (Genetics Home Reference). This abundance of mitochondrial DNA
allows it to be easily isolated from small samples of cells.
The section of DNA that was sequenced was the D-loop region. The D-loop region is a
non-coding region of mtDNA. It is approximately 1,200 nucleotides in length on each side of
the initial position of the mitochondrial genome. During replication, this segment of DNA
forms a loop that can be easily isolated, which is one reason the D-loop sequence was chosen.
The D-loop sequence also mutates frequently and is highly variable. The mutations that occur
in the D-loop DNA are inherited by an organism’s offspring, so they can be traced back
thousands of years (Olivo et al, 1983). From this, evolutionary patterns can be followed, and
a common ancestor of all of the observed organisms can be identified.
During this experiment, a 400 nucleotide sequence of mitochondrial DNA was amplified
from DNA extracted from my own cheek cells and from the cheek cells of the entire class. The
target sequence was amplified using PCR in a thermocycler. PCR allows for high levels of
amplification in a very short amount of time. The DNA obtained from the cheek cells was used
as the template. Primers annealed to the target sequence, and DNA polymerase added
nucleotides complementary to the template strand, starting at the 3’ end of each primer (Shi
and Chiang, 2005). This creates a complementary strand that is identical to the other strand
of the original double-stranded template. Through repeated cycles of this process, the desired
region of DNA was greatly amplified.
These amplified segments of DNA were then run through gel electrophoresis, which
separated the DNA strands by size, and the DNA was sequenced. The results were compared to the
DNA sequences of the rest of the class. Most members of the class population had different
mtDNA sequences. These differences are called single nucleotide polymorphisms, or SNPs. An SNP
is a type of variation in DNA where a single nucleotide differs between organisms (Altshuler
et al, 2000). These SNPs were used to analyze the differences between human DNA from
both the class population and the global population. The DNA sequences from the global
population were obtained from the GenBank database of the National Center for Biotechnology
Information (NCBI). The BLAST function was used to find DNA sequences that were
similar to mine. BLAST compared my DNA sequence to all of those stored in the
database and provided a list of the top 100 DNA sequences that were most similar to the
query sequence. Each hit included a summary giving the size of the matching region, the Max
score, the E value, and the % identity. The differences
between the sequences of DNA allow patterns of evolution to be observed.
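As a minimal sketch of how SNPs translate into a proportional divergence, the Python below counts the positions at which two aligned sequences differ. The function names and the two short example sequences are hypothetical and are not data from this experiment. It also assumes the sequences are already aligned, have equal length, and contain no gaps.

```python
def snp_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return the 1-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


def proportional_divergence(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ between the two sequences."""
    return len(snp_positions(seq_a, seq_b)) / len(seq_a)


# Hypothetical 10-base sequences, for illustration only.
example_a = "TTAACTCCAC"
example_b = "TTAGCTCCAT"
print(snp_positions(example_a, example_b))
print(proportional_divergence(example_a, example_b))
```

Two identical sequences give an empty list of SNP positions and a proportional divergence of zero.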
The DNA isolated from my lab partner and from me was also compared to determine the
proportional divergence, the average number of substitutions, and the divergence time between
us. She and I are twins, so we hypothesized that we would have zero divergence, zero
nucleotide substitutions, and a divergence time of zero.
The mutations that were found within the D-loop DNA can be traced back to a common
ancestor. Our DNA differs from that of our common ancestor because, once a lineage
splits off from that common ancestor, it begins to accumulate mutations. It is from this
idea that the molecular clock arose. The molecular clock theory claims that there is a linear
relationship between the number of mutations a lineage has accumulated and the time since
it diverged from its common ancestor. This theory can be used to compare two groups of
organisms, provided that the number of mutations is proportional to the time since the two
groups shared a common ancestor (Gojobori et al, 1990).
From the divergence time of two organisms since their last common ancestor, a model of
how they evolved can be hypothesized. One model, the multiregional model, states that several
different global archaic populations contributed to the evolution of modern humans. The
divergence time for this model is 1.5 million to 3 million years. The displacement model’s
divergence time is much shorter, at 200,000-500,000 years. This model states that a single
population located in Africa evolved to form modern humans.
Materials and Methods:
DNA Extraction:
To collect the cells that were needed for PCR amplification, 10 mL of 0.9% NaCl saline
solution was swished around in the mouth for thirty seconds. This was done to collect cells
that detach from the walls of the cheek. 1.5 mL of this cell suspension was transferred to
a 1.5 mL tube. The sample was then placed in a microcentrifuge and spun for
five minutes at maximum speed. Most of the supernatant was removed, and the remaining pellet
that contained the cells was resuspended. This resuspended cell solution was then transferred
to a microcentrifuge tube that contained 100 µL of Chelex solution. The Chelex binds metal
ions that would otherwise promote DNA degradation.
This solution was heated in a thermocycler for ten minutes at 100°C. The heating
denatured the DNA and caused the double helix to unwind, thus creating single-stranded DNA.
After the denaturing step, the sample was spun in a microcentrifuge for a minute. This
left the DNA in the supernatant and caused the unwanted cell parts and Chelex
beads to pellet at the bottom of the tube. 50 µL of the DNA-containing supernatant was
transferred to a fresh 1.5 mL tube, and the DNA was stored in a freezer for one week.
DNA Amplification by PCR:
22.5 µL of a mixture containing the primers and ddH2O was added to a tube that
contained a ready-to-go PCR bead. The mixture contained the forward
primer HVIF15971: 5'-TTAACTCCACCATTAGCACC-3', the reverse primer HVIR16410:
5'-GAGGATGGTGGTCAAGGGAC-3', and ddH2O, while the ready-to-go PCR bead contained
1.5 units of Taq DNA polymerase, 10 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2, and 200 µM of
each dNTP. 2.5 µL of my genomic DNA was added to this mixture. The whole solution was
mixed, then run on a thermal cycler set to the following program: (1) initial
denaturation for 2 minutes at 94°C, (2) denaturing for 30 seconds at 94°C, (3) annealing for
30 seconds at 58°C, and (4) extending for 30 seconds at 72°C. Steps 2-4 were run for 30 cycles,
followed by a final extension for 6 minutes at 72°C and an indefinite hold at 4°C.
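The thermal cycler program above can be written down as data, which makes it easy to check the total programmed time. The sketch below is illustrative only: the class and function names are hypothetical, it assumes steps 2-4 run exactly 30 times, and it ignores the time the instrument spends ramping between temperatures as well as the 4°C hold. The fold-amplification figure is the theoretical ceiling of one doubling per cycle, not a measured yield.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


INITIAL_DENATURE = Step("initial denature", 94, 120)
CYCLE = (
    Step("denature", 94, 30),
    Step("anneal", 58, 30),
    Step("extend", 72, 30),
)
FINAL_EXTENSION = Step("final extension", 72, 360)
N_CYCLES = 30  # assumption: steps 2-4 are run 30 times in total


def programmed_seconds() -> int:
    """Total hold time, excluding temperature ramps and the final 4°C hold."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURE.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


def max_fold_amplification(cycles: int = N_CYCLES) -> int:
    """Theoretical upper bound: the target doubles once per cycle."""
    return 2 ** cycles


print(programmed_seconds() / 60, "minutes")
print(max_fold_amplification(), "fold (theoretical maximum)")
```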
The primers are short single-stranded nucleic acids that are added to the DNA in order to
create a starting point for DNA synthesis. The Taq DNA polymerase binds at the 3’ end of each
of the two primers and adds bases to build a strand complementary to each single strand. All
other portions of the PCR mixture aid in this process, as do the steps of the thermal cycler.
The denaturing steps separate the double helix to create single strands. In the annealing step,
the DNA primers bind to the single strands of DNA. In the extending step, starting at the 3’
end of each primer, the Taq polymerase adds bases complementary to the single strands of DNA.
Repeating this process allows for efficient amplification of the section of DNA that is being
studied.
Agarose Gel Electrophoresis:
A 1.0% agarose gel was prepared by American University graduate students. 25 µL of
the PCR-amplified DNA was loaded into a single well of the agarose gel. A 1 Kb Plus DNA Ladder
was loaded into another well as a size marker. The gel was run for an hour to allow the DNA to
separate by size.
Sequencing of the Gel-Purified D-loop PCR Product and Electrophoresis:
3.4 µL of the column-purified PCR template was added to the DNA sequencing master
mix that contained dNTPs, sequencing primers, ddH2O, DNA polymerase, and sequencing
reaction buffer. Next, 1.8 µL of this mixture was added to each of four ddNTP tubes (one
contained ddATP, another ddCTP, another ddGTP, and finally ddTTP). Each ddNTP lacks the 3’ –OH
group which is necessary for elongation. Therefore, the ddNTP acts as a chain terminator. Each
ddNTP pairs with its complementary base on the template DNA strand, and because termination
can occur at any position where that base is needed, copied fragments of different lengths
are created.
After the master mix, template, and ddNTP had been combined, the mixture was kept
on ice until it was placed in the thermocycler. This was done to prevent degradation of the
enzymes. The mixture was put through 30 cycles in the thermocycler, as follows: (1)
2 minutes at 92°C, (2) 30 seconds at 92°C, (3) 30 seconds at 53°C, and (4) 1 minute at 70°C.
Steps two through four were repeated an additional 29 times. After the 30 cycles were complete,
the reaction was held at 4°C. This process allowed the ddNTPs to be incorporated into the
growing strands, creating many chain-terminated fragments of different lengths.
After the DNA fragments were generated, they were first run through a polyacrylamide
gel, then they were run on an automated sequencer. Running the fragments through the
polyacrylamide gel allowed the different fragments to separate by length. The longest ones
stayed towards the top, while the shortest ones travelled farthest through the gel towards the
bottom. By reading the gel from the bottom to the top, the sequence of the fragment of DNA
being studied could be determined. However, this method is very time consuming, so the
automated sequencer was used in order to speed up the process.
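Reading a gel from bottom to top amounts to sorting the bands by fragment length and noting which ddNTP lane each band sits in. The sketch below illustrates that idea under simplifying assumptions: the function name and the band lengths are hypothetical, lengths are counted from the end of the sequencing primer, and every position produces exactly one band. The string it returns is the newly synthesized strand, which is complementary to the template.

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read a sequence from the fragment lengths seen in the four ddNTP lanes.

    `lanes` maps each terminating base (A, C, G, T) to the lengths of the
    fragments in that lane. The shortest fragments travel farthest, so sorting
    by length is equivalent to reading the gel from bottom to top.
    """
    bands = [
        (length, base)
        for base, lane_lengths in lanes.items()
        for length in lane_lengths
    ]
    bands.sort()
    seen = [length for length, _ in bands]
    if len(set(seen)) != len(seen):
        raise ValueError("two bands were reported at the same length")
    return "".join(base for _, base in bands)


# Hypothetical band lengths, for illustration only.
example_lanes = {"A": [1, 4], "C": [2], "G": [3, 6], "T": [5]}
print(read_gel(example_lanes))
```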
Nucleotide Sequence searches in the NCBI Database:
For this process, the National Center for Biotechnology Information’s sequence database,
GenBank, was searched with the Basic Local Alignment Search Tool (BLAST). My D-loop
nucleotide sequence, the query sequence, was entered into BLAST to be compared to other
sequences that were stored within the database. The database that was used was the
Nucleotide collection (nr/nt), the organism was Homo sapiens, and the optimization was for
somewhat similar sequences (blastn). Once all of the settings were entered, BLAST returned
similar or identical sequences that matched my query sequence. Each matched sequence included
an accession number that linked the sequence to the database, a Max score (S), and an E value.
A higher Max score indicated a more significant match, while an E value that was
closer to 0 indicated a more significant match.
For this part of the experiment, the two sequences that were the most similar to my query
sequence were examined, and the accession number, the max score, the identity, and the
associated nucleotide positions were recorded.
Calculating Divergence:
The average proportional divergence between humans and chimpanzees and between
humans and Neanderthals was calculated in Excel by averaging the proportional divergence of
the human sequences from the chimpanzee sequence and from the Neanderthal sequence,
respectively. From the average proportional divergence (Pd), the number of
substitutions per site (Kn) was calculated using the following
equation: Kn = -ln(1 - Pd). The average proportional divergence among all of the humans
in the class was then found by averaging all of the divergences within the class, and the
same equation was used to find the average number of substitutions per site.
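The conversion from Pd to Kn can be checked directly. The short Python sketch below applies Kn = -ln(1 - Pd) to the three average proportional divergences reported in this lab; the function name is hypothetical, and the inputs are the values reported in the Results.

```python
import math


def substitutions_per_site(pd: float) -> float:
    """Kn = -ln(1 - Pd), where Pd is the proportional divergence (0 <= Pd < 1)."""
    if not 0 <= pd < 1:
        raise ValueError("Pd must be at least 0 and less than 1")
    return -math.log1p(-pd)  # log1p(-pd) is ln(1 - pd), accurate for small pd


for label, pd in [
    ("Human v Chimp", 0.16562),
    ("Human v Neanderthal", 0.058508182),
    ("Human v Human", 0.015242915),
]:
    print(f"{label}: Kn = {substitutions_per_site(pd):.9f}")
```

Because -ln(1 - Pd) is always at least as large as Pd, the correction matters most for the human versus chimpanzee comparison, where Pd is largest.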
Next, the rate of nucleotide substitution was calculated for humans and chimpanzees, humans
and Neanderthals, and my lab partner and myself. The human and chimpanzee
nucleotide substitution rate was found by dividing the Kn of humans and chimpanzees by the
human-chimpanzee divergence time of 5,000,000 years. For the human and Neanderthal
nucleotide substitution rate, their Kn value was divided by the modern human
divergence time of 200,000 years. For my lab partner and
myself, the rate of substitution previously calculated for Neanderthals and modern
humans was used.
From these calculated rates of substitution, the divergence times were calculated. For
chimpanzees and humans, this was done by dividing the Kn of humans by
the rate of nucleotide substitution for humans and chimpanzees. The divergence time between
Neanderthals and humans was calculated by dividing the Kn of humans and Neanderthals by the
rate of substitution between humans and Neanderthals. The divergence time between my lab
partner and myself was calculated in the same way, by
dividing our Kn value by the rate of substitution between Neanderthals and
humans.
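The two molecular clock steps, calibrating a rate from a known divergence time and then using that rate to date another split, can be sketched as follows. The function names are hypothetical. Pairing the class Kn with the human-chimpanzee rate is an assumption made here because it reproduces the class divergence time reported in Table 3 to within a year; the text above does not state that pairing explicitly.

```python
def substitution_rate(kn: float, years: float) -> float:
    """Substitutions per site per year, given Kn and a known divergence time."""
    return kn / years


def divergence_time(kn: float, rate: float) -> float:
    """Years since the common ancestor, given Kn and a calibrated rate."""
    return kn / rate


KN_HUMAN_CHIMP = 0.181066345   # reported in Table 2
KN_CLASS = 0.015360282         # reported in Table 2
CALIBRATION_YEARS = 5_000_000  # human-chimpanzee divergence time given for the lab

# Assumption: the class Kn is dated with the human-chimpanzee rate.
rate = substitution_rate(KN_HUMAN_CHIMP, CALIBRATION_YEARS)
class_years = divergence_time(KN_CLASS, rate)

print(f"rate = {rate:.9e} substitutions per site per year")
print(f"class divergence time = {class_years:,.0f} years")
```

A Kn of zero, as for two identical sequences, gives a divergence time of zero regardless of the rate used.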
Results:
My D-loop DNA was successfully sequenced, and the resulting 400 base pair sequence
was compared to other sequences that were stored in the GenBank database through BLAST.
Ten separate sequences of DNA were found to be 100% identical to my DNA in the observed
region. The accession numbers for the ten matching sequences can be found in Table
1. For each of the ten sequences, the Max score was 722, and the E value was zero.
Each of the ten sequences from the database matched my DNA at 400/400
base pairs. For the first match (KC878724.1), the nucleotide positioning was 15992-16392. The
remaining nine matches all had a nucleotide positioning of 15991-16390. All of the matches and
my DNA sequence were of the K1a1b1a haplogroup, which is commonly found in Ashkenazi
Jews (Table 1).
Hit   Accession number  Max score  E value  % identity  Nucleotide positioning            Haplogroup  Match
1st   KC878724.1        722        0        100%        Query: 1-400  Subj: 15992-16392   K1a1b1a     400/400
2nd   KC914580.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
3rd   JQ706006.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
4th   JQ705745.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
5th   JQ705628.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
6th   JQ705204.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
7th   JQ704654.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
8th   JQ703855.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
9th   JQ703485.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
10th  JQ703069.1        722        0        100%        Query: 1-400  Subj: 15991-16390   K1a1b1a     400/400
Table 1 This table includes the data collected from the GenBank
database. All of the data was collected using BLAST. The data
covers the first ten hits on BLAST; for each sequence, the accession number, max
score, E value, % identity, nucleotide positioning, haplogroup, and
match number were recorded.
My D-loop sequence was compared not only with this database, but also with the sequences of
the members of the class, and with two different species: chimpanzees and Neanderthals. The
proportional divergence, number of nucleotide substitutions, rate of nucleotide substitutions, and
divergence time were calculated. The greatest divergence was observed between the human and
chimpanzee DNA. The proportional divergence was calculated to be 0.16562, the number of
substitutions was 0.181066345 per site, and the divergence time was given as
5,000,000 years (Tables 2 and 3).
The second greatest divergence was seen between humans and Neanderthals. The
proportional divergence was 0.058508182, and the number of nucleotide substitutions was
0.060289621 substitutions per site (Table 2). From the rate of substitution of humans and
Neanderthals (7.6801412x10^-8 substitutions per site per year), the divergence time was found
to be 761,811.2802 years (Tables 2 and 3). The divergence observed within the
classroom population was closer to the human versus Neanderthal divergence than to the
human versus chimpanzee divergence. The proportional divergence of the classroom population
was 0.015242915, and the number of nucleotide substitutions was 0.015360282 substitutions per
site (Table 2). The divergence time, that is, the time since the class population
last shared a common ancestor, was 424,161.7184 years (Table 3).
The divergence between my DNA and that of my lab partner, Julia, was also calculated.
It should be noted that Julia and I are twin sisters. Therefore, the proportional divergence was
zero and the number of substitutions was zero substitutions per site (Table 2). The divergence
time was therefore calculated to be zero years (Table 3). Our D-loop sequences were completely
identical.
                     Average proportional  Average number of       Rate of nucleotide substitution
                     divergence            substitutions per site  (substitutions per site per year)
Human v Chimp        0.16562               0.181066345             3.621326898x10^-8
Human v Neanderthal  0.058508182           0.060289621             7.6801412x10^-8
Human v Human        0.015242915           0.015360282             3.621326898x10^-8
Julia v Me           0                     0                       7.6801412x10^-8
Table 2 The calculated values of average proportional divergence, average number of
nucleotide substitutions per site, and the rate of nucleotide substitutions for human DNA
versus chimpanzee DNA, human DNA versus Neanderthal DNA, the DNA of the
classroom population, and Julia’s DNA versus my DNA.
                         Human v Chimp  Human v Neanderthal  Human v Human  Julia v Me
Divergence time (years)  5000000        761811.2802          424161.7184    0
Table 3 The divergence time, in years, was calculated for humans versus chimpanzees,
humans versus Neanderthals, the humans in the classroom population, and Julia versus
myself.
Discussion:
From the data collected, several patterns of evolution can be observed. The easiest pattern
to observe is the divergence time between the observed organisms. The divergence
time is the point in the past where the organisms had a common ancestor. The longest divergence
time that was observed was between chimpanzees and humans. Chimpanzees and humans last
shared a common ancestor 5,000,000 years ago. The second longest divergence time was
between Neanderthals and humans. Their last common ancestor was alive 761,811.2802 years
ago. This is the point in time at which the lineages leading to modern humans and to
Neanderthals split. The common ancestor of all modern humans was alive 424,161.7184 years ago.
This is the shortest divergence time. The data indicate that the closest relative to humans is
the Neanderthal, followed by the chimpanzee. In fact, other studies have shown that the
Neanderthal is the closest known relative to modern humans (Noonan, 2010).
Population geneticists have calculated that the last common ancestor of modern humans lived
about 200,000 years ago. The divergence time calculated from the data is about twice as long.
This could be due to an overrepresentation or underrepresentation of certain populations
and DNA types. For example, the chimpanzee and Neanderthal data each come from a single
sample of DNA, while the human DNA came from the entire class population of 54
humans. This could cause some fluctuations in the calculations and averages.
The theory that the Neanderthal is the closest known relative to modern humans is again
supported by the average proportional divergence and the average number of substitutions per
site. The average proportional divergence between Neanderthals and humans is 0.058508182,
with 0.0602896213 substitutions per site. Between humans and chimpanzees, the
average proportional divergence was much larger, at 0.16562, with
0.1810663449 substitutions per site. The class population had an average proportional
divergence of 0.015242915 and an average of 0.0153602824 substitutions per site. The
Neanderthal values lie much closer to these class values than the chimpanzee values do; that
is, the Neanderthal sequence is much closer to that of the modern human.
Based on the divergence time that was calculated from the average substitution rate, the
model of human evolution was determined to be the displacement model. The displacement
model of evolution, also known as the single origin model of evolution, suggests that Homo
sapiens came from a single starting population in Africa about 200,000-500,000 years ago
(Hansen et al, 2000). This model is the best fit, because the calculated divergence time falls
within the estimated period in which the single starting human population arose. The data do
not support the multiregional model, because the estimated time of evolution for that model is
much longer.
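The model comparison above is a range check, sketched below in Python. The function and constant names are hypothetical, and the two windows are the ones stated in this report rather than independently sourced values.

```python
DISPLACEMENT_RANGE = (200_000, 500_000)       # years, as stated in this report
MULTIREGIONAL_RANGE = (1_500_000, 3_000_000)  # years, as stated in this report


def matching_model(divergence_years: float) -> str:
    """Name the model whose expected divergence window contains the estimate."""
    low, high = DISPLACEMENT_RANGE
    if low <= divergence_years <= high:
        return "displacement"
    low, high = MULTIREGIONAL_RANGE
    if low <= divergence_years <= high:
        return "multiregional"
    return "neither"


print(matching_model(424_161.7184))
```

An estimate that falls between the two windows, such as 1,000,000 years, would match neither model under this simple check.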
Similar calculations were done to determine the point of divergence between my lab
partner and me. As previously mentioned, my lab partner and I are twins. From this, we had
hypothesized that we would have an identical D-loop sequence. This hypothesis was supported
by the data obtained. The DNA sequences that were isolated were identical. This means that the
proportional divergence and the number of substitutions between us are zero. Therefore, the
divergence time is also zero.
Both my partner and I used BLAST to compare our D-loop sequence with those that were
stored in the GenBank database. We both had the same ten matches. All of them had a max score
of 722, an E value of 0, and a % identity of 100%. The high max score and the low E value
indicate a significant match. For all ten observed sequences, all 400 of the bases were identical.
This sequence belongs to the K1a1b1a haplogroup. This haplogroup is found in Ashkenazi Jews and
Europeans. 10% of Europeans and 45% of Ashkenazi Jews fit into this haplogroup. Also, about
19% of Ashkenazi Jews that fit into the K1a1b1a haplogroup are of Polish descent (Grzybowski
et al, 2007). This haplogroup matches both my lab partner’s and my heritage. We are both
Ashkenazi, and are distantly of Polish descent on our maternal grandmother’s side. The data that
was obtained from the comparison of the D-loop sequence matches the personal data of my lab
partner and myself.
References:
Altshuler D, Pollara VJ, Cowles CR, Van Etten WJ, Baldwin J, Linton L, Lander ES. An SNP
map of the human genome generated by reduced representation shotgun sequencing. Nature 407:
513-516, 2000.
Gojobori T, Moriyama EN, Kimura M. Molecular clock of viral evolution, and the neutral
theory. Proceedings of the National Academy of Sciences 87(24): 10015-10018, 1990.
Grzybowski T, Malyarchuk BA, Derenko MV, Perkova MA, Bednarek J, Wozniak M. Complex
interactions of the Eastern and Western Slavic populations with other European groups as
revealed by mitochondrial DNA analysis. Forensic Science International: Genetics 1(2):
141-147, 2007.
Hansen TF, Armbruster WS, Antonsen L. Comparative Analysis of Character Displacement and
Spatial Adaptations as Illustrated by the Evolution of Dalechampia Blossoms. The American
Naturalist 156(S4): S17-S34, 2000.
Noonan JP. Neanderthal genomics and the evolution of modern humans. Genome Research
20: 547-553, 2010.
Olivo PD, Van de Walle MJ, Laipis PJ, Hauswirth WW. Nucleotide sequence evidence for rapid
genotypic shifts in the bovine mitochondrial DNA D-loop. Nature 306: 400-402, 1983.
Shi R, Chiang VL. Facile means for quantifying microRNA expression by real-time PCR.
BioTechniques 39: 519-525, 2005.
"Mitochondrial DNA." - Genetics Home Reference. N.p., n.d. Web. 01 Apr. 2014