T - Crime Scene

DNA Profiling using Short
Tandem Repeats
This slide show includes information from the following website:
http://www.cstl.nist.gov/biotech/strbase/
WWW.CRIMESCENE.COM thanks John M. Butler, Ph.D at the NIST Biotechnology
Division for his help and permission to include information and graphics in this
presentation.
DNA Profiling using STRs:
An Overview
STR analysis is a DNA profiling technique that uses PCR (polymerase chain reaction) to
copy specific regions of a DNA sample and then measures the sizes of the copies (the
size of the DNA piece being the basis of comparison between samples).
STRs are Short Tandem Repeats of patterns of nucleotides spread throughout our DNA
AATG AATG AATG AATG AATG AATG AATG
DNA molecule
7 short, tandem (back to back) repeats of the nucleotide sequence AATG
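A minimal Python sketch of what "counting repeats" means for the example above. The function name and the flanking bases around the repeat are made up for illustration; in the lab, repeats are not counted by reading the sequence this way but are inferred from fragment size.

```python
import re

def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# The slide's example: seven AATG units, placed between hypothetical flanking bases.
example = "GGCT" + "AATG" * 7 + "CCTA"
print(longest_tandem_run(example, "AATG"))  # 7
```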
•The number of repeats at a given region (locus, plural = loci) of DNA is highly
variable from person to person, which allows STRs to be used in human identity testing.
•The repeated unit can be 9 to 80 nucleotides long (called variable number tandem
repeats, VNTRs, or minisatellites) or 2 to 5 nucleotides long (called
microsatellites, SHORT tandem repeats, or STRs).
•Many loci along our DNA have been identified as possessing STRs (thanks in part to
the Human Genome Project), and the DNA profiling community has selected 13 of them
for identity analysis.
•These 13 loci ALL contain 4-nucleotide (tetrameric) repeats.
•Through population studies, the numbers and types (nucleotides involved) of
repeats at these loci have been analyzed, yielding probability estimates for particular
ethnic groups.
13 STR Loci for DNA Profiling
•13 STR loci have been officially chosen to be used in the Combined
DNA Index System (CODIS) and are scattered among our 23 chromosome
pairs
•CSF1PO, FGA, TH01, TPOX, vWA, D3S1358, D5S818, D7S820,
D8S1179, D13S317, D16S539, D18S51, and D21S11
•While not an STR, the AMEL (amelogenin) locus on the sex chromosomes (X and Y, the
23rd pair) is included for gender determination
[Figure: positions of the 13 CODIS STR loci (TPOX, D3S1358, D8S1179, TH01, D5S818, vWA, FGA, CSF1PO, D7S820, D13S317, D16S539, D18S51, D21S11) on the chromosomes; AMEL is labeled twice.]
DNA Profiling using STR:
An Overview
•Because we have 23 pairs of chromosomes (23 chromosomes from our mother, 23 from
our father), each locus is present twice; each of the 13 CODIS loci exists on one
chromosome from our mother and one from our father.
•Therefore, when analyzing the number of repeats at a given locus, we generally get 1
answer (the number of repeats is the same on both chromosomes) or 2 answers (the number
of repeats is different). These are known as alleles. We get one allele from each parent.
•If the number of repeats is the same, we are “homozygous” at that locus. If the number
of repeats is different, we are “heterozygous” at that locus. Any combination of homozygosity
and heterozygosity can be present across the 13 loci.
•Because each parent contributes one allele at every locus, STRs can also be used for
maternity and paternity testing.
DNA molecule from mother:
AATG AATG AATG AATG AATG AATG AATG
7 short, tandem repeats of the nucleotide sequence AATG
DNA molecule from father:
AATG AATG AATG AATG AATG AATG AATG AATG
8 short, tandem repeats of the nucleotide sequence AATG
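The ideas on this slide can be sketched in a few lines of Python. The function names are hypothetical, and the parentage check is deliberately simplified: it ignores mutation and looks at a single locus, whereas real testing combines many loci.

```python
from typing import Tuple

def describe_locus(maternal_repeats: int, paternal_repeats: int) -> Tuple[str, str]:
    """Return the genotype label (e.g. '7,8') and whether it is homo- or heterozygous."""
    low, high = sorted((maternal_repeats, paternal_repeats))
    zygosity = "homozygous" if low == high else "heterozygous"
    return f"{low},{high}", zygosity

def consistent_with_parents(child, mother, father) -> bool:
    """True if one child allele could come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)

print(describe_locus(7, 8))  # ('7,8', 'heterozygous')
print(describe_locus(7, 7))  # ('7,7', 'homozygous')
print(consistent_with_parents(child=(7, 8), mother=(7, 9), father=(8, 8)))  # True
```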
Advantages of STR Analysis
•The allelic variation (number of repeats) of STRs is more easily discerned than with other
techniques; a difference of just one repeat (4 nucleotides) can be seen with current
methods.
•The number of repeats at an STR locus is discrete: studies to date have found a limited
set of possible alleles at each locus, which facilitates interlaboratory comparisons.
–E.g., at locus TH01, which is found on chromosome 11, studies have shown that the tetramer
AATG repeats anywhere from 3 to 14 times. If one DNA molecule (from your mother) contains
3 repeats and the other (from your father) contains 5 repeats, the profile is dubbed “3,5”. This
profile is then used for comparison to other DNA samples.
•Because PCR (polymerase chain reaction) is used to amplify the STR loci, only very small
quantities of DNA are needed (e.g., a blood droplet the size of the head of a pin).
•Because STR loci are relatively small, the odds are higher that a locus will be
completely intact, and therefore available for analysis, in a degraded DNA sample (DNA
that has already been partly broken down after being deposited in an environment where
decomposition can occur).
•Technology allows complete analysis in a matter of hours.
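To make the "one repeat = 4 nucleotides" point concrete, here is a small Python sketch that converts repeat counts into PCR fragment lengths. The flanking length of 150 bases is a made-up number for illustration only; real values depend on the primers used for each locus.

```python
REPEAT_UNIT_LENGTH = 4  # tetrameric repeats, as at the CODIS loci

def amplicon_length(repeat_count: int, flanking_bases: int) -> int:
    """Fragment length = bases outside the repeat + 4 bases per repeat."""
    return flanking_bases + repeat_count * REPEAT_UNIT_LENGTH

HYPOTHETICAL_FLANK = 150  # assumed total of non-repeat bases between the primer ends

# The "3,5" profile from the example: two fragments that differ by 2 repeats = 8 bases.
sizes = {n: amplicon_length(n, HYPOTHETICAL_FLANK) for n in (3, 5)}
print(sizes)  # {3: 162, 5: 170}
```

Because every allele differs from its neighbor by a fixed 4 bases, fragment size maps back to a whole number of repeats, which is what makes results comparable between laboratories.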
Sources of Biological Evidence
•Blood
•Semen
•Saliva
•Urine
•Hair
•Teeth
•Bone
•Tissue
DNA in the Cell
[Figure: a chromosome in the cell nucleus, unwound into a double stranded DNA molecule made of individual nucleotides (A, T, C, G), with the target region for PCR marked. Color key: orange = A, green = T, purple = C, yellow = G.]
Steps in DNA Sample Processing
Sample Obtained from Crime Scene or Paternity Investigation
Biology: DNA Extraction → DNA Quantitation → PCR Amplification of Multiple STR markers
Technology: Separation and Detection of PCR Products (STR Alleles) → Sample Genotype Determination
Genetics: Comparison of Sample Genotype to Other Sample Results → If match occurs, comparison of DNA profile to population databases → Generation of Case Report with Probability of Random Match
DNA 101
•It has been mentioned that the repeats of STRs are composed of nucleotides.
Amazingly, our genetic material (the DNA in all 23 pairs of chromosomes) is
written with only four nucleotides in a string: Adenine (A),
Thymine (T), Cytosine (C) and Guanine (G). These are the sole letters of the genetic
alphabet.
•Nucleotides are often referred to by their nitrogenous bases, or just “bases”.
•Adenine and guanine are known as the purine bases, while cytosine and
thymine are called the pyrimidine bases; adenine binds only to thymine and cytosine binds
only to guanine.
•In a DNA molecule (on just one chromosome), the structure looks like a twisted ladder, with
the rungs representing the pairs of nitrogenous bases. The length of a “double stranded”
DNA molecule is therefore measured in base pairs, or bp.
•The pattern of these letters makes up genes, which in turn act as templates for proteins, which
in turn help to construct and operate the human body. Yet there is enough variation to make
us all unique. Mind boggling.
[Figure: DNA molecule on one chromosome, drawn as paired individual nucleotides. Color key: orange = A, green = T, purple = C, yellow = G.]
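The pairing rule (A with T, C with G) is simple enough to write down directly. The sketch below uses illustrative names and builds the partner strand for a short stretch of repeats, then checks that every rung of the ladder joins one purine to one pyrimidine.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def complement(strand: str) -> str:
    """Return the base that pairs with each base of `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand)

top = "AATGAATG"
bottom = complement(top)
print(bottom)  # TTACTTAC

# Every base pair joins exactly one purine with one pyrimidine.
print(all((a in PURINES) != (b in PURINES) for a, b in zip(top, bottom)))  # True
```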
The Polymerase Chain Reaction
•The total amount of DNA in a pure sample dwarfs the amount contained
in just the 13 CODIS loci. Therefore, these regions, and only these regions, need to
be amplified for analysis, and the polymerase chain reaction (PCR) is used as a
molecular Xerox machine for just this purpose.
•PCR employs primers, which are short pieces of single stranded DNA
complementary to areas along the piece of DNA that you want to amplify (e.g.
the TH01 locus).
•Using the rule that A binds to T and C to G, primers are designed
(and subsequently synthesized in a laboratory very easily and quickly)
to bind to a region of DNA just BEFORE the STR region on one DNA strand and
just AFTER the STR region on the complementary DNA strand of the double
stranded DNA molecule on each chromosome (e.g. chromosome 11 for the TH01
locus).
Region for primer binding | STR region (3 AATG repeats, shown in red on the slide) | Region for primer binding
A G C A T A A T T C | A A T G A A T G A A T G | C G T A C C T A
T C G T A T T A A G | T T A C T T A C T T A C | G C A T G G A T
- - - - Rest of DNA on chromosome 11 from one parent
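A rough Python sketch of how the two primers follow from the pairing rule, using the upper strand from the figure. The function names and index positions are assumptions for illustration, and the primers are written in the usual 5' to 3' direction (a convention the slides do not discuss). Real primers are typically longer than the short flanks shown here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def flanking_primers(top_strand: str, str_start: int, str_end: int):
    """Forward primer copies the bases before the STR (it binds the lower strand);
    reverse primer is the reverse complement of the bases after the STR
    (it binds the upper strand)."""
    forward = top_strand[:str_start]
    reverse = reverse_complement(top_strand[str_end:])
    return forward, reverse

top = "AGCATAATTC" + "AATG" * 3 + "CGTACCTA"  # upper strand from the figure
forward, reverse = flanking_primers(top, str_start=10, str_end=22)
print(forward)  # AGCATAATTC
print(reverse)  # TAGGTACG
```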
DNA Amplification with the
Polymerase Chain Reaction (PCR)
•The DNA sample is heated so that the double stranded DNA “denatures”, or becomes
single stranded, allowing the single stranded primers to get in. The reaction is then cooled so
that the primers can bind (anneal) to their complementary regions of the sample.
•An enzyme (DNA polymerase, which joins nucleotides onto the primers) and
spare nucleotides are included in the mixture, and the enzyme extends the
primers along the regions of interest (e.g. TH01) in complementary fashion, basically
copying the TH01 region and its particular number of tetrameric repeats.
•This cycle (denaturing, annealing, extension) is repeated (in a
thermocycler) many times, amplifying the DNA locus, and ONLY that locus, many times over.
[Figure: the two single stranded (denatured) DNA templates. The forward primer (A G C A T A A T T C) is bound to the start of one strand and the reverse primer to the far end of the other; DNA polymerase extends each primer by adding spare nucleotides (A, T, C, G).]
DNA Amplification with the
Polymerase Chain Reaction (PCR)
•Because the newly synthesized fragments of DNA serve as templates in the
next round of synthesis, and each has one fixed end defined by the
start of a primer, subsequent cycles will produce an excess of
ONLY the region of interest (beginning with the start of the forward
primer to the end of the reverse primer).
•With 32 cycles, over 1 billion (2^32, about 4.3 billion per starting
molecule if every cycle doubles perfectly) molecules of DNA representing a
specific locus (and only that locus) are synthesized and now dwarf the
rest of the DNA of the sample.
[Figure: each single stranded (denatured) original template is now paired with a newly synthesized complementary strand, giving two double stranded copies.]
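The doubling arithmetic can be checked with a few lines of Python. This is an idealized sketch that assumes every cycle doubles the target perfectly; real reactions fall short of that, and the function name is illustrative.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Ideal PCR: the number of target copies doubles every cycle."""
    return starting_copies * 2 ** cycles

print(copies_after(32))  # 4294967296

# First cycle at which a single starting molecule exceeds one billion copies.
first_over_billion = next(n for n in range(1, 64) if copies_after(n) > 1_000_000_000)
print(first_over_billion)  # 30
```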
Multiplex STR Analysis
•Because the primers for each locus have been stringently designed to be specific for only
the regions before and after their locus, over 10 loci can be copied at once in one tube.
•Sensitivities to levels less than 1 billionth of a gram (1 nanogram) of DNA are possible.
•Different fluorescent dyes are used to distinguish STR loci with overlapping size ranges.
•Generally, the result (if using 13 STR loci and the sample is pure) is a mixture of as few as 13
and as many as 26 PCR products representing the 13 STR loci: 13 products if the individual is
homozygous at EVERY locus (the same number of repeats, and so the same size, on both
chromosomes), or 26 products if the individual is heterozygous at every locus (a different
number of repeats, and so a different size, on each chromosome).
•If the sample contains a mixture, more pieces will be seen.
[Figure: original DNA template and the resulting PCR products.]
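The 13-to-26 range can be sketched in Python by counting distinct alleles per locus. The profiles below are invented for illustration; only the locus names come from the CODIS list.

```python
CODIS_LOCI = [
    "CSF1PO", "FGA", "TH01", "TPOX", "vWA", "D3S1358", "D5S818",
    "D7S820", "D8S1179", "D13S317", "D16S539", "D18S51", "D21S11",
]

def count_pcr_products(profile: dict) -> int:
    """One PCR product per distinct allele at each locus (single-source sample)."""
    return sum(len(set(alleles)) for alleles in profile.values())

# Hypothetical profiles: homozygous everywhere vs. heterozygous everywhere.
all_homozygous = {locus: (8, 8) for locus in CODIS_LOCI}
all_heterozygous = {locus: (8, 9) for locus in CODIS_LOCI}
print(count_pcr_products(all_homozygous))    # 13
print(count_pcr_products(all_heterozygous))  # 26
```

A count above two distinct alleles at any one locus would be a hint that the sample is a mixture.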
Available Kits for STR Analysis
• Kits make it easy for labs to just add DNA
samples to a pre-made mix
• 13 CODIS core loci
– Profiler Plus and COfiler (PE Applied Biosystems)
– PowerPlex 1.1 and 2.1 (Promega Corporation)
• Increased power of discrimination
– CTT (1994): 1 in 410
– SGM Plus™ (1999): 1 in 3 trillion
– PowerPlex™ 16 (2000): 1 in 2 x 10^17
STR Analysis
•Once PCR has successfully been completed using any of the kits available, the products
must be analyzed
•One method used is capillary electrophoresis (CE), in which the PCR products are injected
into a thin capillary and driven through it by an electric field.
•Smaller fragments move faster, and thus reach the fluorescence detector first.
•The wavelength emitted by each fluorescent dye is different and can be monitored.
•Because it is known which fluorescent dye is used for each locus, and loci that produce
fragments of similar size are deliberately labeled with DIFFERENT dyes, each product
can be identified as it is detected.
•Standards are included that contain fragments of the known sizes produced at the various loci.
•The fluorescence detection results in peaks representing different sizes and intensities.
•The amelogenin locus, while not an STR, is included for gender determination
•A female is homozygous at the AMEL locus (X, X) and thus will display one peak
•A male is heterozygous at the AMEL locus (X, Y) and thus will display two peaks
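Two of the points above lend themselves to a short Python sketch: fragments reach the detector in order of size, and the number of distinct amelogenin peaks indicates X,X or X,Y. The fragment sizes are invented for illustration, and the function names are hypothetical.

```python
def detection_order(fragments):
    """Smaller fragments move faster, so they are detected first."""
    return sorted(fragments, key=lambda fragment: fragment[1])

def sex_from_amelogenin(peaks):
    """One peak (X only) suggests female; two peaks (X and Y) suggest male."""
    alleles = set(peaks)
    if alleles == {"X"}:
        return "female"
    if alleles == {"X", "Y"}:
        return "male"
    return "inconclusive"

# (locus, size in bp) pairs; the sizes are made up for this example.
fragments = [("FGA", 220), ("D3S1358", 120), ("vWA", 170)]
print(detection_order(fragments))
# [('D3S1358', 120), ('vWA', 170), ('FGA', 220)]
print(sex_from_amelogenin(["X", "X"]))  # female
print(sex_from_amelogenin(["X", "Y"]))  # male
```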
An Example Forensic STR Multiplex Kit
AmpFlSTR® Profiler Plus™
Kit available from PE Biosystems (Foster City, CA)
[Figure: layout of the kit's loci by color separation (dye) and size separation (100 bp to 400 bp). 5-FAM (blue) dye: D3, vWA, FGA. JOE (green) dye: A (amelogenin), D8, D21, D18. NED (yellow) dye: D5, D13, D7. ROX (red): GS500 internal lane standard.]
9 STRs amplified along with sex-typing marker amelogenin in a single PCR reaction
Human Identity Testing with Multiplex STRs
AmpFlSTR® SGM Plus™ kit
[Figure: STR profiles of two different individuals, plotted by DNA size (base pairs) from smaller to larger fragments. Loci shown: amelogenin, D3, D8, D16, D19, D2, vWA, TH01, D21, D18, FGA. One individual is male (2 amelogenin peaks) and the other female (1 amelogenin peak). Callouts mark a homozygous locus (TH01, 1 peak) and a heterozygous locus (D16, 2 peaks).]
Simultaneous Analysis of 10 STRs and Gender ID
Example of STR Allele Frequencies
[Figure: bar chart for the TH01 marker. Vertical axis: frequency (0 to 45). Horizontal axis: number of repeats (6, 7, 8, 9, 9.3, 10). Series: Caucasians (N=427), Blacks (N=414), Hispanics (N=414).]
*Proc. Int. Sym. Hum. ID (Promega) 1997, p. 34
Probability Estimates
•Referring to the previous slide, the graph represents the frequency of each number of
repeats at the TH01 locus.
•While it is known that the number of repeats (of the tetrameric sequence
AATG) varies from 3 to 14, only alleles with 6 to 10 repeats are represented here.
•Generally, if only this graph were used as the basis for probability estimates, the frequency of
each allele (repeat number) among the samples used (427 for
Caucasians, 414 each for African Americans and Hispanics) would serve as the
probability estimate of THAT allele for THAT locus in THAT specific population.
•This is repeated for all other alleles in each population, thus constructing probability
estimates of specific alleles for each of the 13 CODIS loci for each ethnicity.
•**NOTE** How is a repeat of 9.3 possible?
–It has been observed that at some loci, not every repeat is complete; in a stretch of DNA
containing an STR, most of the repeats are tetramers (4 nucleotides), but among these, a partial
repeat (e.g. 2 or 3 of the 4 nucleotides of the repeat) may be present:
AATG|AATG|AATG|AATG|AATG|AATG|ATG|AATG|AATG|AATG
–Because the 3-nucleotide fragment occurs within the stretch of 4-nucleotide repeats, it is
included as part of the STR, but the notation is different.
–In the above example, the number of repeats is represented as 9.3, because there are 9 intact
tetrameric repeats plus one partial repeat of 3 nucleotides (“.3”).
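The 9.3 notation can be reproduced with a small Python sketch. The function name is illustrative, and it assumes the repeat units have already been separated, as in the example line above.

```python
def allele_designation(units, repeat_length: int = 4) -> str:
    """Name an allele as '<full repeats>' or '<full repeats>.<bases in partial repeat>'."""
    full = sum(1 for unit in units if len(unit) == repeat_length)
    partial = [len(unit) for unit in units if len(unit) < repeat_length]
    if not partial:
        return str(full)
    if len(partial) > 1:
        raise ValueError("this sketch handles at most one partial repeat")
    return f"{full}.{partial[0]}"

units = "AATG|AATG|AATG|AATG|AATG|AATG|ATG|AATG|AATG|AATG".split("|")
print(allele_designation(units))  # 9.3
```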
Probability Estimates
Databases of the frequencies of each number of repeats at each locus in a
given population are used to calculate probabilities. The probability estimate
at each locus is multiplied by the estimates at the other loci to give an overall
probability estimate.
Because as many as 13 loci are analyzed and compared to questioned
samples, the chance that an unrelated person matches at every locus can be
less than 1 in a trillion, far rarer than one person among everyone on the globe,
underscoring the power of STR analysis.
Therefore, the more loci examined, the more powerful the analysis.
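A minimal Python sketch of the multiplication step. The allele frequencies are invented, and the per-locus genotype formulas (p squared for a homozygote, 2pq for a heterozygote) are the standard Hardy-Weinberg assumption, which these slides do not spell out; treating loci as independent is likewise an assumption.

```python
from math import prod

def genotype_frequency(p: float, q: float = None) -> float:
    """Assumed Hardy-Weinberg: p*p for a homozygote, 2*p*q for a heterozygote."""
    return p * p if q is None else 2 * p * q

def random_match_probability(locus_frequencies) -> float:
    """Multiply the per-locus genotype frequencies (loci assumed independent)."""
    return prod(locus_frequencies)

# Two hypothetical loci: heterozygous (0.2, 0.3) and homozygous (0.1).
two_loci = [genotype_frequency(0.2, 0.3), genotype_frequency(0.1)]
rmp = random_match_probability(two_loci)
print(f"1 in {1 / rmp:,.0f}")  # 1 in 833

# Thirteen hypothetical loci, each with a genotype frequency of 0.1.
thirteen = random_match_probability([0.1] * 13)
print(thirteen < 1e-12)  # True: rarer than 1 in a trillion
```

Each added locus multiplies in another factor smaller than 1, which is why examining more loci makes a coincidental match less likely.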