Biology of STRs

Artifacts in Genotyping STRs
• A number of artifacts are possible:
– Stuttering
– Non-template additions
– Microvariants
– Three peaks
– Allele dropouts
– Mutations
• All interfere with reading a DNA profile
accurately and consistently
Stuttering
• Stuttering is caused by the very structure
of STRs that makes them good markers
• They are repeats
• That are highly polymorphic
• Stutter product is a band that has the
wrong number of repeats
• Either one repeat more or one less
• Caused by strand slippage
Strand-slippage
Figure: strand slippage at a GT repeat, shown in three steps: DNA replication or PCR, misalignment (a GT unit loops out), and elongation.
Template strand in all panels: 3’ TACGCCGCCGCACACACACACCGCCG
Strand Slippage
• Occurs during extension step of PCR
• The newly formed strand of DNA skips
one repeat unit – starts complementary
base pairing with next repeat
• Pushing out a non-base paired loop from
the template strand of DNA
• Usually causes a deletion of one repeat
unit – therefore band will be one unit
smaller than true genotype
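The size effect of a single slippage event can be sketched in a few lines. This is an illustration only: the flanking length, the repeat unit, and the repeat count are hypothetical values, and `product_length` is a made-up helper, not part of any genotyping software.

```python
def product_length(flank_bp: int, repeat_unit: str, n_repeats: int) -> int:
    """Length in bp of a PCR product: flanking sequence plus the repeat array."""
    return flank_bp + len(repeat_unit) * n_repeats


FLANK_BP = 20      # hypothetical total flanking sequence, in bp
UNIT = "GATA"      # hypothetical tetranucleotide repeat unit
TRUE_REPEATS = 10  # hypothetical true allele

true_len = product_length(FLANK_BP, UNIT, TRUE_REPEATS)
# One repeat unit is skipped during extension, so the product has one fewer repeat.
stutter_len = product_length(FLANK_BP, UNIT, TRUE_REPEATS - 1)

print(true_len, stutter_len, true_len - stutter_len)
```

The stutter product differs from the true product by exactly one repeat unit, which is why it lands where a real neighboring allele would.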
Strand Slippage
• Naturally this is the mechanism that
makes repeats polymorphic
• When it happens during PCR it can
produce a band that is not real:
– Genotype will be wrong
– One repeat unit lower or higher than reality
• Rarer in tetranucleotides than in di- or trinucleotide
repeats – which is one reason tetranucleotides are used
Amount of Stutter Product
• Stutter is usually rare
• Therefore might show a small bump - can
usually be differentiated from a true band
• The earlier in the PCR reaction the strand slips
– The more stutter product will be produced
• If the genotyping protocol doesn’t work well,
the true band may be very weak
– Difficult to separate stutter band from true
band
Stutter Products
Call these genotypes:
Figure: example profiles with peaks labeled “Stutter”, “Stutter”, and “Stutter ?”.
Calling Alleles
• Biggest problem with stutter bands:
– They are the same size as a real allele!
• Especially difficult if you know the DNA
sample is mixed
• Or you are unsure whether sample has
been contaminated
• Difficult to distinguish between:
– A stutter band
– A minor allele (minor because less DNA)
13 CODIS STR Loci
• All produce some stutter products
• Longer alleles produce more stuttering
– Why does this make sense?
• Stutter percentages for tetranucleotides:
– From less than 1%
– Up to 15% of the true allele’s peak height
– Therefore always calculate the small band’s
peak height as a percentage of the large band’s
– Be sure it is < 15% of the large band’s height
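The percentage check can be written out as a short calculation. This is a minimal sketch: the peak heights are invented example values, and the 15% ceiling is the figure quoted on this slide, not a validated laboratory threshold.

```python
def stutter_percent(stutter_height: float, allele_height: float) -> float:
    """Small peak's height as a percentage of the large (true allele) peak's height."""
    if allele_height <= 0:
        raise ValueError("allele_height must be positive")
    return 100.0 * stutter_height / allele_height


STUTTER_MAX_PERCENT = 15.0  # upper bound quoted on the slide

# Hypothetical peak heights in RFU.
pct = stutter_percent(stutter_height=120, allele_height=1500)
consistent_with_stutter = pct < STUTTER_MAX_PERCENT

print(pct, consistent_with_stutter)
```

A ratio under the ceiling only means the small peak is consistent with stutter; it does not rule out a minor contributor in a mixed sample.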
Reducing Stuttering Products
• Changing PCR conditions
• Faster DNA Polymerase
– Faster it works, less chance for slippage
• STRs with longer repeats (> 4 bps)
– More difficult to “skip” past repeat
• STRs with imperfect repeat units
– Complex and compound repeats
– More difficult to skip past repeat if next repeat
unit sequence is different
Summary of Stutter Products
• One repeat unit more or less than real
allele peaks
• Less than 15% of the real allele’s height
• Quantity of stutter band depends on:
– When in PCR reaction first slippage occurs
– Allele size (bigger alleles, more stutter)
– PCR Conditions
– Polymerase used
– Repeat length and sequence
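The first two summary points can be combined into a simple screening rule. The sketch below assumes peaks are given as (size in bp, height in RFU) pairs; the function name, the size tolerance, and the example peaks are hypothetical, and real analysis software uses locus-specific stutter filters rather than one fixed cutoff.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (size in bp, height in RFU)


def flag_stutter_candidates(
    peaks: List[Peak],
    repeat_bp: int = 4,          # tetranucleotide repeat
    max_percent: float = 15.0,   # ceiling from the summary above
    size_tol: float = 0.5,       # assumed sizing tolerance in bp
) -> List[Peak]:
    """Return peaks that sit one repeat unit from a much taller peak."""
    flagged = []
    for size, height in peaks:
        for other_size, other_height in peaks:
            one_unit_away = abs(abs(other_size - size) - repeat_bp) <= size_tol
            if one_unit_away and height < other_height * max_percent / 100.0:
                flagged.append((size, height))
                break
    return flagged


# Hypothetical profile: a small peak 4 bp below a tall allele, plus a second allele.
peaks = [(196.0, 90.0), (200.0, 1200.0), (212.0, 1100.0)]
candidates = flag_stutter_candidates(peaks)
print(candidates)
```

A flagged peak is only a candidate: in a mixed or contaminated sample the same peak could be a minor allele.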
Non-Template Additions
• Polymerase often adds an extra
Adenosine to the end of the newly formed
sequence
• Not a part of the template sequence
• Makes PCR product one base longer than
actual sequence
• If your PCR reaction forms both +A and -A
products then your band will be wide
Non-Template Additions
• Want to have peaks as clear as possible
• Therefore want all PCR products to be
identical
• Either all +A or all -A
• Imagine case where you were genotyping
a dinucleotide, with stutter, and half the
products were +A and half were -A
• Impossible to separate genotypes
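The dinucleotide case can be worked through with made-up sizes. The sketch assumes a hypothetical heterozygote with -A product sizes of 100 and 102 bp, stutter that removes one repeat unit, and every product present in both -A and +A forms.

```python
def expected_peak_sizes(allele_bp: int, repeat_bp: int,
                        with_stutter: bool = True,
                        split_adenylation: bool = True) -> list:
    """Sizes (bp) expected for one allele, given stutter and partial +A addition."""
    sizes = {allele_bp}
    if with_stutter:
        sizes.add(allele_bp - repeat_bp)   # product one repeat unit shorter
    if split_adenylation:
        sizes |= {s + 1 for s in sizes}    # +A form of every product
    return sorted(sizes)


genotype = (100, 102)  # hypothetical -A sizes of the two alleles, in bp
peaks = sorted(set().union(*(expected_peak_sizes(a, repeat_bp=2) for a in genotype)))
print(peaks)
```

With these assumptions the two alleles produce a peak at every base from 98 to 103 bp, so the true alleles cannot be picked out by position alone.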
Non-Template Additions
• Set up PCR conditions so that every
product will be +A
• Conditions:
– Final extension for 10 mins
– Allows all products to be fully adenylated
– Primer has a guanosine at its 5’ end
• Commercially available kits turn every
allele (and ladder) into +A
Overloading Sample
• Signal on gel is too strong – will be difficult
to call
• May result in a split peak
• Or a peak that is off scale
• Caused by:
– Too much DNA sample in PCR reaction
– Primer concentrations too high
• Why DNA quantification is so important
Non-Template Additions and
Overloading Samples
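Figure: electropherograms for D3S1358, VWA, and FGA, plotting relative fluorescence (RFUs) against DNA size (bp). With 10 ng template (overloaded), peaks are off-scale and split into -A and +A forms; 2 ng template is the suggested level.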
Figure 6.5, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press
Microvariants
• Remember these are variants of the
repeat that are not a full repeat unit
• Example – TH01 9.3 allele
• Unlike stutter products, microvariants
are not the same size as an expected allele
• Problem is – determining whether there is
a true microvariant in the person
• Or you are seeing a normal band being
shifted over for some genotyping reason?
Microvariants
1. True microvariants must be validated by
appearing in multiple samples
• Even if a variant is rare, it must show up in
more than one individual to be considered a
true microvariant
2. Exact distance in base pairs should be
calculated
• 9.3 means 9 repeats plus 3 bases
• Always calculate in bases exactly how “off”
the microvariant is
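The “repeats plus bases” notation can be derived from the length of the repeat region. This sketch assumes a tetranucleotide marker and that the repeat-region length (flanks excluded) is already known in bases; `allele_designation` is a hypothetical helper name.

```python
def allele_designation(repeat_region_bp: int, repeat_bp: int = 4) -> str:
    """Name an allele as full repeats, plus leftover bases after a dot if any."""
    full_repeats, extra_bases = divmod(repeat_region_bp, repeat_bp)
    if extra_bases == 0:
        return str(full_repeats)
    return f"{full_repeats}.{extra_bases}"


print(allele_designation(39))  # 9 full repeats plus 3 bases
print(allele_designation(40))  # 10 full repeats

# How far "off" the microvariant is from the next full-repeat allele, in bases.
bases_off = 40 - 39
```

Here the 9.3 allele is a single base shorter than allele 10, which is why exact sizing in bases matters.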
Sequence Microvariants
• Sometimes there are also sequence
differences in these polymorphisms as
well as length differences
• The only way to genotype a sequence
variant is to sequence the PCR product
• Not necessary for Forensics because you
are simply matching genotypes
• These variants are not important for
Forensics analysis
Peaks outside of the Ladder
• Sometimes you will see a peak that is
outside of the expected range for any
marker (between markers?)
• What could cause this?
– Unsuccessful PCR product
– Primer dimers, etc.
– Person really has a new allele
• Check with different set of primers
• Sequence new allele and region
Three Peaks
• Sometimes three bands may be seen
• What could cause three bands?
– Stuttering
– Mixed or contaminated samples
– Genotyping error
– True duplication or extra chromosome in the
individual
• Need to validate what is seen in gel
Three Peaks
1. Check other markers in panel:
1. Is there evidence of mixed or contaminated
samples in any other markers?
2. Check database information for this
marker:
1. More than 50 tri-allelic patterns have been
reported across the 13 CODIS loci
3. Sequence or genotype this region:
1. Is there truly a duplication or extra
chromosome in this person?
Allele Dropout
• Most worrisome problem
• May call a person homozygous when
really they are heterozygous
• What can this be caused by?
– Larger allele is not amplified successfully
– Primer site mutation
• Rare with chosen tetranucleotides:
– Alleles are very similar in size
– Primers have been optimized and chosen in
regions that are very stable
Avoiding Allele Dropout
• Choose primers carefully
• Work with polymorphisms that have alleles
of similar size
• Always check genotypes with the Hardy-Weinberg equation
– Make sure you see the expected number of
heterozygotes population wide
• Most commercial kits have taken care of
all these issues
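The Hardy-Weinberg check amounts to comparing observed heterozygosity with the value expected from allele frequencies, 1 minus the sum of squared frequencies. The sketch below uses an invented population of 100 genotypes and makes no attempt at a significance test.

```python
from collections import Counter


def heterozygote_check(genotypes):
    """Return (observed, expected) heterozygosity for a list of (allele, allele) pairs."""
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total_alleles = 2 * n
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of p_i squared.
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b) / n
    return observed, expected


# Hypothetical sample: far fewer 8/12 heterozygotes than the frequencies predict.
genotypes = [(8, 8)] * 40 + [(8, 12)] * 20 + [(12, 12)] * 40
observed, expected = heterozygote_check(genotypes)
print(observed, expected)
```

A clear shortage of heterozygotes, as in this made-up sample, is the pattern allele dropout would produce, although population structure can cause it too.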
“Fixing” Allele Dropouts
• Add a “degenerate” primer
– Extra primer with known polymorphism
– Three primers total will be added
• Lower annealing temperature
– Reduce the stringency of primer binding
• Remember that with Forensics what
matters is matching genotypes
– As long as the allele always drops out, you don’t
have a problem
Mutations
• STRs do mutate at an expected mutation
rate over time
• Mutation may cause:
– New Alleles
– Change primer binding regions
– Sequence changes (less important)
• Very rare events
• Can be validated by examining families
Mendelization of Alleles
• Using family members to determine which
alleles are possible
• If you know the parents’ alleles then there are
only so many genotypes possible for
children
• Mendel’s law of segregation
• All STRs have been genotyped on CEPH
families – huge family sets from Utah
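Enumerating the possible children is a direct application of segregation: each child receives one allele from each parent. The parental genotypes below are illustrative values, and the helper names are hypothetical.

```python
from itertools import product


def possible_child_genotypes(mother, father):
    """All genotypes a child can inherit: one allele from each parent."""
    return {tuple(sorted(pair)) for pair in product(mother, father)}


def is_mendelian(child, mother, father):
    """True if the child's genotype is consistent with the parents' alleles."""
    return tuple(sorted(child)) in possible_child_genotypes(mother, father)


mother, father = (8, 12), (3, 14)  # illustrative parental genotypes
print(sorted(possible_child_genotypes(mother, father)))
print(is_mendelian((3, 8), mother, father))   # consistent
print(is_mendelian((8, 13), mother, father))  # not consistent
```

A child genotype outside the enumerated set points to a mutation, a genotyping error, or a pedigree problem, and needs follow-up.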
Mendelization of Alleles
Figure: pedigree with genotypes (in extraction order) 8/12, 3/8, 3/14, 8/14, 3/12, 5/11, 2/9, 2/11, 10/11.
• As always – must validate mutation
• By sequencing or regenotyping
Mutation Rates
• Mutation rates of the 13 CODIS loci have been
calculated over thousands of meioses
• All 13 are between 1 and 5 per 1,000
generational events
• Highest mutation rates:
– Markers that are most polymorphic
• Lowest mutation rates:
– Markers that are least informative
Impact of Mutations
• Paternity testing
– Can cause problems
– Because father may not match true child if
genotype has changed in child
– Compare many STR loci
• Identity matching
– Will not cause a problem
– Because mutation will be consistent over a
person’s lifetime and in all tissues
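For the paternity case, a rough calculation shows why many loci are compared. The sketch assumes a per-locus rate of 1 to 5 mutations per 1,000 generational events, the same rate at every locus, and independence between loci; these are simplifications for illustration.

```python
def prob_any_mutation(rate_per_locus: float, n_loci: int = 13) -> float:
    """Chance that at least one of n_loci mutates in a single parent-to-child transfer."""
    return 1.0 - (1.0 - rate_per_locus) ** n_loci


low = prob_any_mutation(0.001)   # 1 per 1,000: roughly 1.3%
high = prob_any_mutation(0.005)  # 5 per 1,000: roughly 6.3%
print(round(low, 4), round(high, 4))
```

Under these assumptions a true father will occasionally mismatch at one locus, so a single inconsistency across many loci is weak evidence against paternity.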
Genotyping Errors
• All the previous were artifacts that can be
explained
• However the problems you really worry
about are unexplained errors
• Especially if sample may be:
– Contaminated
– Mixed samples
• Need to always validate any artifact
• Be sure it’s not genotyping error
Any Questions?
• Review Chapters 1 – 6
• Email me at least 2 questions you
have about the first 6 chapters
• Next class will be review for Exam
• Exam One – February 5th