Lecture 16: Individual Identity and
Paternity Analysis
March 7, 2014
Last Time
 Interpretation of F-statistics
 More on the Structure program
 Principal Components Analysis
 Population assignment
 Individual identity (in lab)
Today
 Population assignment examples
 More on forensic evidence and individual identity
 Introduction to paternity analysis
Population Assignment: Likelihood
 "Assignment Tests" based on allele
frequencies in source populations and
genetic composition of individuals
$L(H_1, H_2 \mid G) = LR = \frac{P(G \mid H_1)}{P(G \mid H_2)}$
$P_{kl} = p_{il}^2$ for homozygote $A_iA_i$ in population $l$ at locus $k$
$P_{kl} = 2 p_{il} p_{jl}$ for heterozygote $A_iA_j$ in population $l$ at locus $k$
$P(G \mid H_n) = \prod_{k=1}^{m} P_{kl}$ for $m$ loci
 Individual likelihood often summarized as:
-log10(P(G|Hn))
 Low values mean a higher likelihood of Hn
 Likelihoods can be plotted against each
other
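A minimal sketch of the assignment-test calculation, assuming Hardy-Weinberg proportions and independent loci. The allele frequencies, population names, and genotype are hypothetical, chosen only to illustrate the formulas:

```python
import math

def genotype_prob(genotype, freqs):
    """P_kl at one locus: p_i^2 for a homozygote, 2*p_i*p_j for a heterozygote."""
    a, b = genotype
    if a == b:
        return freqs[a] ** 2
    return 2 * freqs[a] * freqs[b]

def neg_log10_likelihood(genotypes, pop_freqs):
    """-log10 P(G | H_n): the product over loci becomes a sum of logs."""
    return -sum(math.log10(genotype_prob(g, f))
                for g, f in zip(genotypes, pop_freqs))

# Hypothetical allele frequencies at two loci in two source populations.
island = [{"A": 0.9, "B": 0.1}, {"C": 0.8, "D": 0.2}]
mainland = [{"A": 0.2, "B": 0.8}, {"C": 0.3, "D": 0.7}]
individual = [("A", "A"), ("C", "D")]  # one genotype per locus

nll_island = neg_log10_likelihood(individual, island)
nll_mainland = neg_log10_likelihood(individual, mainland)
log10_lr = nll_mainland - nll_island  # log10 of P(G|island) / P(G|mainland)
print(nll_island, nll_mainland, log10_lr)
```

The lower value identifies the more likely source population. An allele absent from a population's sample gives a probability of zero, which this sketch does not handle.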
Population Assignment Example: Wolf Populations in Northwest Territories
 Wolves sampled from island and mainland populations in the Canadian Northwest Territories
 Immigrants detected on mainland (black circles) from Banks Island (white circles)
[Figure: individuals plotted by -log likelihood from Mainland and -log likelihood from Banks Island. Carmichael et al. 2001 Mol Ecol 10:2787]
Population Assignment Example: Fish Stories
 Fishing competition on Lake Saimaa in Southeast Finland
 Contestant allegedly caught a 5.5 kg salmon, much larger than usual for the lake
 Officials compared fish from the lake to fish from local markets (originating from Norway and Baltic Sea)
 7 microsatellites
 Based on likelihood analysis, fish was purchased rather than caught in lake
[Figure: likelihood plot with axes labeled Lake Saimaa and Market]
Individual Identity: Likelihood
 Assume you find skin cells and blood under fingernails of
a murder victim
 A hitman for the Sicilian mafia is seen exiting the
apartment
 You gather DNA evidence from the skin cells and from
the suspect
 They have identical genotypes
 What is the likelihood that the evidence came from the
suspect?
$L(H_1, H_2 \mid G) = LR = \frac{P(G \mid H_1)}{P(G \mid H_2)}$
 What is H1 and what is H2?
Match Probability
 Probability of observing a genotype at locus k by
chance in population is a function of allele frequencies:
$P_k = p_i^2$ (homozygote)
$P_k = 2 p_i p_j$ (heterozygote)
$P = \prod_{k=1}^{m} P_k$ for $m$ loci
 Assumes unlinked (independent) loci and Hardy-Weinberg equilibrium
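A short sketch of the multilocus match probability. The allele labels and frequencies are hypothetical. The likelihood ratio at the end rests on one common framing, assumed here rather than taken from the slides: H1 is that the evidence came from the suspect, so P(G | H1) = 1, and H2 is that it came from a random unrelated member of the population.

```python
from math import prod

def locus_match_prob(genotype, freqs):
    """P_k: p_i^2 for a homozygote, 2*p_i*p_j for a heterozygote."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]

def match_probability(profile, freqs_by_locus):
    """P = product of P_k over loci (assumes unlinked loci and HWE)."""
    return prod(locus_match_prob(g, f)
                for g, f in zip(profile, freqs_by_locus))

# Hypothetical population allele frequencies at three loci.
freqs_by_locus = [
    {"12": 0.1, "14": 0.3, "15": 0.6},
    {"7": 0.2, "9": 0.8},
    {"20": 0.05, "22": 0.95},
]
profile = [("12", "14"), ("7", "7"), ("20", "22")]

p_match = match_probability(profile, freqs_by_locus)
likelihood_ratio = 1.0 / p_match  # assumes P(G | H1) = 1
print(p_match, likelihood_ratio)
```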
Probability of Identity
 Probability 2 randomly selected individuals have same
profile at locus k:
PIDk   pi4   (2 pi p j ) 2
i
Homozygotes
i
i j
Heterozygotes
m
P   PIDk
for m loci
k 1
 Exclusion Probability (E): E=1-P
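A sketch of the probability of identity for unrelated individuals, combined across loci. The allele frequencies for the three loci are hypothetical:

```python
from itertools import combinations
from math import prod

def pid_locus(freqs):
    """PID_k = sum_i p_i^4 + sum_{i<j} (2 p_i p_j)^2 for one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * pi * pj) ** 2
                        for pi, pj in combinations(freqs, 2))
    return homozygotes + heterozygotes

# Hypothetical allele frequencies at three loci.
loci = [[0.2] * 5, [0.5, 0.5], [0.7, 0.2, 0.1]]

pid_per_locus = [pid_locus(f) for f in loci]
pid_multilocus = prod(pid_per_locus)  # P, assuming independent loci
exclusion = 1 - pid_multilocus        # E = 1 - P
print(pid_per_locus, pid_multilocus, exclusion)
```

Each added locus multiplies P by a number below 1, so the exclusion probability rises with every locus genotyped.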
What if the slimy mob defense attorney argues that
the most likely perpetrator is the mob hitman’s
brother, who has conveniently “disappeared”?
Does the general match probability apply to near
relatives?
Probability of identity for full sibs
Homozygotes:
$PID_{hosib,k} = \frac{1}{4}\left(1 + 2 p_i + p_i^2\right)$
Heterozygotes:
$PID_{hesib,k} = \frac{1}{4}\left(1 + p_i + p_j + 2 p_i p_j\right)$
In each expression the three terms correspond, in order, to sibs sharing 2 alleles IBD, 1 allele IBD, and 0 alleles IBD.
General Probability of Identity for Full Sibs:
$PID_{sib,k} = \frac{1}{4} + \frac{1}{2}\sum_i p_i^2 + \frac{1}{2}\left(\sum_i p_i^2\right)^2 - \frac{1}{4}\sum_i p_i^4$
Probability of identity for full sibs
$PID_{sib,k} = \frac{1}{4} + \frac{1}{2}\sum_i p_i^2 + \frac{1}{2}\left(\sum_i p_i^2\right)^2 - \frac{1}{4}\sum_i p_i^4$
Probability of identity for unrelated individuals
$PID_k = \sum_i p_i^4 + \sum_i \sum_{j>i} (2 p_i p_j)^2$
For a locus with 5 alleles, each at a frequency of 0.2:
PID = 0.072
PIDsib = 0.368
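A sketch that checks the full-sib value quoted above in two ways: from the closed-form single-locus expression, and by summing the homozygote and heterozygote full-sib terms weighted by their Hardy-Weinberg genotype frequencies. The function names are illustrative only.

```python
from itertools import combinations

def pid_sib_closed_form(freqs):
    """1/4 + 1/2*sum(p^2) + 1/2*(sum(p^2))^2 - 1/4*sum(p^4)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4

def pid_sib_by_genotype(freqs):
    """Genotype frequency times the sib match probability, summed over genotypes."""
    hom = sum(p ** 2 * 0.25 * (1 + 2 * p + p ** 2) for p in freqs)
    het = sum(2 * pi * pj * 0.25 * (1 + pi + pj + 2 * pi * pj)
              for pi, pj in combinations(freqs, 2))
    return hom + het

five_equal = [0.2] * 5  # the slide's example: 5 alleles at frequency 0.2
print(pid_sib_closed_form(five_equal))  # approximately 0.368
print(pid_sib_by_genotype(five_equal))  # same value from the genotype sum
```

The two routes agree for any set of allele frequencies summing to 1, which is a useful check on the algebra.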
What is the minimum probability of identity for full sibs?
$PID_{sib,k} = \frac{1}{4} + \frac{1}{2}\sum_i p_i^2 + \frac{1}{2}\left(\sum_i p_i^2\right)^2 - \frac{1}{4}\sum_i p_i^4$
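One way to approach the question is a short worked bound, writing $S = \sum_i p_i^2$ (a symbol introduced here for convenience). The second part assumes $n$ equally frequent alleles purely as an illustration:

```latex
% Lower bound: since \sum_i p_i^4 \le \left(\sum_i p_i^2\right)^2 = S^2,
PID_{sib,k} = \frac{1}{4} + \frac{1}{2}S + \frac{1}{2}S^2 - \frac{1}{4}\sum_i p_i^4
            \;\ge\; \frac{1}{4} + \frac{1}{2}S + \frac{1}{4}S^2 \;>\; \frac{1}{4}

% Equal frequencies, p_i = 1/n: S = 1/n and \sum_i p_i^4 = 1/n^3, so
PID_{sib,k} = \frac{1}{4} + \frac{1}{2n} + \frac{1}{2n^2} - \frac{1}{4n^3}
            \;\longrightarrow\; \frac{1}{4} \quad (n \to \infty)
```

For $n = 5$ the second expression gives 0.25 + 0.1 + 0.02 - 0.002 = 0.368. However polymorphic the locus, full sibs still share both alleles IBD with probability 1/4, so the single-locus value approaches but never reaches 0.25.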
NRC (1996) recommendations
 Use population that provides highest probability of
observing the genotype (unless other information is
known)
 Correct homozygous genotypes for substructure within
selected population (e.g., Native Americans, Hispanics,
African Americans, Caucasians, Asian Americans)
 No correction for heterozygotes
Homozygotes: $P' = p_i^2 + p_i (1 - p_i) F_{ST}$
Heterozygotes: $P' = 2 p_i p_j$
Why is it ‘conservative’ (from the standpoint of
proving a match) to ignore substructure for
heterozygotes?
$F_{ST} = \frac{H_T - H_S}{H_T}$
$H_S = H_T (1 - F_{ST})$
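A sketch of the substructure correction, with a hypothetical F_ST of 0.03 and hypothetical allele frequencies. The second function applies H_S = H_T(1 - F_ST) to a single heterozygous genotype to show why leaving heterozygotes uncorrected favors the defendant:

```python
def nrc_genotype_freq(p_i, p_j=None, fst=0.0):
    """Homozygote: p_i^2 + p_i*(1 - p_i)*F_ST. Heterozygote: 2*p_i*p_j, uncorrected."""
    if p_j is None:
        return p_i ** 2 + p_i * (1 - p_i) * fst
    return 2 * p_i * p_j

def structured_het_freq(p_i, p_j, fst):
    """Expected heterozygote frequency within subpopulations: 2*p_i*p_j*(1 - F_ST)."""
    return 2 * p_i * p_j * (1 - fst)

fst = 0.03  # hypothetical value, for illustration only

hom_uncorrected = nrc_genotype_freq(0.1)          # 0.1^2
hom_corrected = nrc_genotype_freq(0.1, fst=fst)   # larger than uncorrected
het_reported = nrc_genotype_freq(0.1, 0.2, fst=fst)
het_structured = structured_het_freq(0.1, 0.2, fst)
print(hom_uncorrected, hom_corrected, het_reported, het_structured)
```

Substructure reduces heterozygosity, so the uncorrected 2 p_i p_j overstates how common the heterozygous genotype is, which makes a chance match look more likely than it is.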
Example: World Trade Center Victims
 Match victims using
DNA collected from
toothbrushes, hair
brushes, or relatives
 Exact matches not
guaranteed
 Why not?
 Use likelihood to
match samples to
victims
A series of little NBA prospects are born to ardent
basketball fans in every city with an NBA team. The
mothers regularly allege that the fathers are NBA
stars from visiting teams. The “players” deny this
allegation.
Can this be resolved using molecular markers and
population genetics methodologies?
Paternity Exclusion Analysis
 Determine multilocus genotypes of all mothers, offspring, and
potential fathers
 Determine paternal gamete by “subtracting” maternal genotype
from that of each offspring.
 Infer paternity by comparing the multilocus genotype of all gametes
to those of all potential males in the population
 Assign paternity if all potential males, except one, can be excluded
on the basis of genetic incompatibility with the observed pollen
gamete genotype
 Unsampled males must be considered
Paternity Exclusion
 First step is to determine
paternal contribution
based on seedling alleles
that do not match mother
 Notice for locus 3 both
alleles match mother, so
there are two potential
paternal contributions
 Male 3 is the putative
father because he is the
only one that matches
paternal contributions at all
loci
[Figure: YES/NO match of each candidate male to the paternal contribution at Locus 1, Locus 2, and Locus 3]
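The exclusion procedure can be sketched as follows. The genotypes are hypothetical and are not those from the slide's figure. They are arranged so that the third locus has two potential paternal contributions and only Male 3 remains compatible.

```python
def paternal_alleles(mother, offspring):
    """Possible paternal alleles at one locus, 'subtracting' the mother."""
    a, b = offspring
    candidates = set()
    if a in mother:        # a could be maternal, so b would be paternal
        candidates.add(b)
    if b in mother:        # b could be maternal, so a would be paternal
        candidates.add(a)
    return candidates      # empty set: offspring does not match the mother

def is_excluded(male, mother, offspring):
    """Excluded if the male carries no possible paternal allele at some locus."""
    for male_locus, mother_locus, offspring_locus in zip(male, mother, offspring):
        if not paternal_alleles(mother_locus, offspring_locus) & set(male_locus):
            return True
    return False

def non_excluded_males(males, mother, offspring):
    return [name for name, genotype in males.items()
            if not is_excluded(genotype, mother, offspring)]

# Hypothetical three-locus genotypes (alleles coded as integers).
mother = [(1, 2), (3, 3), (5, 6)]
offspring = [(1, 4), (3, 7), (5, 6)]
males = {
    "Male 1": [(1, 2), (7, 8), (5, 5)],
    "Male 2": [(4, 4), (3, 8), (6, 9)],
    "Male 3": [(2, 4), (7, 7), (6, 9)],
    "Male 4": [(4, 5), (7, 9), (8, 9)],
}
print(non_excluded_males(males, mother, offspring))
```

This sketch demands exact compatibility at every locus, so a single mutation or genotyping error would exclude the true father.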
Paternity Exclusion Analysis
Possible outcomes and consequences:
 Only one male cannot be excluded: paternity is assigned
 More than one male cannot be excluded: analyze more loci
 All males are excluded: conclude there is migration from external sources
Probabilities of Paternity Exclusion, Single Locus, 2 alleles,
codominant
 The paternity exclusion probability is the sum of the probability of all
exclusionary combinations
Hedrick 2005
Sum: $Pr_k = p_1 p_2 (1 - p_1 p_2)$
Probability that a falsely accused male is excluded by at least one of m loci:
$Pr = 1 - \prod_{k=1}^{m} (1 - Pr_k)$
See Chakraborty et al. 1988 Genetics 118:527 for a more general calculation of exclusion power
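A sketch of the two-allele exclusion probability and its multilocus combination. The example assumes equal allele frequencies (p1 = p2 = 0.5) at each of 10 loci. That assumption is mine, though the result appears consistent with the 10-locus value of 0.875 quoted in the slides.

```python
from math import prod

def exclusion_prob_two_alleles(p1):
    """Pr_k = p1*p2*(1 - p1*p2) for one codominant locus with two alleles."""
    p2 = 1 - p1
    return p1 * p2 * (1 - p1 * p2)

def multilocus_exclusion(per_locus):
    """Pr = 1 - product over loci of (1 - Pr_k), assuming independent loci."""
    return 1 - prod(1 - pr for pr in per_locus)

pr_k = exclusion_prob_two_alleles(0.5)       # 0.25 * 0.75 = 0.1875
pr_10 = multilocus_exclusion([pr_k] * 10)    # ten identical loci
print(pr_k, round(pr_10, 3))

# Skewed frequencies reduce single-locus power.
print(exclusion_prob_two_alleles(0.9))       # 0.09 * 0.91 = 0.0819
```

The multi-allele case needs the more general calculation cited above and is not attempted here.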
Alleles versus Loci
 For a given number of alleles: one locus with
many alleles provides more exclusion power
than many loci with few alleles
10 loci, 2 alleles, Pr = 0.875
1 locus, 20 alleles, Pr=0.898
 Uniform allele frequencies provide more
power
Characteristics of an ideal genetic marker for paternity analysis
 Highly polymorphic (i.e., with many alleles)
 Codominant
 Easy to use for genotyping large numbers of individuals
 Mendelian or paternal inheritance
 Low cost
 Reliable
[Figure: Exclusion Probability (0.50 to 1.00) as a function of number of Alleles and number of Loci]
Shortcomings of Paternity Exclusion
 Requiring exact matches for potential fathers is
excessively stringent
Mutation
Genotyping error
 Multiple males may match, but probability of match may
differ substantially
 No built-in way to deal with cryptic gene flow: the case when a
sampled male matches, but an unsampled male may also match
(Type I error: wrong father assigned paternity)
Advantages and Disadvantages of Likelihood
 Advantages:
 Flexibility: can be extended in many ways
- Compensate for errors in genotyping
- Incorporate factors influencing mating success: fecundity,
distance, and direction
 Compensates for lack of exclusion power
- Fractional paternity
 Disadvantages
 Often results in ambiguous paternities
 Difficult to determine proper cutoff for LOD score
Summary
 Direct assessment of movement is best way to measure
gene flow
 Parentage analysis is powerful approach to track
movements of mates retrospectively
 Paternity exclusion is straightforward to apply but may
lack power and is confounded by genotyping error
 Likelihood-based approaches can be more flexible, but
also provide ambiguous answers when power is lacking