Lecture 16: Introduction to Linkage Disequilibrium
October 19, 2015
Exam 2
Wednesday, October 28 at 6:30 in lab
Genetic Drift, Population Structure, Population Assignment, Individual Identity, Paternity Analysis, and Linkage Disequilibrium
Sample exam posted on website
Review on Monday, October 26
Last Time
Population structure and gene flow
Introduction to paternity analysis
Today
Multiple loci and independent segregation
Estimating linkage disequilibrium
Causes of linkage disequilibrium
Extending to Multiple Loci
So far, only considering dynamics of alleles at single loci
Loci occur on chromosomes, linked to other loci!
“The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man’s sociopolitical evolution”
Richard Lewontin (quoted in Hedrick 2005)
Size of region that must be considered depends on Linkage Disequilibrium
Gametic (Linkage) Disequilibrium (LD)
Nonrandom association of alleles at different loci into gametes
Haplotype: Genotype of a group of loci in LD
LD is a major factor in evolution
LD itself provides insights into population history
Estimation of LD is critical for ALL population genetic data
Nomenclature and concepts
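Two loci, two alleles
Frequency of allele i at locus 1 is $p_i$
Frequency of allele i at locus 2 is $q_i$
[Diagram: one chromosome carries A1 (frequency p1) and B1 (frequency q1); the other carries A2 (frequency p2) and B2 (frequency q2)]
$$\sum_{i=1}^{n} p_i = \sum_{i=1}^{n} q_i = 1$$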
Nomenclature and concepts
Genotype is written as A1B1/A2B2
[Diagram: one chromosome carries A1 and B1; its homolog carries A2 and B2]
A1 and B1 are in coupling phase
A1 and B2 are in repulsion phase
A1 B1 and A2 B2 are haplotypes
Gametic Disequilibrium
Easiest to think about for physically linked loci, but physical linkage is not required
[Diagram: genotype A1B1/A2B2 undergoes meiosis, producing gametes A1B1, A1B2, A2B1, and A2B2]
What Are Expected Frequencies of Gametes in a Population Under Independent Assortment?
A1B1: p1q1
A1B2: p1q2
A2B1: p2q1
A2B2: p2q2
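The products above can be checked numerically. This is a minimal sketch, assuming two alleles per locus; the function name `independent_gamete_freqs` and the example allele frequencies are hypothetical and not from the lecture.

```python
def independent_gamete_freqs(p1, q1):
    """Expected gamete frequencies when alleles at the two loci associate at random.

    p1 is the frequency of A1 and q1 is the frequency of B1 (two alleles per locus).
    """
    p2, q2 = 1.0 - p1, 1.0 - q1
    return {
        "A1B1": p1 * q1,
        "A1B2": p1 * q2,
        "A2B1": p2 * q1,
        "A2B2": p2 * q2,
    }


# Hypothetical allele frequencies, chosen only for illustration.
freqs = independent_gamete_freqs(p1=0.6, q1=0.3)
for gamete, freq in freqs.items():
    print(gamete, round(freq, 2))
```

The four expected frequencies always sum to 1, because (p1 + p2)(q1 + q2) = 1.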
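What are expected frequencies of gametes with complete linkage?
[Diagram: parental chromosomes carrying alleles A1/A2 (frequencies p1, p2) and B1/B2 (frequencies q1, q2) undergo meiosis, producing gametes A1B1, A1B2, A2B1, and A2B2]
Gamete frequencies: A1B1 = x11, A1B2 = x12, A2B1 = x21, A2B2 = x22
Answer: the frequencies of those gametes in the current population. These are expected to stay stable in the absence of other departures from H-W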
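Linkage disequilibrium measure, D
Independent Assortment: $x_{11} = p_1 q_1$
With Linkage Disequilibrium: $x_{11} = p_1 q_1 + D$
Substituting p1 and q1 from above table ($p_1 = x_{11} + x_{12}$, $q_1 = x_{11} + x_{21}$):
$$D = x_{11}x_{22} - x_{12}x_{21}$$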
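A short numerical check of the formula for D. This is a sketch only: the function name `linkage_disequilibrium` and the haplotype frequencies are hypothetical, and the subscripts assume x11 = A1B1, x12 = A1B2, x21 = A2B1, x22 = A2B2.

```python
def linkage_disequilibrium(x11, x12, x21, x22):
    """D = x11*x22 - x12*x21 for haplotype frequencies that sum to 1."""
    total = x11 + x12 + x21 + x22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return x11 * x22 - x12 * x21


# Hypothetical haplotype frequencies, for illustration only.
x11, x12, x21, x22 = 0.5, 0.1, 0.1, 0.3
D = linkage_disequilibrium(x11, x12, x21, x22)

# Allele frequencies recovered from the haplotype frequencies.
p1 = x11 + x12
q1 = x11 + x21

# D also equals the excess of A1B1 gametes over the random expectation p1*q1.
print(round(D, 4), round(x11 - p1 * q1, 4))
```

Both printed values agree, which is the substitution step written out with numbers.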
Problem: D is sensitive to allele frequencies
Gamete frequencies must be between 0 and 1
Maximum |D| set by allele frequencies
Solution: D' = D/Dmax, which ranges from -1 to 1
Example, if D is positive:
p1=0.5, q2=0.5, Dmax=0.25
but
p1=0.1, q2=0.9, Dmax=0.09
Dmax Calculation:
If D is positive, Dmax is lesser of p1q2 or p2q1
If D is negative, Dmax is lesser of p1q1 or p2q2
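The Dmax rules can be sketched in a few lines. This is an illustration under stated assumptions: the function name `d_prime` and the example frequencies are hypothetical, and returning 0 when a locus is monomorphic is a choice made here, not something the lecture specifies.

```python
def d_prime(x11, x12, x21, x22):
    """Standardized LD, D' = D / Dmax, from the four haplotype frequencies."""
    p1, p2 = x11 + x12, x21 + x22
    q1, q2 = x11 + x21, x12 + x22
    d = x11 * x22 - x12 * x21
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)   # rule for positive D
    else:
        d_max = min(p1 * q1, p2 * q2)   # rule for negative D
    if d_max == 0:
        return 0.0                      # assumption: no LD is defined at a fixed locus
    return d / d_max


# Hypothetical examples.
print(d_prime(0.5, 0.0, 0.0, 0.5))    # only coupling haplotypes present
print(d_prime(0.0, 0.5, 0.5, 0.0))    # only repulsion haplotypes present
print(round(d_prime(0.5, 0.1, 0.1, 0.3), 4))
```

The first two cases sit at the ends of the -1 to 1 range; the third is an intermediate value.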
LD can also be estimated as correlation between alleles
$$r^2 = \frac{D^2}{p_1 p_2 q_1 q_2}$$
r can also be standardized to a -1 to 1 scale
It is equivalent to D' in this case
$$r' = \frac{D / \sqrt{p_1 p_2 q_1 q_2}}{D_{max} / \sqrt{p_1 p_2 q_1 q_2}} = \frac{D}{D_{max}} = D'$$
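A small sketch of the squared correlation, assuming both loci are polymorphic so the denominator is nonzero. The function name `r_squared` and the example frequencies are hypothetical.

```python
def r_squared(x11, x12, x21, x22):
    """Squared allelic correlation r^2 = D^2 / (p1 * p2 * q1 * q2).

    Assumes both loci are polymorphic; a fixed locus would make the denominator 0.
    """
    p1, p2 = x11 + x12, x21 + x22
    q1, q2 = x11 + x21, x12 + x22
    d = x11 * x22 - x12 * x21
    return d ** 2 / (p1 * p2 * q1 * q2)


# Hypothetical haplotype frequencies.
print(r_squared(0.5, 0.0, 0.0, 0.5))              # alleles perfectly correlated
print(round(r_squared(0.5, 0.1, 0.1, 0.3), 4))    # partial association
```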
Recombination
Shuffling of parental alleles during meiosis
[Diagram: genotype A1B1/A2B2 produces gametes A1B1, A1B2, A2B2, and A2B1]
Occurs for unlinked loci and linked loci
Rate of recombination for linked markers is partially a function of physical distance
Recombination Rate
[Diagram: genotype A1B1/A2B2 undergoes meiosis; gametes A1B1 and A2B2 are coupling phase, gametes A1B2 and A2B1 are repulsion phase (products of recombination)]
$$c = \frac{n_r}{n_r + n_c}$$
Where nr is number of repulsion phase gametes, and nc is number of coupling phase gametes
What is the expected recombination rate for unlinked loci?
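The ratio can be computed directly from gamete counts. This is a minimal sketch for a parent whose coupling-phase haplotypes are A1B1 and A2B2; the function name `recombination_rate` and the counts are hypothetical.

```python
def recombination_rate(n_repulsion, n_coupling):
    """c = nr / (nr + nc): the share of gametes that are products of recombination."""
    total = n_repulsion + n_coupling
    if total == 0:
        raise ValueError("need at least one gamete")
    return n_repulsion / total


# Hypothetical counts from a double heterozygote A1B1/A2B2.
print(recombination_rate(n_repulsion=12, n_coupling=88))   # tightly linked loci
print(recombination_rate(n_repulsion=50, n_coupling=50))   # all four gametes equally common
```

When the four gamete types are equally common, half of them are recombinant, which matches the c = 0.5 used for independent segregation later in the lecture.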
Expected Gamete Frequencies: Double Homozygote
[Diagram: genotype A1B1/A1B1 undergoes meiosis; all four gametes, recombinant and non-recombinant alike, are A1B1]
Expected Gamete Frequencies: Double Heterozygote
[Diagram: genotype A1B1/A2B2 undergoes meiosis; non-recombinant gametes are A1B1 and A2B2, recombinant gametes are A1B2 and A2B1]
LD is partially a function of recombination rate
Expected proportions of haplotypes produced in a population after 1 generation of mating
[Table columns: Offspring Genotypes; Parent Haplotype Frequencies; Offspring Haplotype Frequencies (accounting for parental recombination)]
Where c is the recombination rate and D0 is the initial amount of LD
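Recombination degrades LD over time
$$D_1 = x'_{11}x'_{22} - x'_{12}x'_{21} = (x_{11} - cD_0)(x_{22} - cD_0) - (x_{12} + cD_0)(x_{21} + cD_0)$$
$$D_1 = (1 - c)D_0$$
$$D_t = (1 - c)^t D_0$$
$$D_t \approx e^{-ct} D_0$$
Where t is time (in generations) and e is base of natural log (2.718); the exponential form is an approximation that holds for small c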
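The one-generation update and the geometric decay can be checked numerically. This is a sketch, assuming random mating and no drift or selection; the function names `ld` and `next_generation`, the starting frequencies, and the value of c are hypothetical.

```python
def ld(x):
    """D = x11*x22 - x12*x21 for haplotype frequencies (x11, x12, x21, x22)."""
    x11, x12, x21, x22 = x
    return x11 * x22 - x12 * x21


def next_generation(x, c):
    """Haplotype frequencies after one generation with recombination rate c."""
    d0 = ld(x)
    x11, x12, x21, x22 = x
    # Coupling haplotypes lose c*D0; repulsion haplotypes gain c*D0.
    return (x11 - c * d0, x12 + c * d0, x21 + c * d0, x22 - c * d0)


x = (0.5, 0.1, 0.1, 0.3)   # hypothetical starting haplotype frequencies
c = 0.1                    # hypothetical recombination rate
d0 = ld(x)
trajectory = [d0]
for _ in range(5):
    x = next_generation(x, c)
    trajectory.append(ld(x))

# Compare the iterated D with the closed form (1 - c)^t * D0.
for t, d in enumerate(trajectory):
    print(t, round(d, 5), round((1 - c) ** t * d0, 5))
```

The two printed columns match at every generation, and the haplotype frequencies keep summing to 1 because the four adjustments cancel.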
Effects of recombination rate on LD
[Figure: Decline in LD over time with different theoretical recombination rates (c)]
Even with independent segregation (c=0.5), multiple generations required to break up allelic associations
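To put numbers on how long allelic associations persist, the sketch below solves (1 - c)^t <= f for the smallest whole t. The function name `generations_to_reduce` and the target fractions are hypothetical choices for illustration.

```python
import math


def generations_to_reduce(c, fraction):
    """Smallest whole number of generations t such that (1 - c)**t <= fraction."""
    if not 0.0 < c <= 0.5:
        raise ValueError("recombination rate c must be in (0, 0.5]")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be in (0, 1)")
    return math.ceil(math.log(fraction) / math.log(1.0 - c))


# Generations until LD falls to 1% of its starting value.
for c in (0.5, 0.1, 0.01):
    print(c, generations_to_reduce(c, 0.01))
```

Even for unlinked loci (c = 0.5), LD halves each generation rather than vanishing at once, so several generations pass before it is negligible.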
LD varies substantially across human genome
NATURE|Vol 437|27 October 2005
Average r2 for pairs of SNPs separated by 30 kb in 1 Mb windows
LD affected by location relative to telomeres and centromeres, chromosome length, GC content, sequence polymorphism, and repeat composition
Highest and lowest levels of LD found in gene-rich regions
Human HapMap Project and Whole Genome Scans
NATURE|Vol 437|27 October 2005
LD structure of human Chromosome 19 (www.hapmap.org)
1 common SNP genotyped every 700 bp for 270 individuals (3.4 million SNPs)
9.2 million SNPs in total
LD in the Poplar Genome
LD declines rapidly with distance
LD higher in genes than in genome as a whole
Loci separated by kilobases still in LD!
[Figure: r2 (y-axis, 0.0 to 0.5) versus Distance in kb (x-axis, 0 to 20); series: Genomewide (core of range), Genes (core of range)]
Slavov et al. 2012 New Phyt 196:713-725
Recombination Across Poplar Chromosomes
Substantial variation in recombination rate
Related to repeat composition, methylation, and distance from centromere
Recombination rate varies among individuals
Rate is often higher in females than males
Rate varies among individuals within males and females
Variation in recombination rate in the MHC region (3.3 Mb) in human sperm donors
Genetic Drift and LD
Begin with highly diverse haplotype pool
Drift leads to chance increase of certain haplotypes
Generates nonrandom association between alleles at different loci (LD)
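A toy simulation can illustrate how sampling alone creates LD. This is a rough sketch, not the lecture's model: it assumes a haplotype pool of fixed size, resampling with replacement each generation, and no recombination, selection, or mutation. The function names, the pool size, and the seed are hypothetical.

```python
import random

HAPLOTYPES = ["A1B1", "A1B2", "A2B1", "A2B2"]


def ld_from_pool(pool):
    """D = x11*x22 - x12*x21 computed from a list of haplotype labels."""
    n = len(pool)
    x11, x12, x21, x22 = (pool.count(h) / n for h in HAPLOTYPES)
    return x11 * x22 - x12 * x21


def simulate_drift(pool_size, generations, seed=1):
    """Track D while the haplotype pool is resampled at random each generation."""
    rng = random.Random(seed)
    # Start with all four haplotypes equally common, so D is 0.
    pool = HAPLOTYPES * (pool_size // 4)
    history = [ld_from_pool(pool)]
    for _ in range(generations):
        pool = rng.choices(pool, k=len(pool))   # sampling with replacement = drift
        history.append(ld_from_pool(pool))
    return history


history = simulate_drift(pool_size=40, generations=20)
for t, d in enumerate(history):
    print(t, round(d, 4))
```

The starting D is exactly 0; any departure from 0 in later generations comes only from chance changes in haplotype frequencies.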
Genetic Drift and LD
Why doesn’t recombination reduce LD in this situation?
LD is partially a function of recombination rate
Expected proportions of gametes produced by various genotypes over two generations
Effective LD increases with homozygosity
Double heterozygote is only case where recombination matters
Effect of Drift on LD
Drift and recombination will have opposing effects on LD
$$E(r^2) = \frac{1}{1 + 4N_e c}$$
Where r2 is the squared correlation coefficient for alleles at two loci, Ne is effective population size, and c is recombination rate
4Nec is “population recombination rate”
Expression approaches 0 for large populations or high recombination rates
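The expectation is easy to tabulate. This is a minimal sketch; the function name `expected_r_squared` and the parameter values are hypothetical and chosen only to show the trend.

```python
def expected_r_squared(ne, c):
    """E(r^2) = 1 / (1 + 4 * Ne * c) under drift-recombination balance."""
    if ne <= 0 or c < 0:
        raise ValueError("Ne must be positive and c must be non-negative")
    return 1.0 / (1.0 + 4.0 * ne * c)


# Hypothetical effective population sizes and recombination rates.
for ne in (100, 1000, 10000):
    for c in (0.001, 0.01, 0.5):
        print(ne, c, round(expected_r_squared(ne, c), 4))
```

Expected LD shrinks as either Ne or c grows, since both enter only through the product 4Nec.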
Combined effects of Drift and Recombination
LD declines as a function of population recombination rate
Effects of chance fluctuation of gamete frequencies
[Figure: LD plotted against Nec]
How should inbreeding affect linkage disequilibrium?