QTL Detection

Day 2
QTL Detection
Objective
Present principles for detection of genes affecting
quantitative traits (QTL) using genetic markers
in ‘simple’ experimental designs
Concepts covered relevant to issues in ‘genomic selection’
1. Single locus quantitative genetic model
2. Principle of use of LD to detect QTL using markers
3. Overview of strategies for QTL detection
4. QTL detection using line crosses
5. QTL interval mapping in line crosses
6. QTL detection in line crosses – additional topics
   a. Significance testing
   b. Accuracy of position estimates
   c. Breed crosses (vs inbred line crosses)
7. QTL detection in outbred populations – linkage analysis
8. Summary and limitations
9. Software for QTL mapping
1. Single locus Quantitative Genetic Model
• Partition phenotype into genetic and environmental
components:
P = mean + G + E
• G = collective effect of many genes
= quantitative trait loci (QTL)
• Genotypes for QTL have an associated genotypic value:
GT = E( P | T )
GT = phenotype you expect to get
from an individual with genotype T
GT = Average phenotype over all individuals
with genotype T
GT is often expressed as a deviation from the mean ⇒ overall average GT is zero
Falconer Model for effects of QTL
Genotype T:           A2A2    A1A2    A1A1
Genotypic value GT:   μ–a     μ+d     μ+a
μ is NOT the population mean - it is the “mid-homozygote” value.
- it is often standardized to zero (by subtraction)
Under HWE:
T       Frequency f(T)   Genotypic value GT   f(T) x GT
A1A1    p^2              a                    p^2 a
A1A2    2pq              d                    2pqd
A2A2    q^2              –a                   –q^2 a
Population mean = E(GT) = M = p^2 a + 2pqd – q^2 a
= a(p – q) + 2pqd
Example
The pygmy gene in mice
Allele frequency: Pr(+) = p = 0.7, Pr(pg) = q = 0.3
Genotype:                    ++            + pg          pg pg
Average weight (g):          14            12            6
Genotypic value GT:          a = 4         d = 2         –a = –4
Expected freq. under HWE:    p^2 = 0.49    2pq = 0.42    q^2 = 0.09
⇒ μ = (14+6)/2 = 10
Mean GT = E(GT) = M = 0.49*4 + 0.42*2 + 0.09*(-4) = 2.44
= a(p – q) + 2pqd = 4(0.7-0.3) + 2*0.7*0.3*2 = 2.44
Expected population mean = 0.49*14 + 0.42*12 + 0.09*6 = 12.44 = μ + E(GT)
Most QTL have much smaller effects than the mouse pygmy gene
and cannot be observed directly
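The arithmetic of the pygmy example can be checked with a short Python sketch. The function names are illustrative only (they do not come from the slides); the inputs are the three genotype means and the allele frequency given above, and Hardy-Weinberg proportions are assumed.

```python
def genotypic_values(mean_a1a1, mean_a1a2, mean_a2a2):
    """Return (mu, a, d) of the Falconer model from the three genotype means."""
    mu = (mean_a1a1 + mean_a2a2) / 2.0   # mid-homozygote value
    a = mean_a1a1 - mu                   # half the difference between homozygotes
    d = mean_a1a2 - mu                   # deviation of the heterozygote from mu
    return mu, a, d


def mean_genotypic_value(a, d, p):
    """M = a(p - q) + 2pqd, assuming Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * p * q * d


# Pygmy gene example: ++ = 14, + pg = 12, pg pg = 6, p = 0.7
mu, a, d = genotypic_values(14.0, 12.0, 6.0)
M = mean_genotypic_value(a, d, p=0.7)
print(mu, a, d, round(M, 2), round(mu + M, 2))
```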
How can we find these QTL?
Since we cannot observe the QTL directly,
we want to use (or create) an association
between the QTL and something we CAN
observe:
A genetic marker…
2. Principles of the use of LD to detect QTL using markers
Molecular Genetics
“In Search of the Holy Grail”
[Figure: chromosome pairs carrying marker alleles M/m linked to QTL alleles Q/q]
Major genes
Quantitative Trait Loci (QTL) = position (locus) on the genome associated with genetic differences for a quantitative trait
Most QTL cannot be observed at DNA level
Two types of observable molecular genetic loci
• Functional mutations (Q/q) - known genes
  • Most beneficial and easy to use
  • Difficult to find
• Anonymous markers (M/m) linked to QTL
  • Easier to find
  • More restrictive and difficult to use
Use of markers for QTL detection and MAS relies on association of markers with phenotype
QTL detection (marker M/m linked to QTL Q/q)
Marker genotype:   MM   Mm   mm
Mean phenotype:    20   18   14
Allele M is associated with favorable QTL allele
MAS
Select MM or individuals that inherited allele M
Requires Linkage Disequilibrium between marker and QTL
QTL has effect on phenotype, marker does not
Illustration that marker genotype means don’t differ if marker and QTL are in Linkage Equilibrium
Allele frequencies: P(M)=pM P(m)=qM P(Q)=p P(q)=q D=0
Frequency of each marker-QTL genotype, by QTL genotype (genotypic value):
Marker genotype   QQ (μ+a)     Qq (μ+d)     qQ (μ+d)     qq (μ-a)     Average
MM                pM^2 p^2     pM^2 pq      pM^2 pq      pM^2 q^2     μ+a(p-q)+2pqd
Mm                2pMqM p^2    2pMqM pq     2pMqM pq     2pMqM q^2    μ+a(p-q)+2pqd
mm                qM^2 p^2     qM^2 pq      qM^2 pq      qM^2 q^2     μ+a(p-q)+2pqd
Illustration that marker genotype means don’t differ if marker and QTL are in Linkage Equilibrium
Example
Allele frequencies: P(M)=pM P(m)=qM P(Q)=0.7 P(q)=0.3 D=0
Frequency of each marker-QTL genotype, by QTL genotype (genotypic value):
Marker genotype   QQ (10)        Qq (8)         qQ (8)         qq (5)         Average
MM                pM^2 (.49)     pM^2 (.21)     pM^2 (.21)     pM^2 (.09)     .49*10+.21*8+.21*8+.09*5 = 8.71
Mm                2pMqM (.49)    2pMqM (.21)    2pMqM (.21)    2pMqM (.09)    .49*10+.21*8+.21*8+.09*5 = 8.71
mm                qM^2 (.49)     qM^2 (.21)     qM^2 (.21)     qM^2 (.09)     .49*10+.21*8+.21*8+.09*5 = 8.71
Detection of QTL based on markers requires
Linkage Disequilibrium between marker and QTL
Relative frequency of Q must differ between marker genotypes
Example (arbitrary)
Allele frequencies: P(M) = pM = 0.4, P(m) = qM = 0.6, P(Q) = p = 0.7, P(q) = q = 0.3
Disequilibrium = D = P(MQ) – pM p = 0.38 – (0.4)(0.7) = +0.10
Assumed haplotype frequencies:
M Q   0.38 = pM p + D
M q   0.02 = pM q – D
m Q   0.32 = qM p – D
m q   0.28 = qM q + D
Example: D = +0.10, random mating of parents, genotypic values QQ = 10, Qq = 8, qq = 5
Frequency of each marker-QTL genotype and average genotypic value by marker genotype:
Marker genotype   QQ (10)               Qq (8)                               qq (5)                Average
MM                (.38)(.38) = .1444    (.38)(.02) + (.02)(.38) = .0152      (.02)(.02) = .0004    9.80
Mm                2(.38)(.32) = .2432   2(.38)(.28) + 2(.02)(.32) = .2256    2(.02)(.28) = .0112   8.94
mm                (.32)(.32) = .1024    (.32)(.28) + (.28)(.32) = .1792      (.28)(.28) = .0784    7.92
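The marker genotype means in this example (and their equality when D = 0) can be reproduced with the sketch below. It is an illustration only, assuming random union of the four haplotypes; the function name and the coding of genotypic values by number of Q alleles are choices made here, not part of the slides.

```python
from itertools import product


def marker_genotype_means(p_m, p_q, D, values):
    """Average genotypic value per marker genotype under random mating.

    p_m, p_q : frequencies of marker allele M and QTL allele Q
    D        : disequilibrium, P(MQ) - p_m * p_q
    values   : genotypic value keyed by number of Q alleles (2, 1, 0)
    """
    hap = {
        ("M", "Q"): p_m * p_q + D,
        ("M", "q"): p_m * (1 - p_q) - D,
        ("m", "Q"): (1 - p_m) * p_q - D,
        ("m", "q"): (1 - p_m) * (1 - p_q) + D,
    }
    total = {"MM": 0.0, "Mm": 0.0, "mm": 0.0}
    weighted = {"MM": 0.0, "Mm": 0.0, "mm": 0.0}
    # Pair every paternal haplotype with every maternal haplotype
    for (h1, f1), (h2, f2) in product(hap.items(), repeat=2):
        n_M = (h1[0] == "M") + (h2[0] == "M")
        n_Q = (h1[1] == "Q") + (h2[1] == "Q")
        key = {2: "MM", 1: "Mm", 0: "mm"}[n_M]
        total[key] += f1 * f2
        weighted[key] += f1 * f2 * values[n_Q]
    return {k: weighted[k] / total[k] for k in total}


values = {2: 10.0, 1: 8.0, 0: 5.0}
ld = marker_genotype_means(0.4, 0.7, 0.10, values)   # marker and QTL in LD
le = marker_genotype_means(0.4, 0.7, 0.0, values)    # linkage equilibrium
print({k: round(v, 2) for k, v in ld.items()})
print({k: round(v, 2) for k, v in le.items()})
```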
3. Overview of Strategies for QTL Detection
Depend on the type of LD between markers and QTL that you want to exploit
• LD you create by a cross
  • F2 cross
  • Backcross
  • Advanced Intercross Line – AIL
  • Recombinant Inbred Line – RIL
• LD that exists within families
  • Within half-sib families
  • In extended pedigree
• LD that is already present in an outbred population
  • LD created in past by drift, mutation, selection, migration
Strategies differ in the # of rounds of recombination that occurred since creation of LD and, therefore, in how close a marker needs to be to be in sufficient LD with a QTL
[Figure: LD (r2) against generation (0-25) for recombination rates c = .001, .01, .05, .1, .2 and .5; F2/BC designs use LD after one generation, outbred populations use LD that has decayed over many generations]
Type of LD used affects marker density required, type of analysis needed, and how results are to be interpreted
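The decay curves in the figure can be approximated with a small sketch. It assumes the standard result that, under random mating in a large population, D is reduced by a factor (1 - c) each generation, so that r2 (proportional to D squared at constant allele frequencies) is reduced by (1 - c)^2. Whether the original figure was drawn with exactly this formula is an assumption.

```python
def r2_after(generations, c, r2_initial=1.0):
    """r2 remaining after a number of generations of recombination at rate c.

    Assumes random mating, no drift and constant allele frequencies.
    """
    return r2_initial * (1.0 - c) ** (2 * generations)


for c in (0.001, 0.01, 0.05, 0.1, 0.2, 0.5):
    print(c, [round(r2_after(t, c), 3) for t in (1, 5, 10, 25)])
```

With one round of recombination (F2/BC) even loosely linked markers retain substantial LD, whereas after many generations only tightly linked markers do.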
Scope of QTL Detection Strategy
• Targeted – e.g. candidate gene approach
  • Look for QTL in a targeted region of the genome
• Genome-wide – genome scan approach
  • Place markers across the genome
  • Look for associations of markers with trait phenotype across the genome
  • Identify QTL across the genome
[Figure: chromosome pair with markers M1-M6 / m1-m6 and a QTL (Q/q) between markers 3 and 4]
Overview of Strategies for QTL mapping
[Figure: LD (r2) against generation (0-25) for recombination rates c = .001, .01, .05, .1, .2 and .5]
Line/breed cross: linkage analysis with LD markers (F2 / BC, AIL, RIL); LD used is population wide
Outbred population:
• Linkage analysis with LE markers (HS/FS families, extended pedigree); LD used is within family
• LD mapping with LD markers (candidate genes, high density); LD used is population wide
Strategies are compared on: recombination, LD extent, marker map, scope, map resolution
4. QTL detection in Line Crosses
Line crossing creates extensive Linkage Disequilibrium
[Figure: a line carrying M Q haplotypes is crossed (X) with a line carrying m q haplotypes; progeny carry mostly intact M Q and m q haplotypes]
QTL detection in Backcross of Inbred Lines
Parental lines: M Q / M Q x m q / m q ⇒ F1: M Q / m q
Backcross: F1 (M Q / m q) x m q / m q
c = recombination rate between marker and QTL
Progeny produced:
F1 gamete                  Frequency    Progeny      Genotypic value
M Q (non-recombinant)      1/2 (1-c)    M Q / m q    μ+d
M q (recombinant)          1/2 c        M q / m q    μ-a
m Q (recombinant)          1/2 c        m Q / m q    μ+d
m q (non-recombinant)      1/2 (1-c)    m q / m q    μ-a
Mean phenotype by marker genotype
YMm = μ - c a + (1-c) d    Ymm = μ - (1-c) a + c d
Contrast YMm - Ymm = (1-2c)(a+d)
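As a check on the backcross expectations, the sketch below computes the two marker class means and confirms that their difference equals (1-2c)(a+d). Function names and the example values of μ, a and d are illustrative assumptions.

```python
def backcross_marker_means(mu, a, d, c):
    """Expected phenotype of Mm and mm backcross progeny."""
    y_Mm = mu - c * a + (1 - c) * d        # mostly Qq, a fraction c are qq
    y_mm = mu - (1 - c) * a + c * d        # mostly qq, a fraction c are Qq
    return y_Mm, y_mm


def backcross_contrast(a, d, c):
    """Expected marker contrast YMm - Ymm = (1 - 2c)(a + d)."""
    return (1 - 2 * c) * (a + d)


for c in (0.0, 0.1, 0.2, 0.5):
    y_Mm, y_mm = backcross_marker_means(mu=10.0, a=4.0, d=2.0, c=c)
    print(c, round(y_Mm - y_mm, 3), round(backcross_contrast(4.0, 2.0, c), 3))
```

At c = 0.5 (marker unlinked to the QTL) the contrast vanishes, which is why a marker effect indicates linkage.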
BC has only 1 round of recombination
Line crossing creates extensive LD
[Figure: cross of M Q and m q lines, alongside LD (r2) against generation (0-25) for c = .001 to .5, with F2/BC at generation 1]
Contrast YMm-Ymm = (1-2c)(a+d)
⇒ marker doesn’t need to be close to the QTL to show an effect on phenotype
c = 0.2 → 1-2c = 0.6
⇒ marker with 0.2 rec. rate with QTL still shows 60% of QTL effect
General recommendation is a marker every 20 cM
⇒ each QTL is within 10 cM of a marker
F2 Cross between Inbred Lines
Parental lines: M Q / M Q x m q / m q ⇒ F1: M Q / m q; F2 = F1 x F1
c = recombination rate between marker and QTL
Each F1 parent produces gametes M Q and m q with frequency 1/2 (1-c) each, and M q and m Q with frequency 1/2 c each
F2 genotype frequencies (each F2 genotype combines two F1 gametes, so every term below carries a factor 1/4):
Marker genotype   QQ (μ+a)       Qq (μ+d)              qq (μ-a)
MM                (1-c)(1-c)     2 c (1-c)             c c
Mm                2 c (1-c)      2 (1-c)(1-c) + 2 cc   2 c (1-c)
mm                c c            2 c (1-c)             (1-c)(1-c)
Expected mean of marker genotypes
YMM = μ + (1-c)^2 a + 2c(1-c) d - c^2 a
YMm = μ + c^2 d + (1-c)^2 d
Ymm = μ + c^2 a + 2c(1-c) d - (1-c)^2 a
Contrasts
YMM - Ymm = 2(1-2c) a, i.e. a(1-2c) = (YMM - Ymm)/2
d(1-2c)^2 = YMm - 1/2 (YMM + Ymm)
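The F2 expectations can be verified numerically with the following sketch, which evaluates the three marker class means and recovers (1-2c)a and (1-2c)^2 d from them. Names and example values are illustrative assumptions.

```python
def f2_marker_means(mu, a, d, c):
    """Expected phenotype of the MM, Mm and mm marker classes in an F2."""
    nr = (1 - c) ** 2          # both gametes non-recombinant
    rec = c ** 2               # both gametes recombinant
    mixed = 2 * c * (1 - c)    # one recombinant, one non-recombinant gamete
    y_MM = mu + nr * a + mixed * d - rec * a
    y_Mm = mu + (nr + rec) * d
    y_mm = mu + rec * a + mixed * d - nr * a
    return y_MM, y_Mm, y_mm


def f2_marker_effects(y_MM, y_Mm, y_mm):
    """Marker-associated additive and dominance effects."""
    additive = (y_MM - y_mm) / 2            # expectation (1 - 2c) a
    dominance = y_Mm - (y_MM + y_mm) / 2    # expectation (1 - 2c)^2 d
    return additive, dominance


# A distant QTL (c = 0.25) with a = 20, d = 4
print(f2_marker_effects(*f2_marker_means(mu=100.0, a=20.0, d=4.0, c=0.25)))
```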
Summary
(marker M/m and QTL Q/q separated by recombination rate c)
                                                     Expectation if c = 0.5
Backcross:   (1-2c)(a+d) = YMm - Ymm                 = 0
F2 cross:    (1-2c) a = (YMM - Ymm)/2                = 0
             (1-2c)^2 d = YMm - 1/2 (YMM + Ymm)      = 0
Estimates confound QTL position and effect
E.g. if marker-associated effect (YMM - Ymm) / 2 = 10 kg (F2 cross)
• QTL could be near M with a = 10 (if c=0)
• QTL could be distant (c=0.25) with a = 20
• or any other possibility
• QTL can be on either side of the marker
But, if we test multiple markers and find the following marker-associated effects:
Marker:                      M1    M2    M3    M4    M5
(YMM - Ymm)/2 = (1-2c)a:     5     10    10    5     2.5
there is evidence that the QTL is between M2 and M3
(although we cannot exclude presence of multiple QTL)
5. QTL Interval Mapping in Line Crosses
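Use of flanking markers
To estimate QTL position and effect separately
[Figure: QTL Q between flanking markers M and N; c1 = recombination rate M-Q, c2 = recombination rate Q-N, θ = recombination rate M-N]
Backcross: F1 (M Q N / m q n) X m q n / m q n
θ = assumed known
Contrast YMm-Ymm = (1-2c1)(a+d)
Contrast YNn-Ynn = (1-2c2)(a+d)
No interference → θ = c1 + c2 - 2c1c2
⇒ 3 equations, 3 unknowns: c1, c2, (a+d)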
Interval Mapping
To estimate QTL position and effect separately
Backcross: F1 (M Q N / m q n) X m q n / m q n
c1 = recombination rate M-Q, c2 = recombination rate Q-N, θ = recombination rate M-N
Use θ = c1 + c2 - 2c1c2
F1 gametes and progeny:
Gamete   Frequency            Value   Marker class frequency   Pr(Q | marker data) = XQ
M Q N    1/2 (1-c1)(1-c2)     μ+d     1/2 (1-θ)                (1-c1)(1-c2)/(1-θ)
M q N    1/2 c1 c2            μ-a
M Q n    1/2 (1-c1) c2        μ+d     1/2 θ                    (1-c1) c2 /θ
M q n    1/2 c1 (1-c2)        μ-a
m Q N    1/2 c1 (1-c2)        μ+d     1/2 θ                    c1 (1-c2)/θ
m q N    1/2 (1-c1) c2        μ-a
m Q n    1/2 c1 c2            μ+d     1/2 (1-θ)                c1 c2 /(1-θ)
m q n    1/2 (1-c1)(1-c2)     μ-a
XQ depends on the assumed QTL position (through c1 and c2)
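The conditional probabilities XQ in the table can be computed directly from c1 and c2. The sketch below is a plain transcription of those four expressions; the function name and the class labels ("MN", "Mn", "mN", "mn", naming the marker alleles received from the F1 parent) are assumptions made for illustration. It requires 0 < θ < 1.

```python
def prob_q_given_markers(c1, c2):
    """Pr(progeny carries Q | flanking marker class) in a backcross.

    Assumes no interference, so theta = c1 + c2 - 2*c1*c2.
    """
    theta = c1 + c2 - 2 * c1 * c2          # recombination rate between M and N
    return {
        "MN": (1 - c1) * (1 - c2) / (1 - theta),
        "Mn": (1 - c1) * c2 / theta,
        "mN": c1 * (1 - c2) / theta,
        "mn": c1 * c2 / (1 - theta),
    }


x = prob_q_given_markers(0.1, 0.2)
print({k: round(v, 4) for k, v in x.items()})
```

Because the non-recombinant classes (MN, mn) and the recombinant classes (Mn, mN) are mirror images, their probabilities sum to 1 within each pair.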
E(Yi|Marker Genotype)
• Two possible QTL genotypes: Qq or qq
– If Qq, E(Yi|Qq) = μ + d
– If qq, E(Yi|qq) = μ – a
• Put those two together with P(Qq | gmarker) = XQi and P(qq | gmarker) = 1 – XQi
• E(Yi | M) = (μ + d)XQi + (μ - a)(1 - XQi)
  = (μ - a) + (a + d)XQi
  = m + bQ XQi, with m = μ - a and bQ = a + d
⇒ Regression model: Yi = m + bQ XQi + e
Regression Interval Mapping
Estimate QTL position and effect separately
Haley and Knott (1992) Heredity 69: 315
Backcross regression model
Yi = m + bQ XQi + ei
E(bQ) = a+d
[Figure: QTL Q between markers M and N, with c1 = recombination rate M-Q, c2 = recombination rate Q-N, θ = recombination rate M-N]
Fit model for various positions of QTL (e.g. in steps of 1 cM)
Position with lowest RSS or highest F-test gives best estimate of c1 and bQ (=a+d)
[Figure: F-value (0-12) against position (0-50 cM)]
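A minimal sketch of the backcross regression scan follows, using NumPy. It is not the Haley and Knott implementation: it assumes a single marker interval, converts map distance to recombination rate with the Haldane mapping function (consistent with the no-interference assumption above), and codes each marker as 1 if the progeny received the M (or N) allele from the F1 parent and 0 otherwise. All function names are hypothetical.

```python
import numpy as np


def haldane(cm):
    """Recombination rate for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - np.exp(-2.0 * cm / 100.0))


def x_q(marker_m, marker_n, pos_cm, interval_cm):
    """Pr(Qq | flanking markers) for a putative QTL pos_cm from marker M."""
    c1, c2 = haldane(pos_cm), haldane(interval_cm - pos_cm)
    theta = c1 + c2 - 2 * c1 * c2
    p_m = np.where(marker_m == 1, 1 - c1, c1)
    p_n = np.where(marker_n == 1, 1 - c2, c2)
    denom = np.where(marker_m == marker_n, 1 - theta, theta)
    return p_m * p_n / denom


def scan(y, marker_m, marker_n, interval_cm, step_cm=1.0):
    """Fit Yi = m + bQ XQi + ei at each position; return the best fit.

    Returns (position in cM, m, bQ, residual sum of squares).
    """
    best = None
    for pos in np.arange(0.0, interval_cm + step_cm / 2, step_cm):
        X = np.column_stack([np.ones_like(y), x_q(marker_m, marker_n, pos, interval_cm)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[3]:
            best = (float(pos), float(coef[0]), float(coef[1]), rss)
    return best


# Noise-free illustration: phenotypes set to their expectation for a QTL at 23 cM
m_allele = np.array([1, 1, 0, 0] * 5)
n_allele = np.array([1, 0, 1, 0] * 5)
y = 8.0 + 3.0 * x_q(m_allele, n_allele, 23.0, 50.0)   # m = 8, bQ = a + d = 3
print(scan(y, m_allele, n_allele, 50.0))
```

With real data the phenotypes include residual variation, so the position with the lowest RSS is only an estimate and should be accompanied by a significance test.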
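F2 Cross between Inbred Lines
[Figure: M Q N / M Q N X m q n / m q n ⇒ F1 (M Q N / m q n) X F1 ⇒ F2; QTL Q lies between markers M and N, with c1 = recombination rate M-Q, c2 = recombination rate Q-N, θ = recombination rate M-N]
For each of the nine F2 marker genotypes (gmarkers: MM NN, MM Nn, MM nn, Mm NN, Mm Nn, Mm nn, mm NN, mm Nn, mm nn), Pr(QQ|gmarkers), Pr(Qq|gmarkers) and Pr(qq|gmarkers) are each a function f(c1,c2,θ)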
F2 Cross between Inbred Lines
Haley and Knott (1992) Heredity 69: 315
For each marker genotype (MM NN, MM Nn, MM nn, Mm NN, Mm Nn, Mm nn, mm NN, mm Nn, mm nn), Pr(QQ), Pr(Qq) and Pr(qq) are functions f(c1,c2,θ)
Additive coef.: Xadd = Pr(QQ) - Pr(qq)
Dom. coef.: Xdom = Pr(Qq)
Yi = μ + ba Xadd,i + bd Xdom,i + ei
At QTL position: E(ba) = a, E(bd) = d
Fitted at each 1 cM position on chromosome
Position with highest F-test → QTL (if significant)
6. QTL detection in line crosses
Additional Topics
(see also Lynch and Walsh ch 15)
a. Significance test for presence of QTL
b. Accuracy of position estimates
   • Advanced intercross lines
c. Breed Crosses (vs inbred line crosses)
6a. How to decide if you’ve detected a QTL?
Test statistic (e.g. F or LR) > threshold T
Set T to control the Type I error rate (False Positives)
• Comparison-wise test at 5%: set threshold T such that:
  • Prob(test > T | no QTL) < .05, i.e. allow 5% FP tests
Possible outcomes for test for QTL at a given position:
True state             Accept Ho                         Reject Ho
Ho is true (no QTL)    True negative                     False positive (Type I error)
Ho is false (QTL)      False negative (Type II error)    True positive
Expected result for tests at 100 positions on chromosome with NO QTL at 5% comparison-wise test level:
True state             Accept Ho             Reject Ho
Ho is true (no QTL)    95                    5 (Type I error)
Ho is false (QTL)      0 (Type II error)     0
⇒ Significance testing complicated by:
• Large # tests performed (many markers, QTL positions)
• At α = 0.05, 5% of tests significant even if no QTL exist
• Tests on the same chromosome are dependent
• Bonferroni adjustment (α* = α/(# tests)) is too stringent
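The scale of the multiple-testing problem can be illustrated with a few lines of Python. The probability of at least one false positive is computed here under the assumption of independent tests, which is not true for linked positions on a chromosome; that dependence is exactly why the Bonferroni threshold is too stringent. Function names are illustrative.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant tests when no QTL exists."""
    return n_tests * alpha


def prob_at_least_one_fp(n_tests, alpha=0.05):
    """P(at least one false positive), assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Comparison-wise level alpha* = alpha / (# tests)."""
    return alpha / n_tests


print(expected_false_positives(100))
print(round(prob_at_least_one_fp(100), 3))
print(bonferroni_alpha(100))
```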
Strategies to control % false positives (%FP)
(Lander & Kruglyak, 1995, Nature Genetics 11: 241-247)
• Chromosome-wise test - control % FP at chrom. level
  • Account for multiple (correlated) tests on chrom.
  • # FP/chromosome ≥ 1 on 5% of chromosomes
• Experiment-wise test - control %FP within experiment
  • Account for all tests conducted in experiment
  • # FP/experiment ≥ 1 on 5% of experiments
• Genome-wise test - control % FP at genome level
  • Account for all tests conducted on the genome
  • # FP/genome ≥ 1 on 5% of genomes tested
• Significance Levels (Lander & Kruglyak, 1995)
  • Significant Linkage at p < .05 : Prob(≥ 1 FP) < .05
  • Suggestive Linkage : on average 1 false positive test per genome scan
Computing significance thresholds
• Adjust tabulated test statistic values by the equation of Lander & Kruglyak (1995)
  • Assumes high-density marker map
• Develop empirical threshold based on permutation test (Churchill and Doerge, 1994, Genetics 138:963)
  • Simulate data under the Null Hypothesis (= no QTL)
  • Compute test statistic (F-test / LR)
  • Replicate many times
  • Determine 95% level of test statistic (for 5% test)
Significance thresholds by Permutation test (Churchill & Doerge, 1994 Genetics 138:963)
• Simulate data under the Null Hypothesis (= no QTL) by randomly permuting marker genotypes relative to phenotypes
• Compute test statistic (F-test / LR)
• Replicate many times
• Determine 95% level of test statistic (for 5% test)
Original data                                 Randomly permuted data
Animal ID   Marker genotype   Phenotype       Animal ID   Marker genotype   Phenotype
1           Mmnn              9.8             1           MmNn              9.8
2           mmnn              10.4            2           mmNn              10.4
3           mmnn              9.3             3           Mmnn              9.3
4           Mmnn              8.5             4           MmNn              8.5
5           MmNn              11.3            5           mmnn              11.3
6           MmNn              9.6             6           MmNn              9.6
7           MmNn              9.9             7           Mmnn              9.9
8           mmnn              7.6             8           mmnn              7.6
9           MmNn              8.0             9           MmNn              8.0
10          mmNn              10.7            10          mmnn              10.7
[Figure: distribution of the test statistic under the Null Hypothesis across replicates; the threshold separates the lower 95% from the upper 5% of the distribution]
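The permutation procedure can be sketched as follows with NumPy. This is an illustration under simplifying assumptions, not the Churchill and Doerge software: each marker is tested by single-marker regression (rather than interval mapping), markers are coded 1 for the heterozygote (Mm or Nn) and 0 otherwise, and the threshold is taken on the maximum F across markers so that it accounts for all tests. Permuting phenotypes against fixed genotypes is equivalent to permuting genotypes against fixed phenotypes.

```python
import numpy as np


def f_statistic(y, x):
    """F-test (1 numerator df) for simple regression of y on x."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return float((tss - rss) / (rss / (n - 2)))


def permutation_threshold(y, marker_columns, n_perm=1000, level=0.95, seed=0):
    """Empirical threshold for the maximum F over all markers tested."""
    rng = np.random.default_rng(seed)
    max_f = np.empty(n_perm)
    for r in range(n_perm):
        y_perm = rng.permutation(y)   # breaks any marker-phenotype association
        max_f[r] = max(f_statistic(y_perm, x) for x in marker_columns.T)
    return float(np.quantile(max_f, level))


# The ten animals of the example; columns are markers M and N
y = np.array([9.8, 10.4, 9.3, 8.5, 11.3, 9.6, 9.9, 7.6, 8.0, 10.7])
markers = np.array([[1, 0], [0, 0], [0, 0], [1, 0], [1, 1],
                    [1, 1], [1, 1], [0, 0], [1, 1], [0, 1]], dtype=float)
observed = max(f_statistic(y, x) for x in markers.T)
threshold = permutation_threshold(y, markers, n_perm=500)
print(observed, threshold, observed > threshold)
```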
Control of False Discovery Rate (FDR)
True state      Accept Ho             Reject Ho
Ho is true      U                     V (Type I error)
Ho is false     T (Type II error)     S
FDR - Control the expected proportion of significant tests that are false positives
    - Control E( V / (V+S) )
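One widely used way to control E(V/(V+S)) is the Benjamini-Hochberg step-up procedure. The slides do not name a specific procedure, so its use here is an assumption, and the p-values in the example are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of tests declared significant at the given FDR level.

    Benjamini-Hochberg step-up rule: find the largest rank k such that
    p_(k) <= fdr * k / n, and reject the k tests with the smallest p-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / n:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values from ten tests
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, fdr=0.05))
```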
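Frequency Distribution of p-values across many tests
[Figure: histogram (frequency 0-400) of p-values from 0 to 1; tests with H0 true are spread evenly, tests with H0 false pile up near 0. FDR is low among tests with small p-values and high among tests with larger p-values]
See notes “False discovery rate.doc” for further details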
6b. Accuracy of position estimates
• Advanced intercross lines
Replicate Genome Scan results for F2
N=500, 6 markers, trait with SD=2
QTL at 23 cM, a=1, d=0.5 → 14% of variance
[Figure: ten panels, one per replicate, each showing F-value against position (0-50 cM)]
Replicate   Position   a       d
1           15         0.791   0.19
2           24         1.56    0.19
3           23         1.03    0.3
4           27         0.771   0.24
5           20         1.201   0.93
6           28         1.35    0.94
7           22         0.96    0.14
8           13         0.991   0.64
9           17         0.94    0.52
10          29         0.924   1.44
Average     21.8       1.052   0.55
St.dev.     5.231      0.236   0.41
TRUE        23         1       0.5
F2 cross design
F0: M M X m m (Line 1 X Line 2)
F1: M m X M m
F2: MM, Mm, mm
• Large chunks
• High LD
• Only 1 round of recombination
⇒ low accuracy of QTL position
[Figure: LD (r2) against generation (0-25) for c = .001 to .5, with F2/BC at generation 1]
Resolving Power of QTL Mapping
(Darvasi & Soller 1997. Behavior Genetics)
Approximate 95% confidence interval for QTL location (cM) ~ 3000/(k N a^2)
k=1 for BC, k=2 for F2
N = population size
[Figure: 95% CI for QTL location (0-25 cM) against mapping population size (0-10000) for BC and F2, for a = .5 σp]
Increase resolution with advanced intercross lines
• Recombination breaks genome up in smaller pieces
  - reduces LD except at short distance
(Darvasi & Soller 1995 Genetics)
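The approximation can be turned into a small calculator. The sketch assumes that a is expressed in phenotypic standard deviation units, as suggested by "a = .5 σp" in the figure; the function name is hypothetical.

```python
def qtl_ci95_cm(n, a, design="F2"):
    """Approximate 95% CI (cM) for QTL location: 3000 / (k * N * a^2).

    n      : mapping population size
    a      : QTL effect in phenotypic SD units (assumption)
    design : "BC" (k = 1) or "F2" (k = 2)
    """
    k = {"BC": 1, "F2": 2}[design]
    return 3000.0 / (k * n * a ** 2)


for n in (500, 1000, 5000):
    print(n, qtl_ci95_cm(n, 0.5, "BC"), qtl_ci95_cm(n, 0.5, "F2"))
```

Halving the interval requires doubling the population size, which is why fine mapping in F2/BC designs quickly becomes impractical.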
Strategies to increase accuracy of estimates of QTL position in line crosses
F2/BC:
• Increasing marker density has limited effect
• Increase population size
Advanced intercross lines
⇒ Higher accuracy of QTL position
• Requires more markers to maintain power to detect QTL (lower LD)
[Figure: LD (r2) against generation (0-25) for c = .001 to .5, with F2/BC at generation 1 and AIL at later generations]
Recent LD extends over large distances
[Figure: LD (r2) against distance (0-30 cM) after 1, 2, 5, 10, 20, 50 and 100 generations of recombination. LD extends over long distances if created recently, and only over short distances if created long ago]
Overview of Strategies for QTL mapping
[Figure: LD (r2) against generation (0-25) for recombination rates c = .001, .01, .05, .1, .2 and .5]
Line/Breed cross (linkage analysis, LD markers):
              F2 / BC           AIL
LD used       Population wide
Recomb.       1 rnd             >1 rnd
LD extent     Long              Smaller
Marker map    Sparse            Denser
Coverage      Genome wide
Map resol.    Poor              Better
Outbred population:
• Linkage analysis, LE markers: HS/FS families, ext. pedigree
• LD mapping, LD markers: cand. genes, high density
6c. Breed Crosses (vs inbred line crosses)
QTL mapping in livestock
Using F2 cross between outbred breeds
Berkshire x Yorkshire F2 cross
F0: 2 Berkshire sires (BB, marker haplotypes M1 N1) x 9 Yorkshire dams (YY, marker haplotypes M2 N2)
F1: 8 BY sires x 26 BY dams
F2: 525 progeny of breed origin BB, BY, YB or YY
Breed origin probabilities PBB, PBY, PYB, PYY derived for a given position
F2 Cross between breeds
Haley and Knott (1992) Heredity 69: 315
Identical to cross of inbreds but follow B vs. Y alleles
For each marker genotype (MM NN, MM Nn, MM nn, Mm NN, Mm Nn, Mm nn, mm NN, mm Nn, mm nn), Pr(BB), Pr(BY) and Pr(YY) are functions f(c1,c2,θ)
Additive coef.: Xadd = Pr(BB) - Pr(YY)
Dom. coef.: Xdom = Pr(BY)
Yi = μ + ba Xadd,i + bd Xdom,i + ei
At QTL position: E(ba) = a, E(bd) = d
Fitted at each 1 cM position on chromosome
Position with highest F-test → QTL (if significant)
SSC1 MARBLING
Line-Cross: a = -0.13, d = +0.19
[Figure: -logP (0.0-4.5) against position (0-130 cM) on SSC1, with 5% and 1% chromosome-wise thresholds]
Breed cross (F0 X → F1 → F2): detect QTL that differ in frequency between breeds
Breed cross interval mapping
F0: 2 Berkshire sires (BB) x 9 Yorkshire dams (YY)
F1: 8 BY sires x 26 BY dams
F2: 525 progeny (BB, BY, YB, YY)
Compares average Berk allele to average York allele
⇒ QTL only detected if breeds differ in frequency
Frequency of Q: Berk pB, York pY
QTL effect: QQ +a, Qq d, qq -a
Line cross additive effect = (pB-pY)a
Line cross dominance effect = (pB-pY)d
Summary of QTL mapping in Line/Breed Crosses
• QTL detection requires LD between markers and QTL
• Cross → extensive LD
  → genome scan with markers @ 20 cM
• Regression interval mapping
  → estimate QTL position, effect
• Estimates have limited accuracy
  → 10 – 30 cM confidence intervals
• Fine mapping not limited by # markers but requires
  • larger populations
  • crosses that accumulate recombinations
    • Recombinant Inbred Lines
    • Advanced Intercross Lines
• Only detects QTL that differ between breeds
Breed cross QTL scan
F0: 2 Berkshire sires (BB) x 9 Yorkshire dams (YY)
F1: 8 BY sires x 26 BY dams
F2: 525 progeny (BB, BY, YB, YY)
⇒ QTL that differ in frequency between breeds
⇒ Wide QTL region (20-50 cM)
Within-breed MAS requires QTL that segregate within breeds
Follow-up within-breed research in QTL region:
⇒ Linkage mapping, Evans et al. (2003 Genetics:621) → see next
⇒ LD mapping → day 3
  - confirmed QTL in 10 commercial lines
7. QTL detection in outbred populations – linkage analysis
e.g. livestock, wildlife, human
Reading
Dekkers and van der Werf (2007) Chapter 10 at
http://www.fao.org/docrep/010/a1120e/a1120e00.htm
LD always exists within families
LD behavior similar to BC/F2
[Figure: a sire with haplotypes M Q and m q; after meiosis his progeny inherit mostly non-recombinant M Q or m q haplotypes, with some recombinants (M q, m Q). Alongside: LD (r2) against generation (0-25) for c = .001 to .5, with half-sib (HS) families at generation 1]
⇒ Marker - QTL LD among progeny at large distance
QTL mapping in half-sib family design
Within-family LD not consistent across families
Sire 1: M Q / m q    Sire 2: M q / m Q    Sire 3: M Q / m Q    Sire 4: M q / m q
⇒ Analysis must allow for different marker-QTL linkage phases within each family
QTL effects must be fitted w/in family:
Yij = μi + αQ,i PQ,ij + eij
PQ,ij = Prob(QMi | marker genotype, QTL position)
αQ,i = QTL allele substitution effect for sire i
See e.g. Knott et al. Theor.Appl.Genet. 1996. 93: 71-80
Power of alternative QTL mapping designs
For given number of animals genotyped: F2 > BC > Fullsib > Halfsib
Typical size used: > 500 animals (F2, BC); > 1000 animals (Fullsib, Halfsib)
Outbred designs: Fraction p^2+q^2 of parents are homozygous for QTL = non-informative
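Daughter design for QTL detection and MAS
[Figure: sire with marker genotype Mm; daughters are grouped by the marker allele (M or m) inherited from the sire]
Compare production
Grand daughter design
[Figure: grandsire with marker genotype Mm; sons are grouped by the marker allele (M or m) inherited from the grandsire]
Compare progeny test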
Grand-daughter Design (Weller et al. 1990)
Grand sire M Q / m q (c = recombination rate) X dams of unknown genotype (? ? / ? ?)
Sons genotyped for marker:
Haplotype from grand sire   Frequency    Value
M Q                         1/2 (1-c)    μ + 1/2 α
M q                         1/2 c        μ - 1/2 α
m q                         1/2 (1-c)    μ - 1/2 α
m Q                         1/2 c        μ + 1/2 α
Mean phenotype of progeny for each son (or son’s EBV or deregressed EBV), averaged by marker allele inherited:
Sons that inherited M: μ + 1/4 (1-2c)α
Sons that inherited m: μ - 1/4 (1-2c)α
Contrast of average EBV of sons M? - m? = 1/2 (1-2c)α
Overview of Strategies for QTL mapping
[Figure: LD (r2) against generation (0-25) for recombination rates c = .001, .01, .05, .1, .2 and .5]
              Line/Breed cross                 Outbred population
              Linkage analysis, LD markers     Linkage analysis, LE markers       LD mapping, LD markers
              F2 / BC         AIL              HS/FS families    Ext. pedigree    Cand. genes    High density
LD used       Population wide                  Within family
Recomb.       1 rnd           >1 rnd           1 rnd             >1 rnd
LD extent     Long            Smaller          Long              Smaller
Marker map    Sparse          Denser           Sparse            Denser
Coverage      Genome wide                      Genome wide
Map resol.    Poor            Better           Poor              Better
Linkage Analysis in extended pedigrees by random QTL effects - see later
8. Summary and limitations of QTL mapping in outbred populations using sparse markers
• Within family → extensive LD
  → genome scan with markers @ 20 cM
• Regression interval mapping
  → estimate QTL position, effect
• Estimates of marker/QTL effects differ by family
  → complicates MAS
• Estimates have limited accuracy
  → 10 – 30 cM confidence intervals
• Fine mapping not limited by # markers but requires
  • larger populations
  • Populations that accumulate recombinations
    • Linkage analysis in deep pedigrees
    • Historical recombination → LD mapping
Software for QTL mapping
by linkage analysis
Many programs available (with tutorials)
See: http://linkage.rockefeller.edu/soft/list.html
• For inbred line crosses: Mapmaker QTL
http://www.broad.mit.edu/genome_software/other/qtl.html
http://darwin.eeb.uconn.edu/notes/qtl-mapmaker.pdf
• For breed crosses and outbred populations: QTL Express
http://qtl.cap.ed.ac.uk/
Day 2
QTL Detection
Objective
Present principles for detection of genes affecting quantitative traits (QTL) using genetic markers in ‘simple’ experimental designs
Concepts covered relevant to issues in ‘genomic selection’
1. Single locus quantitative genetic model
2. Principle of use of LD to detect QTL using markers
3. Overview of strategies for QTL detection
4. QTL detection using line crosses
5. QTL interval mapping in line crosses
6. QTL detection in line crosses – additional topics
   a. Significance testing
   b. Accuracy of position estimates
   c. Breed crosses (vs inbred line crosses)
7. QTL detection in outbred populations – linkage analysis
8. Summary and limitations → need for LD mapping
9. Software for QTL mapping
Extra notes: Multiple QTL problem
What if there is more than 1 QTL linked to the marker?
[Figure: backcross with 2 QTL; marker M/m lies between Q1/q1 (recombination rate c1) and Q2/q2 (recombination rate c2)]
Backcross: E(YMm-Ymm) = (1-2c1)(a1+d1) + (1-2c2)(a2+d2)
⇒ Marker picks up combined effect of both QTL
Possible result from fitting 1-QTL model:
→ Ghost QTL (if in coupling phase)
→ or no QTL (if in repulsion phase)
[Figure: F-value (0-16) against position (0-50 cM), with the positions of QTL 1 and QTL 2 indicated]
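The coupling and repulsion cases can be illustrated by evaluating the two-QTL contrast directly. In this sketch repulsion is represented by giving the second QTL an effect of opposite sign; the function name and effect sizes are assumptions for illustration.

```python
def two_qtl_backcross_contrast(c1, a1, d1, c2, a2, d2):
    """E(YMm - Ymm) when two QTL are linked to the same marker."""
    return (1 - 2 * c1) * (a1 + d1) + (1 - 2 * c2) * (a2 + d2)


# Coupling: both QTL alleles linked to M increase the trait
coupling = two_qtl_backcross_contrast(0.1, 1.0, 0.0, 0.1, 1.0, 0.0)
# Repulsion: effects of opposite sign cancel at the marker
repulsion = two_qtl_backcross_contrast(0.1, 1.0, 0.0, 0.1, -1.0, 0.0)
print(coupling, repulsion)
```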
Solution – for inbred line crosses
Composite Interval Mapping (CIM)
Add markers as co-factors to control for QTL in other intervals
[Figure: chromosome with markers A, B, C, D, E, F]
E.g. when mapping a QTL in interval C-D, include B and E as co-factors:
Yi = m + ba Xadd,i + bd Xdom,i + bB XB,i + bE XE,i + ei
• ba Xadd,i + bd Xdom,i: affected only by QTL in B – E; use to detect QTL in C-D interval
• bB XB,i + bE XE,i: controls for QTL outside B and outside E
In general – include markers just outside the interval as co-factors
Can include other (unlinked) QTL markers as co-factors to reduce residual var.
There’s no single perfect strategy on how to choose co-factors
Multiple QTL mapping in breed crosses
Composite interval mapping is not possible because markers may not be completely informative
Alternative: Fit 2-QTL models:
Yi = m + ba1 Xadd,1,i + bd1 Xdom,1,i + ba2 Xadd,2,i + bd2 Xdom,2,i + ei
E.g. - fix QTL 1 at best position
     - scan chromosome for best position of QTL 2
Test statistic is LRT = Likelihood ratio test
= -2 ln[likelihood 1 QTL model / likelihood 2 QTL model]
~ Chi-square
See QTLExpress http://qtl.cap.ed.ac.uk/