M2_Inbreeding - Crop and Soil Science

PBG 650 Advanced Plant Breeding
Module 2: Inbreeding
• Genetic Diversity
– A few definitions
• Small Populations
– Random drift
– Changes in variance, genotypes
• Mating Systems
– Inbreeding coefficient from pedigrees
– Coefficient of coancestry
– Regular systems of inbreeding
Genetic Diversity Studies - Applications
• Highlight geographic areas for further germplasm collection
• Establish core collections
– preserve genetic resources
– representative samples for genetic studies
• Investigate theories regarding crop domestication and origin
• Determine genes involved in domestication
– Lower diversity in domesticated species than in wild relatives
• Selection of parents for a breeding program
– Identify untapped sources of genetic variation
• Establish effective breeding methods
• Define heterotic groups for inbred/hybrid development
Take PBG 620, 621, 622!
Measures of Genetic Diversity
• Number of alleles or SNPs identified
• Average number of alleles per locus
• Effective alleles per locus = $A_e = \dfrac{1}{1-h} = \dfrac{1}{\sum_i p_i^2}$
• Major allele frequency = MAF
• Average expected heterozygosity (Nei's gene diversity) = $H_e$
• Observed heterozygosity = Ho
• Polymorphic Information Content = PIC values
• Polymorphism or % of polymorphic loci= Pj
– A locus is considered polymorphic if the frequency of the
major allele is less than 0.95 (or 0.99)
Average expected heterozygosity
• Also called Nei's gene diversity
– One locus (j), two alleles
$h_j = 1 - p^2 - q^2 = 2pq$
– One locus (j), with i alleles
$h_j = 1 - \sum_i p_i^2$
– Average across loci
$H_e = \dfrac{\sum_{j=1}^{L} h_j}{L}$
• The average He across loci measures the extent of variation in a population
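The diversity measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the example allele frequencies are hypothetical.

```python
def locus_diversity(freqs):
    """Expected heterozygosity at one locus: h_j = 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies at a locus must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Effective number of alleles: A_e = 1 / sum(p_i^2) = 1 / (1 - h_j)."""
    return 1.0 / sum(p * p for p in freqs)


def average_he(loci):
    """Average expected heterozygosity across L loci: H_e = sum(h_j) / L."""
    return sum(locus_diversity(freqs) for freqs in loci) / len(loci)


# Hypothetical allele frequencies at three loci (illustration only).
example_loci = [
    [0.5, 0.5],                # two equally frequent alleles
    [0.9, 0.1],                # one common and one rare allele
    [0.25, 0.25, 0.25, 0.25],  # four equally frequent alleles
]

for freqs in example_loci:
    print(freqs, locus_diversity(freqs), effective_alleles(freqs))
print("He =", average_he(example_loci))
```

With equally frequent alleles, the effective number of alleles equals the actual number of alleles; rare alleles contribute little to either Ae or He.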
Steps in Diversity Analysis
1. Characterize the diversity
– Genotyping
– Phenotyping
2. Calculate relationships
– Genetic distance
3. Express relationships with a classification and/or ordination method
– Classification or clustering
– Ordination (e.g., PCA)
Recent Studies of Crop Diversity
Xu, X., et al., 2012. Resequencing 50 accessions of cultivated and wild rice yields markers for
identifying agronomically important genes. Nature Biotechnology 30: 105–111.
Population size
• Sampling can lead to changes in gene frequency in small populations
• Changes are random in direction (dispersive), but predictable in amount
– random drift: accumulation of small changes due to sampling over time
– differences among subgroups of the population increase over time
– increase in uniformity and level of homozygosity within subgroups (Wahlund effect)
• Two perspectives
– changes in variances due to sampling
– changes in genotype frequencies due to inbreeding
Falconer, Chapt. 3
Dispersive process - idealized population
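[Figure: a base population (N = ∞) at t = 0 is divided into sub-populations. In each generation (t = 1, t = 2, ...) a sample of 2N gametes from each sub-population forms the N individuals of that sub-population in the next generation.]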
Idealized population assumptions
• Mating occurs within sub-populations
• Mating is at random (including self-fertilization)
• Sub-populations are equal in size
• Generations do not overlap
• No mutation, migration or selection
⇒ No change in the average gene frequency among sub-populations over generations: $\bar{q} = q_0$
Random drift (genetic drift)
Sampling process:
$\Pr(k) = \dfrac{(2N)!}{k!\,(2N-k)!}\, p^k (1-p)^{2N-k}$
= probability of obtaining k copies of an allele with frequency p in the next generation
• Every generation, the sampling of gametes within each sub-population centers around a new allele frequency, so changes accumulate over time
• Changes occur at a faster rate in smaller populations
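A minimal Python sketch of this sampling process follows. It is an illustration only: the function names, the seed, and the choice of N = 16 and p = 0.5 are assumptions, and the simulation draws 2N gametes one at a time rather than using any particular library routine.

```python
import math
import random


def prob_k_copies(k, n_individuals, p):
    """Binomial probability of k copies among 2N gametes, allele frequency p."""
    two_n = 2 * n_individuals
    return math.comb(two_n, k) * p**k * (1 - p) ** (two_n - k)


def simulate_drift(p0, n_individuals, generations, rng):
    """Allele frequency trajectory in one sub-population under random drift."""
    two_n = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Draw 2N gametes; each carries the allele with the current frequency p.
        k = sum(1 for _ in range(two_n) if rng.random() < p)
        p = k / two_n  # sampling in the next generation centers on this value
        trajectory.append(p)
    return trajectory


# Hypothetical run: one sub-population of N = 16 starting at p = 0.5.
rng = random.Random(1956)  # arbitrary seed so the run is repeatable
trajectory = simulate_drift(0.5, 16, 19, rng)
print(trajectory)
```

Running `simulate_drift` for many sub-populations and comparing small and large N shows the frequencies dispersing faster when N is small.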
Random drift (genetic drift)
• Gene frequencies in the sub-populations drift apart over time, until all frequencies become equally probable (steady state)
• Once the steady state is attained, the rate of fixation is 1/N in each generation
• The long-term effect of drift for a finite population is a loss of genetic variation
• Historical effects of drift are locked in (founder effect or bottleneck effect)
[Figure: eye color in Drosophila; 105 populations, N=16; at t=0 f(bw)=f(bw75)=0.5]
Buri, Peter. 1956. Gene frequency in small populations of mutant Drosophila. Evolution 10:367-402.
Dispersive process – effects on variance
Variance in gene frequency among sub-populations at t=1:
$\sigma^2_{\Delta q} = \sigma^2_q = \dfrac{p_0 q_0}{2N}$
Variance among sub-populations increases in each generation. At time t:
$\sigma^2_q = p_0 q_0 \left[1 - \left(1 - \dfrac{1}{2N}\right)^t\right]$
$= p_0 q_0$ at $t = \infty$
Change in genotype frequency
• As gene frequencies become more dispersed towards the extremes
– there is an increase in homozygosity and decrease in heterozygosity within each sub-population
– genetic uniformity increases within sub-populations

Genotype   Frequency across sub-populations
A1A1       $p_0^2 + \sigma^2_q$
A1A2       $2p_0q_0 - 2\sigma^2_q$
A2A2       $q_0^2 + \sigma^2_q$
Definition of inbreeding
• inbreeding = mating of individuals that have common ancestors
• identical by descent (ibd) = alleles are direct descendants of a common ancestral allele (autozygous)
• identical in state = alleles have the same nucleotide sequence but descended from different ancestral alleles (allozygous)
• An individual is inbred if it has alleles that are identical by descent
Coefficient of inbreeding
• Probability that two alleles at any locus in an individual are ibd (also applies to alleles sampled at random from the population)
• Must be in relation to a base population
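Change in inbreeding in a single generation:
$\Delta F = \dfrac{1}{2N}$
Inbreeding at generation t:
$F_t = \dfrac{1}{2N} + \left(1 - \dfrac{1}{2N}\right) F_{t-1}$
(first term = new inbreeding, second term = old inbreeding)
Recurrence equation:
$F_t = 1 - (1 - \Delta F)^t$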
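As a quick check, the generation-by-generation form and the closed form of F can be compared numerically. This is a sketch under the idealized-population assumptions with F = 0 in the base population; the function names are hypothetical.

```python
def inbreeding_by_recurrence(n_individuals, generations):
    """Apply F_t = 1/(2N) + (1 - 1/(2N)) * F_(t-1), starting from F_0 = 0."""
    delta_f = 1.0 / (2 * n_individuals)
    f = 0.0
    values = [f]
    for _ in range(generations):
        f = delta_f + (1.0 - delta_f) * f  # new inbreeding + old inbreeding
        values.append(f)
    return values


def inbreeding_closed_form(n_individuals, t):
    """Closed form F_t = 1 - (1 - delta_F)^t with delta_F = 1/(2N)."""
    delta_f = 1.0 / (2 * n_individuals)
    return 1.0 - (1.0 - delta_f) ** t


# Example with N = 16: both forms give the same F in every generation.
for t, f in enumerate(inbreeding_by_recurrence(16, 10)):
    print(t, round(f, 4), round(inbreeding_closed_form(16, t), 4))
```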
Inbreeding
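Remember:
For a single generation
$\sigma^2_{\Delta q} = \sigma^2_q = \dfrac{p_0 q_0}{2N}$ and $\Delta F = \dfrac{1}{2N}$, so $\sigma^2_q = p_0 q_0 \Delta F$
At time t
$F_t = 1 - (1 - \Delta F)^t$
$\sigma^2_q = p_0 q_0 \left[1 - \left(1 - \dfrac{1}{2N}\right)^t\right] = p_0 q_0 F_t$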
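The link between the variance of gene frequency and F can be sketched as follows. The function names are hypothetical, and the example values (N = 16, p0 = 0.5) are chosen only for illustration.

```python
def inbreeding_at(n_individuals, t):
    """F_t = 1 - (1 - 1/(2N))^t for an idealized population."""
    return 1.0 - (1.0 - 1.0 / (2 * n_individuals)) ** t


def drift_variance(p0, n_individuals, t):
    """Variance of gene frequency among sub-populations: p0 * q0 * F_t."""
    q0 = 1.0 - p0
    return p0 * q0 * inbreeding_at(n_individuals, t)


# With N = 16 and p0 = 0.5 the variance starts at p0*q0/(2N)
# and approaches p0*q0 = 0.25 as t becomes large.
for t in (1, 10, 100, 1000):
    print(t, drift_variance(0.5, 16, t))
```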
Genotype frequencies with inbreeding
Genotype   Frequency across sub-populations                    Showing origin
A1A1       $p_0^2 + \sigma^2_q = p_0^2 + p_0 q_0 F$            $p_0^2 (1 - F) + p_0 F$
A1A2       $2p_0q_0 - 2\sigma^2_q = 2p_0q_0 - 2p_0q_0 F$       $2p_0q_0 (1 - F)$
A2A2       $q_0^2 + \sigma^2_q = q_0^2 + p_0 q_0 F$            $q_0^2 (1 - F) + q_0 F$
What will genotype frequencies be when the sub-populations are completely inbred?
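One way to explore this question is to evaluate the genotype frequencies for increasing F. The sketch below uses the $p_0^2 + p_0 q_0 F$ form of the frequencies; the function name and the value p0 = 0.3 are hypothetical.

```python
def genotype_frequencies(p0, f):
    """Return (A1A1, A1A2, A2A2) frequencies across sub-populations for a given F."""
    q0 = 1.0 - p0
    a1a1 = p0**2 + p0 * q0 * f
    a1a2 = 2 * p0 * q0 * (1.0 - f)
    a2a2 = q0**2 + p0 * q0 * f
    return a1a1, a1a2, a2a2


# F = 0 gives Hardy-Weinberg proportions; at F = 1 heterozygotes disappear
# and the homozygote frequencies equal the allele frequencies p0 and q0.
for f in (0.0, 0.5, 1.0):
    print(f, genotype_frequencies(0.3, f))
```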
Calculation of F from population data
Genotype   Frequency
A1A1       $p_0^2 (1 - F) + p_0 F$
A1A2       $2p_0q_0 (1 - F)$
A2A2       $q_0^2 (1 - F) + q_0 F$
F can be viewed as the deficiency in observed heterozygotes relative to expectation:
$F = \dfrac{H_e - H}{H_e} = \dfrac{2pq - 2pq(1 - F)}{2pq} = 1 - \dfrac{H}{H_e}$
H = observed frequency of heterozygotes
He = expected frequency of heterozygotes
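A minimal sketch of this calculation from genotype counts at a single biallelic locus is shown below. The function name and the counts are hypothetical, and He is taken as 2pq using allele frequencies estimated from the same sample.

```python
def inbreeding_from_counts(n_a1a1, n_a1a2, n_a2a2):
    """Estimate F = 1 - H/He from genotype counts at one biallelic locus."""
    n = n_a1a1 + n_a1a2 + n_a2a2
    p = (2 * n_a1a1 + n_a1a2) / (2 * n)  # frequency of allele A1
    q = 1.0 - p
    h_observed = n_a1a2 / n              # H: observed heterozygote frequency
    h_expected = 2 * p * q               # He: expected under random mating
    if h_expected == 0:
        raise ValueError("locus is monomorphic; F is undefined")
    return 1.0 - h_observed / h_expected


# Hypothetical sample of 100 plants: p = 0.5, H = 0.30, He = 0.50, so F = 0.4.
print(inbreeding_from_counts(35, 30, 35))
```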
F statistics – relative deficiency of heterozygotes
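$\dfrac{H_I}{H_T} = \left(\dfrac{H_I}{H_S}\right)\left(\dfrac{H_S}{H_T}\right)$
$(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$
I = individual, S = sub-population, T = total
[Figure: the idealized population diagram (base population, N = ∞, at t = 0; sub-populations of N individuals formed from 2N gametes) annotated with FST, FIT and FIS for individuals 1, 2, 3, 4, 5, ... in a sub-population at generation t.]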
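The relationship among the three F statistics can be checked numerically. This sketch assumes each statistic is defined as one minus the corresponding heterozygosity ratio, consistent with the identity above; the function name and the heterozygosity values are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from heterozygosities at the three levels.

    h_i: observed heterozygosity of individuals
    h_s: expected heterozygosity within sub-populations
    h_t: expected heterozygosity in the total population
    """
    f_is = 1.0 - h_i / h_s
    f_st = 1.0 - h_s / h_t
    f_it = 1.0 - h_i / h_t
    return f_is, f_st, f_it


# Hypothetical values for illustration.
f_is, f_st, f_it = f_statistics(0.30, 0.40, 0.50)
print(f_is, f_st, f_it)
print((1 - f_it), (1 - f_is) * (1 - f_st))  # the two sides of the identity
```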
What population sizes are needed for breeding?
1. Calculate the population size needed to have the expectation of obtaining one ideal genotype
For a trait controlled by 10 unlinked loci:
$(1/4)^{10}$ in an F2, so $N = 4^{10} = 1{,}048{,}576$
$(1/2)^{10}$ in an inbred line, so $N = 2^{10} = 1024$
2. Consider how to stabilize variance of allele frequencies
[Figure: standard error of q (y-axis, 0.00 to 0.25) plotted against population size N (x-axis, 0 to 250). Bernardo, Chapt. 2]
Would be more critical for a long-term recurrent selection program than for a particular F2 population
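Both considerations can be sketched numerically. The function names are hypothetical, and the standard error is taken as the square root of the single-generation sampling variance pq/(2N) with p = 0.5, which is an assumption about how the figure was drawn.

```python
import math


def expected_population_size(n_loci, inverse_prob_per_locus):
    """Population size giving an expectation of one ideal genotype.

    inverse_prob_per_locus is 4 for an F2 (probability 1/4 per locus)
    and 2 for inbred lines (probability 1/2 per locus).
    """
    return inverse_prob_per_locus**n_loci


def standard_error_q(p, n_individuals):
    """Standard error of allele frequency after sampling 2N gametes."""
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))


print(expected_population_size(10, 4))  # F2
print(expected_population_size(10, 2))  # inbred lines

# The standard error falls quickly at first, then levels off as N grows.
for n in (5, 10, 25, 50, 100, 250):
    print(n, round(standard_error_q(0.5, n), 4))
```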
Effective population size
Number of individuals that would give rise to the calculated sampling variance, or rate of inbreeding, if the conditions of an idealized population were true
$N_e = \dfrac{1}{2\Delta F}$
$\Delta F = \dfrac{1}{2N_e}$
Falconer, Chapt. 4
Effective population size
• unequal numbers in successive generations
$\dfrac{1}{N_e} = \dfrac{1}{t}\left(\dfrac{1}{N_1} + \dfrac{1}{N_2} + \dfrac{1}{N_3} + \dots + \dfrac{1}{N_t}\right)$ (harmonic mean)
– effects of a bottleneck persist over time
• different numbers of males and females
$N_e = \dfrac{4 N_m N_f}{N_m + N_f}$
Falconer, Chapt. 4
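A short sketch of both formulas follows; the function names and the example population sizes are hypothetical.

```python
def ne_unequal_generations(sizes):
    """Harmonic mean of population sizes over successive generations."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


def ne_unequal_sexes(n_males, n_females):
    """Effective size with different numbers of males and females."""
    return 4.0 * n_males * n_females / (n_males + n_females)


# A single bottleneck generation of 10 pulls Ne far below the arithmetic mean of 70.
print(ne_unequal_generations([100, 10, 100]))

# 10 males and 90 females behave like an idealized population of 36.
print(ne_unequal_sexes(10, 90))
```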
Half-sib recurrent selection in meadowfoam
Year 1 – create half-sib families
500 spaced plants in nursery
outcross → half-sib families
self → S1 families
Year 2 – evaluate families in replicated trials
Year 3 – Should I go back to remnant half-sib seed of selected families or use the selfed seed for recombination?
Migration
• How many new introductions do I need in my breeding program to counteract the loss of genetic diversity due to inbreeding (genetic drift)?
$F_{ST} = \dfrac{1}{4 N_e m + 1}$
m is the migration rate (frequency)
Nem is the number of individuals introduced each generation
⇒ A few new introductions each generation can have a large impact on diversity in a breeding population
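Evaluating the formula for a few values of Nem illustrates the point. The function name and the chosen values are hypothetical.

```python
def fst_with_migration(ne_m):
    """F_ST = 1 / (4 * Ne * m + 1), where ne_m is the product Ne * m."""
    return 1.0 / (4.0 * ne_m + 1.0)


# Number of introductions per generation (Ne * m) versus F_ST.
for ne_m in (0, 0.25, 1, 2, 5):
    print(ne_m, round(fst_with_migration(ne_m), 3))
```

With no migration F_ST is 1, and a single introduction per generation already brings it down to 0.2.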
Inbreeding coefficients from pedigrees
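[Pedigree diagram: common ancestor A (alleles a1a2) is a parent of both B and C; B × C → X]
$F_X = \sum \left(\tfrac{1}{2}\right)^{n} (1 + F_A)$
n = number of individuals in path including common ancestor

Allele transmitted at each step:
AB   AC   BX   CX   Prob.
a1   a1   a1   a1   $(1/2)^4$
a2   a2   a2   a2   $(1/2)^4$
a1   a2   a1   a2   $(1/2)^4$
a2   a1   a2   a1   $(1/2)^4$

$F_X = 2(1/2)^4 + 2(1/2)^4 F_A = (1/2)^3 + (1/2)^3 F_A = (1/2)^3 (1 + F_A)$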
Falconer Chapt. 5; Lynch and Walsh pgs 131-141
Inbreeding coefficients from pedigrees
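[Pedigree diagram with individuals A, B, C, D, E, G, H and J; the paths below connect E and H, the parents of J, through each common ancestor.]

Paths of relationship   n   F of common ancestor   Contribution to FJ
EBACH                   5   0                      $(1/2)^5$
EBADGH                  6   0                      $(1/2)^6$
EBCH                    4   0                      $(1/2)^4$
ECADGH                  6   0                      $(1/2)^6$
ECBADGH                 7   0                      $(1/2)^7$
ECH                     3   1/4                    $(1/2)^3 (1 + 0.25)$

$F_J = 0.2891$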
• E is inbred but this does not contribute to FJ
• No individual can appear twice in the same path
• Path must represent potential for gene
transmission (BCA is not valid, for example)
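The path sum for FJ can be reproduced in a few lines of Python. This sketch takes the paths as already enumerated by hand (it does not search the pedigree), and the function and variable names are hypothetical.

```python
def inbreeding_from_paths(paths):
    """Sum (1/2)^n * (1 + F_A) over paths.

    Each path is (n, f_ancestor): n individuals in the path, including the
    common ancestor, and the inbreeding coefficient of that ancestor.
    """
    return sum(0.5**n * (1.0 + f_ancestor) for n, f_ancestor in paths)


# Paths for individual J, copied from the table above.
paths_for_j = [
    (5, 0.0),   # EBACH
    (6, 0.0),   # EBADGH
    (4, 0.0),   # EBCH
    (6, 0.0),   # ECADGH
    (7, 0.0),   # ECBADGH
    (3, 0.25),  # ECH, common ancestor with F = 1/4
]

f_j = inbreeding_from_paths(paths_for_j)
print(round(f_j, 4))  # 0.2891
```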
Coefficient of coancestry
identical by descent (ibd) = alleles descended from a
common ancestral allele
[Pedigree diagram: A × B → C]
$F_C$ = inbreeding coefficient = probability that alleles in C are ibd
$\theta_{AB}$ = coefficient of coancestry
• probability that an allele sampled at random from A is ibd with an allele sampled at random from B
• aka coefficient of kinship, parentage or consanguinity
$F_C = \theta_{AB}$
Note: $\theta_{AB} = f_{AB}$ in Bernardo's text
Coefficient of coancestry
[Pedigree diagram: A × B → C]
$\theta_{AB}$ • alleles received by A and B
• alleles sampled from A and B (to go to offspring)
$F_C$ • alleles received by C
$\theta_{CC}$ • alleles sampled from C (to go to offspring)
Formal calculation of coancestry
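[Pedigree diagram: A (a1a2) contributes gamete a and B (b1b2) contributes gamete b to C (c1c2)]
$F_C = \theta_{AB} = P(a = a_1, b = b_1, a_1 \equiv b_1)$
$+ P(a = a_1, b = b_2, a_1 \equiv b_2)$
$+ P(a = a_2, b = b_1, a_2 \equiv b_1)$
$+ P(a = a_2, b = b_2, a_2 \equiv b_2)$
$= \dfrac{1}{4}\left[P(a_1 \equiv b_1) + P(a_1 \equiv b_2) + P(a_2 \equiv b_1) + P(a_2 \equiv b_2)\right]$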
Rules of coancestry
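[Pedigree diagram: A × B → E; C × D → G; E × G → H]
$\theta_{EC} = \dfrac{1}{2}(\theta_{AC} + \theta_{BC})$
$\theta_{EG} = \dfrac{1}{4}(\theta_{AC} + \theta_{AD} + \theta_{BC} + \theta_{BD})$
$\theta_{EG} = \dfrac{1}{2}(\theta_{EC} + \theta_{ED})$
$\theta_{HH} = \dfrac{1}{2}(1 + F_H)$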
Coancestry: selfing
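[Pedigree diagram: A (a1a2) selfed (A × A) → X, with genotypes ¼ a1a1, ½ a1a2, ¼ a2a2]
$\theta_{AA} = F_X = \dfrac{1}{2} + \dfrac{1}{2} F_A = \dfrac{1}{2}(1 + F_A)$
$\theta_{XX} = \dfrac{1}{2}(1 + F_X)$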
Derivation of the rules: another example
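[Pedigree diagram: A × B → E; C × D → G; E × G → H]

Alleles from E (prob.)   Alleles from C (prob.)   Coancestry
a1 (1/4)                 c1 (1/2)                 $\theta_{AC}$
a1 (1/4)                 c2 (1/2)                 $\theta_{AC}$
a2 (1/4)                 c1 (1/2)                 $\theta_{AC}$
a2 (1/4)                 c2 (1/2)                 $\theta_{AC}$
b1 (1/4)                 c1 (1/2)                 $\theta_{BC}$
b1 (1/4)                 c2 (1/2)                 $\theta_{BC}$
b2 (1/4)                 c1 (1/2)                 $\theta_{BC}$
b2 (1/4)                 c2 (1/2)                 $\theta_{BC}$

$\theta_{EC} = \dfrac{1}{8}(4\theta_{AC} + 4\theta_{BC}) = \dfrac{1}{2}(\theta_{AC} + \theta_{BC})$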
Coancestry of full sibs
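[Pedigree diagram: A × B → C and D (full sibs); C × D → E]
$\theta_{CD} = \dfrac{1}{4}(\theta_{AA} + \theta_{AB} + \theta_{BA} + \theta_{BB})$
$= \dfrac{1}{4}(\theta_{AA} + 2\theta_{AB} + \theta_{BB})$
$= \dfrac{1}{4}\left[\dfrac{1}{2}(1 + F_A) + 2\theta_{AB} + \dfrac{1}{2}(1 + F_B)\right]$
With no prior inbreeding ($F_A = F_B = 0$, $\theta_{AB} = 0$):
$\theta_{CD} = \dfrac{1}{4}\left(\dfrac{1}{2} + \dfrac{1}{2}\right) = \dfrac{1}{4}$
Note: could get the same result by calculating FE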
Tabular method for calculating coancestries
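[Pedigree diagram with individuals A, B, C, D (first row), E, F (second row) and G; G is the offspring of E and F.]
$\theta_{CG} = \lambda_{EG}\,\theta_{CE} + \lambda_{FG}\,\theta_{CF}$
$\lambda_{EG}$ = contribution of E to G = 0.5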
Excel
• Can accommodate different levels of inbreeding in parents
• Can be automated
• Can incorporate information from molecular markers about the contribution of parents to offspring (may vary from 0.5 due to segregation during inbreeding)
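As an illustration of how the tabular method can be automated, the sketch below fills a coancestry table row by row in Python. It is a minimal version under stated assumptions: founders are unrelated and non-inbred, every parent contributes exactly 0.5, and the pedigree is listed with parents before offspring. The function name and the two example pedigrees are hypothetical.

```python
def coancestry_table(pedigree):
    """Build a dict of coancestries theta[(X, Y)] from an ordered pedigree.

    pedigree: list of (name, parent1, parent2); founders use (name, None, None).
    """
    theta = {}
    names = []
    for name, p1, p2 in pedigree:
        for other in names:
            if p1 is None:
                value = 0.0  # assumption: founders unrelated to everyone earlier
            else:
                # coancestry with an earlier individual = mean over the two parents
                value = 0.5 * (theta[(other, p1)] + theta[(other, p2)])
            theta[(name, other)] = theta[(other, name)] = value
        inbreeding = 0.0 if p1 is None else theta[(p1, p2)]  # F_X = theta of parents
        theta[(name, name)] = 0.5 * (1.0 + inbreeding)
        names.append(name)
    return theta


# Full-sib example: A x B -> C and D; C x D -> E.
full_sib_pedigree = [
    ("A", None, None), ("B", None, None),
    ("C", "A", "B"), ("D", "A", "B"),
    ("E", "C", "D"),
]
theta = coancestry_table(full_sib_pedigree)
print(theta[("C", "D")])  # coancestry of full sibs = F of E

# Selfing is handled by giving the same individual as both parents.
selfing_pedigree = [("A", None, None), ("X", "A", "A")]
theta_self = coancestry_table(selfing_pedigree)
print(2 * theta_self[("X", "X")] - 1)  # F of X
```

Marker-based parental contributions that differ from 0.5 would replace the fixed 0.5 weights; that extension is not shown here.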
Regular systems of inbreeding
• Same mating system applied each generation
• Purpose is to achieve rapid inbreeding
• All individuals in each generation have the same level of inbreeding
• Develop recurrence equations to predict changes over time
Example: repeated selfing (A selfed → B)
$F_B = \theta_{AA} = \dfrac{1}{2}(1 + F_A)$
$F_t = \dfrac{1}{2}(1 + F_{t-1})$
Regular systems of inbreeding
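[Pedigree diagram with individuals A, B, C, D, E, G, H and J]

Mating system      Coancestry                                                                            No prior inbreeding   Recurrence equation
full sibs          $\theta_{EG} = (1/4)(2\theta_{BC} + \theta_{BB} + \theta_{CC})$                       1/4                   $F_t = (1/4)(1 + 2F_{t-1} + F_{t-2})$
half sibs          $\theta_{DE} = (1/4)(\theta_{AB} + \theta_{AC} + \theta_{BB} + \theta_{BC})$          1/8                   $F_t = (1/8)(1 + 6F_{t-1} + F_{t-2})$
parent-offspring   $\theta_{AD} = (1/2)(\theta_{AA} + \theta_{AB})$                                      1/4                   $F_t = (1/2)(1 + F_{t-2})$
backcrossing       $F_H = \theta_{BD} = (1/2)(\theta_{BB} + \theta_{AB})$                                1/4                   $F_t = (1/4)(1 + F_B + 2F_{t-1})$
selfing            $\theta_{BB} = (1/2)(1 + F_B)$                                                        1/2                   $F_t = (1/2)(1 + F_{t-1})$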
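The recurrence equations can be iterated to compare how quickly each system approaches complete inbreeding. The sketch below covers selfing, full sibs and half sibs only, assumes F = 0 in the starting generations, and uses hypothetical function names.

```python
def iterate_inbreeding(recurrence, generations):
    """Iterate F_t = recurrence(F_(t-1), F_(t-2)) starting from F = 0."""
    f_prev2, f_prev1 = 0.0, 0.0
    values = []
    for _ in range(generations):
        f_t = recurrence(f_prev1, f_prev2)
        values.append(f_t)
        f_prev2, f_prev1 = f_prev1, f_t
    return values


def selfing(f1, f2):
    return 0.5 * (1.0 + f1)


def full_sibs(f1, f2):
    return 0.25 * (1.0 + 2.0 * f1 + f2)


def half_sibs(f1, f2):
    return 0.125 * (1.0 + 6.0 * f1 + f2)


for label, rule in (("selfing", selfing), ("full sibs", full_sibs), ("half sibs", half_sibs)):
    print(label, [round(f, 3) for f in iterate_inbreeding(rule, 5)])
```

The first value in each list matches the "no prior inbreeding" column of the table (1/2, 1/4 and 1/8).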