PBG 650 Advanced Plant Breeding
Module 2: Inbreeding
• Genetic Diversity
  – A few definitions
• Small Populations
  – Random drift
  – Changes in variance, genotypes
• Mating Systems
  – Inbreeding coefficient from pedigrees
  – Coefficient of coancestry
  – Regular systems of inbreeding

Genetic Diversity Studies - Applications
• Highlight geographic areas for further germplasm collection
• Establish core collections
  – preserve genetic resources
  – representative samples for genetic studies
• Investigate theories regarding crop domestication and origin
• Determine genes involved in domestication
  – Lower diversity in domesticated species than in wild relatives
• Selection of parents for a breeding program
  – Identify untapped sources of genetic variation
• Establish effective breeding methods
• Define heterotic groups for inbred/hybrid development
Take PBG 620, 621, 622!

Measures of Genetic Diversity
• Number of alleles or SNPs identified
• Average number of alleles per locus
• Effective alleles per locus: $A_e = \frac{1}{1 - h} = \frac{1}{\sum_i p_i^2}$
• Major allele frequency = MAF
• Average expected heterozygosity (Nei's gene diversity) = $H_e$
• Observed heterozygosity = $H_o$
• Polymorphic Information Content = PIC values
• Polymorphism or % of polymorphic loci = $P_j$
  – A locus is considered polymorphic if the frequency of the major allele is less than 0.95 (or 0.99)

Average expected heterozygosity
• Also called Nei's gene diversity
  – One locus (j), two alleles: $h_j = 1 - p^2 - q^2 = 2pq$
  – One locus (j), with i alleles: $h_j = 1 - \sum_i p_i^2$
  – Average across L loci: $H_e = \frac{1}{L} \sum_{j=1}^{L} h_j$
• The average $H_e$ across loci measures the extent of variation in a population

Steps in Diversity Analysis
1. Characterize the diversity
  – Genotyping
  – Phenotyping
2. Calculate relationships
  – Genetic distance
3. Express relationships with a classification and/or ordination method
  – Classification or clustering
  – Ordination (e.g., PCA)

Recent Studies of Crop Diversity
Xu, X., et al., 2012.
Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nature Biotechnology 30: 105–111.

Population size
• Sampling can lead to changes in gene frequency in small populations
• Changes are random in direction (dispersive), but predictable in amount
  – random drift
  – accumulation of small changes due to sampling over time
  – differences among subgroups of the population increase over time
  – increase in uniformity and level of homozygosity within subgroups (Wahlund effect)
• Two perspectives
  – changes in variances due to sampling
  – changes in genotype frequencies due to inbreeding
Falconer, Chapt. 3

Dispersive process - idealized population
[Diagram: a base population (N = ∞) gives rise to sub-populations; at each generation (t = 0, 1, 2) every sub-population of N individuals is formed from a sample of 2N gametes]

Idealized population assumptions
• Mating occurs within sub-populations
• Mating is at random (including self-fertilization)
• Sub-populations are equal in size
• Generations do not overlap
• No mutation, migration or selection
No change in the average gene frequency among sub-populations over generations: $\bar{q} = q_0$

Random drift (genetic drift)
Sampling process:
$\Pr(k) = \frac{(2N)!}{k!\,(2N - k)!}\, p^k (1 - p)^{2N - k}$
= probability of obtaining k copies of an allele with frequency p in the next generation
• Every generation, the sampling of gametes within each subpopulation centers around a new allele frequency, so changes accumulate over time
• Changes occur at a faster rate in smaller populations

Random drift (genetic drift)
• Gene frequencies in the subpopulations drift apart over time, until all frequencies become equally probable (steady state)
• Once the steady state is attained, the rate of fixation is 1/N in each generation
• The long-term effect of drift for a finite population is a loss of genetic variation
• Historical effects of drift are locked in (founder effect or bottleneck effect)
[Figure: eye color in Drosophila; 105 populations, N = 16; at t = 0, $f(bw) = f(bw^{75}) = 0.5$]
Buri, Peter.
1956. Gene frequency in small populations of mutant Drosophila. Evolution 10:367-402.

Dispersive process – effects on variance
Variance in gene frequency among sub-populations at t = 1:
$\sigma^2_{\Delta q} = \sigma^2_q = \frac{p_0 q_0}{2N}$
Variance among sub-populations increases in each generation. At time t:
$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right]$
$\sigma^2_q = p_0 q_0$ at $t = \infty$

Change in genotype frequency
• As gene frequencies become more dispersed towards the extremes
  – there is an increase in homozygosity and decrease in heterozygosity within each sub-population
  – genetic uniformity increases within sub-populations
Genotype | Frequency across sub-populations
A1A1 | $p_0^2 + \sigma^2_q$
A1A2 | $2 p_0 q_0 - 2\sigma^2_q$
A2A2 | $q_0^2 + \sigma^2_q$

Definition of inbreeding
inbreeding = mating of individuals that have common ancestors
• identical by descent (ibd) = alleles are direct descendants of a common ancestral allele (autozygous)
• identical in state = alleles have the same nucleotide sequence but descended from different ancestral alleles (allozygous)
• An individual is inbred if it has alleles that are identical by descent

Coefficient of inbreeding
• Probability that two alleles at any locus in an individual are ibd (also applies to alleles sampled at random from the population)
• Must be in relation to a base population

Change in inbreeding in a single generation
$\Delta F = \frac{1}{2N}$
Inbreeding at generation t:
$F_t = \frac{1}{2N} + \left( 1 - \frac{1}{2N} \right) F_{t-1}$
(first term = "new" inbreeding; second term = "old" inbreeding)
Recurrence equation:
$F_t = 1 - (1 - \Delta F)^t$

Inbreeding
Remember:
For a single generation:
$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} = p_0 q_0\, \Delta F$, where $\Delta F = \frac{1}{2N}$
At time t:
$F_t = 1 - (1 - \Delta F)^t$
$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right] = p_0 q_0 F_t$

Genotype frequencies with inbreeding
Genotype | Frequency across sub-populations | In terms of F | Showing origin
A1A1 | $p_0^2 + \sigma^2_q$ | $p_0^2 + p_0 q_0 F$ | $p_0^2 (1 - F) + p_0 F$
A1A2 | $2 p_0 q_0 - 2\sigma^2_q$ | $2 p_0 q_0 - 2 p_0 q_0 F$ | $2 p_0 q_0 (1 - F)$
A2A2 | $q_0^2 + \sigma^2_q$ | $q_0^2 + p_0 q_0 F$ | $q_0^2 (1 - F) + q_0 F$
What will genotype frequencies be when the sub-populations are completely inbred?
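
The recurrence for F and the genotype frequencies with inbreeding can be sketched in a few lines of Python. This is only an illustration under the idealized-population assumptions; the function names and the example values (N = 16, p0 = 0.5, 20 generations) are hypothetical choices, not part of the original slides.

```python
def inbreeding_over_time(n, generations, f0=0.0):
    """Return [F_0, ..., F_t] using F_t = 1/(2N) + (1 - 1/(2N)) * F_{t-1}."""
    delta_f = 1.0 / (2 * n)          # rate of inbreeding, Delta F = 1/(2N)
    f_values = [f0]
    for _ in range(generations):
        # "new" inbreeding plus the carried-over "old" inbreeding
        f_values.append(delta_f + (1.0 - delta_f) * f_values[-1])
    return f_values


def genotype_frequencies(p0, f):
    """Expected frequencies of (A1A1, A1A2, A2A2) across sub-populations."""
    q0 = 1.0 - p0
    return (p0 ** 2 + p0 * q0 * f,
            2 * p0 * q0 * (1.0 - f),
            q0 ** 2 + p0 * q0 * f)


# Hypothetical example: N = 16 individuals per sub-population, p0 = 0.5
f_series = inbreeding_over_time(16, 20)
for t in (0, 1, 5, 10, 20):
    print(t, round(f_series[t], 4), genotype_frequencies(0.5, f_series[t]))

# Limiting case F = 1: heterozygotes vanish and homozygotes occur at p0 and q0
print(genotype_frequencies(0.5, 1.0))
```

Setting F = 1 in `genotype_frequencies` gives frequencies of p0, 0 and q0, which is one way to check an answer to the question posed above.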
Calculation of F from population data
Genotype | Frequency
A1A1 | $p_0^2 (1 - F) + p_0 F$
A1A2 | $2 p_0 q_0 (1 - F)$
A2A2 | $q_0^2 (1 - F) + q_0 F$
F can be viewed as the deficiency in observed heterozygotes relative to expectation:
$F = \frac{H_e - H}{H_e} = \frac{2pq - 2pq(1 - F)}{2pq}$
$H = (1 - F) H_e$
H = observed frequency of heterozygotes
$H_e$ = expected frequency of heterozygotes

F statistics – relative deficiency of heterozygotes
$\frac{H_I}{H_T} = \frac{H_I}{H_S} \cdot \frac{H_S}{H_T}$
$(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$
I = individual
S = sub-population
T = total
[Diagram: base population (N = ∞) at t = 0 divided into sub-populations; $F_{ST}$, $F_{IT}$ and $F_{IS}$ shown for individuals 1, 2, 3, 4, 5, ... in a sub-population at generation t]

What population sizes are needed for breeding?
1. Calculate the population size needed to have the expectation of obtaining one ideal genotype
   For a trait controlled by 10 unlinked loci:
   frequency $(1/4)^{10}$ in an F2, so $N = 4^{10}$ = 1,048,576
   frequency $(1/2)^{10}$ in an inbred line, so $N = 2^{10}$ = 1024
2. Consider how to stabilize variance of allele frequencies
   [Figure: standard error of q (0.00 to 0.25) plotted against population size N (0 to 250); Bernardo, Chapt. 2]
   Would be more critical for a long-term recurrent selection program than for a particular F2 population

Effective population size
Number of individuals that would give rise to the calculated sampling variance, or rate of inbreeding, if the conditions of an idealized population held
$N_e = \frac{1}{2\Delta F}$
$\Delta F = \frac{1}{2N_e}$
Falconer, Chapt. 4

Effective population size
• unequal numbers in successive generations
  $\frac{1}{N_e} = \frac{1}{t} \left( \frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \dots + \frac{1}{N_t} \right)$ (harmonic mean)
  – effects of a bottleneck persist over time
• different numbers of males and females
  $N_e = \frac{4 N_m N_f}{N_m + N_f}$
Falconer, Chapt. 4

Half-sib recurrent selection in meadowfoam
Year 1 – create half-sib families
  500 spaced plants in nursery
  outcross → half-sib families
  self → S1 families
Year 2 – evaluate families in replicated trials
Year 3
Should I go back to remnant half-sib seed of selected families or use the selfed seed for recombination?
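
Both effective population size formulas (harmonic mean over generations, and unequal numbers of males and females) are easy to evaluate numerically. The sketch below is illustrative only; the function names and the population sizes used are hypothetical, chosen to show how strongly a single bottleneck generation or a skewed sex ratio reduces Ne.

```python
def ne_varying_size(sizes):
    """Effective size over t generations: harmonic mean of N_1 ... N_t."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


def ne_unequal_sexes(n_males, n_females):
    """Effective size with different numbers of male and female parents."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def rate_of_inbreeding(ne):
    """Delta F = 1 / (2 Ne)."""
    return 1.0 / (2.0 * ne)


# Hypothetical example: one bottleneck generation of 10 among generations of 100
ne_bottleneck = ne_varying_size([100, 100, 10, 100])
print(round(ne_bottleneck, 2), round(rate_of_inbreeding(ne_bottleneck), 4))

# Hypothetical example: 5 male and 95 female parents versus 50 of each
print(ne_unequal_sexes(5, 95), ne_unequal_sexes(50, 50))
```

In the first example the arithmetic mean size is 77.5, but the harmonic mean is much closer to the bottleneck size, which is why the effects of a bottleneck persist over time.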
Migration
• How many new introductions do I need in my breeding program to counteract the loss of genetic diversity due to inbreeding (genetic drift)?
$F_{ST} = \frac{1}{4 N_e m + 1}$
m is the migration rate (frequency)
$N_e m$ is the number of individuals introduced each generation
A few new introductions each generation can have a large impact on diversity in a breeding population

Inbreeding coefficients from pedigrees
[Pedigree: A (a1a2) is a parent of both B and C; B × C → X]
$F_X = \sum \left( \frac{1}{2} \right)^n (1 + F_A)$
n = number of individuals in path including common ancestor
Alleles transmitted to X:
A→B | A→C | B→X | C→X | Prob.
a1 | a1 | a1 | a1 | $(1/2)^4$
a2 | a2 | a2 | a2 | $(1/2)^4$
a1 | a2 | a1 | a2 | $(1/2)^4$
a2 | a1 | a2 | a1 | $(1/2)^4$
$F_X = 2 (1/2)^4 + 2 (1/2)^4 F_A = (1/2)^3 + (1/2)^3 F_A = (1/2)^3 (1 + F_A)$
Falconer Chapt. 5; Lynch and Walsh pgs 131-141

Inbreeding coefficients from pedigrees
[Pedigree diagram: individuals A, B, C, D, E, G, H, J]
Paths of relationship | n | F of common ancestor | Contribution to $F_J$
EBACH | 5 | 0 | $(1/2)^5$
EBADGH | 6 | 0 | $(1/2)^6$
EBCH | 4 | 0 | $(1/2)^4$
ECADGH | 6 | 0 | $(1/2)^6$
ECBADGH | 7 | 0 | $(1/2)^7$
ECH | 3 | 1/4 | $(1/2)^3 (1 + 0.25)$
$F_J$ = 0.2891
• E is inbred but this does not contribute to $F_J$
• No individual can appear twice in the same path
• Path must represent potential for gene transmission (BCA is not valid, for example)

Coefficient of coancestry
identical by descent (ibd) = alleles descended from a common ancestral allele
[Pedigree: A × B → C]
$F_C$: inbreeding coefficient = probability that the two alleles at a locus in C are ibd
$\theta_{AB}$: coefficient of coancestry
• probability that an allele sampled at random from A is ibd with an allele sampled at random from B
• aka coefficient of kinship, parentage or consanguinity
$F_C = \theta_{AB}$
Note: $\theta_{AB} = f_{AB}$ in Bernardo's text

Coefficient of coancestry
[Pedigree: A × B → C]
$\theta_{AB}$
• alleles received by A and B
• alleles sampled from A and B (to go to offspring)
$F_C$
• alleles received by C
$\theta_{CC}$
• alleles sampled from C (to go to offspring)

Formal calculation of coancestry
[Pedigree: A (a1a2) × B (b1b2) → C (c1c2); a = allele sampled from A, b = allele sampled from B]
$\theta_{AB} = F_C$
$\theta_{AB} = P(a = a_1, b = b_1, a_1 \equiv b_1) + P(a = a_1, b = b_2, a_1 \equiv b_2) + P(a = a_2, b = b_1, a_2 \equiv b_1) + P(a = a_2, b = b_2, a_2 \equiv b_2)$
$\theta_{AB} = \frac{1}{4} \left[ P(a_1 \equiv b_1) + P(a_1 \equiv b_2) + P(a_2 \equiv b_1) + P(a_2 \equiv b_2) \right]$

Rules of coancestry
[Pedigree: A × B → E; C × D → G; E × G → H]
$\theta_{EG} = \frac{1}{4} (\theta_{AC} + \theta_{AD} + \theta_{BC} + \theta_{BD})$
$\theta_{EC} = \frac{1}{2} (\theta_{AC} + \theta_{BC})$
$\theta_{EG} = \frac{1}{2} (\theta_{EC} + \theta_{ED})$
$\theta_{HH} = \frac{1}{2} (1 + F_H)$

Coancestry:
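selfing
[Pedigree: A (a1a2) × A (a1a2) → X: ¼ a1a1, ½ a1a2, ¼ a2a2]
$\theta_{AA} = F_X = \frac{1}{2} + \frac{1}{2} F_A = \frac{1}{2} (1 + F_A)$
$\theta_{XX} = \frac{1}{2} (1 + F_X)$

Derivation of the rules: another example
[Pedigree: A × B → E; C × D → G; E × G → H]
Alleles from E | Alleles from C | Coancestry
a1 (1/4) | c1 (1/2) | $\theta_{AC}$
a1 (1/4) | c2 (1/2) | $\theta_{AC}$
a2 (1/4) | c1 (1/2) | $\theta_{AC}$
a2 (1/4) | c2 (1/2) | $\theta_{AC}$
b1 (1/4) | c1 (1/2) | $\theta_{BC}$
b1 (1/4) | c2 (1/2) | $\theta_{BC}$
b2 (1/4) | c1 (1/2) | $\theta_{BC}$
b2 (1/4) | c2 (1/2) | $\theta_{BC}$
$\theta_{EC} = \frac{1}{8} (4\theta_{AC} + 4\theta_{BC}) = \frac{1}{2} (\theta_{AC} + \theta_{BC})$

Coancestry of full sibs
[Pedigree: A × B → C and D; C × D → E]
$\theta_{CD} = \frac{1}{4} (\theta_{AA} + \theta_{AB} + \theta_{BA} + \theta_{BB}) = \frac{1}{4} (\theta_{AA} + 2\theta_{AB} + \theta_{BB})$
$\theta_{CD} = \frac{1}{4} \left[ \frac{1}{2} (1 + F_A) + 2\theta_{AB} + \frac{1}{2} (1 + F_B) \right]$
with no prior inbreeding: $\theta_{CD} = \frac{1}{4} \left( \frac{1}{2} + \frac{1}{2} \right) = \frac{1}{4}$
Note: could get same result by calculating $F_E$

Tabular method for calculating coancestries
[Table: coancestry matrix with individuals A, B, C, D, E, F, G as rows and columns]
Example entry: $\theta_{CG}$ = (contribution of E to G) × $\theta_{CE}$ + (contribution of F to G) × $\theta_{CF}$
contribution of E to G = 0.5
• Can accommodate different levels of inbreeding in parents
• Can be automated (e.g., in Excel)
• Can incorporate information from molecular markers about the contribution of parents to offspring (may vary from 0.5 due to segregation during inbreeding)

Regular systems of inbreeding
• Same mating system applied each generation
• All individuals in each generation have the same level of inbreeding
• Purpose is to achieve rapid inbreeding
• Develop recurrence equations to predict changes over time
Example: repeated selfing (A → B)
$F_B = \theta_{AA} = \frac{1}{2} (1 + F_A)$
$F_t = \frac{1}{2} (1 + F_{t-1})$

Regular systems of inbreeding
[Pedigree diagram: individuals A, B, C, D, E, G, H, J]
Mating system | Coancestry | No prior inbreeding | Recurrence equation
full sibs | $\theta_{EG} = \frac{1}{4} (2\theta_{BC} + \theta_{BB} + \theta_{CC})$ | 1/4 | $F_t = \frac{1}{4} (1 + 2F_{t-1} + F_{t-2})$
half sibs | $\theta_{DE} = \frac{1}{4} (\theta_{AB} + \theta_{AC} + \theta_{BB} + \theta_{BC})$ | 1/8 | $F_t = \frac{1}{8} (1 + 6F_{t-1} + F_{t-2})$
parent-offspring | $\theta_{AD} = \frac{1}{2} (\theta_{AA} + \theta_{AB})$ | 1/4 | $F_t = \frac{1}{4} (1 + 2F_{t-1} + F_{t-2})$
backcrossing | $F_H = \theta_{BD} = \frac{1}{2} (\theta_{BB} + \theta_{AB})$ | 1/4 | $F_t = \frac{1}{4} (1 + F_B + 2F_{t-1})$
selfing | $\theta_{BB} = \frac{1}{2} (1 + F_B)$ | 1/2 | $F_t = \frac{1}{2} (1 + F_{t-1})$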
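
The recurrence equations for selfing, full-sib mating and half-sib mating can be iterated directly to compare how quickly each system approaches complete inbreeding. The sketch below is a hedged illustration: the function and dictionary names are hypothetical, and it assumes a non-inbred, unrelated base population, so that F in generation 0 and in the generation before it are both zero.

```python
def iterate_inbreeding(recurrence, generations):
    """Return [F_0, ..., F_t] for a recurrence F_t = g(F_{t-1}, F_{t-2})."""
    f = [0.0, 0.0]  # assumed starting values: F_{-1} = F_0 = 0
    for _ in range(generations):
        f.append(recurrence(f[-1], f[-2]))
    return f[1:]    # drop F_{-1} so that index t holds F_t


# Recurrence equations for three regular systems of inbreeding
SYSTEMS = {
    "selfing":   lambda f1, f2: 0.5 * (1.0 + f1),
    "full sibs": lambda f1, f2: 0.25 * (1.0 + 2.0 * f1 + f2),
    "half sibs": lambda f1, f2: 0.125 * (1.0 + 6.0 * f1 + f2),
}

for name, recurrence in SYSTEMS.items():
    series = iterate_inbreeding(recurrence, 10)
    print(name, [round(f, 3) for f in series])
```

The first generation of each series reproduces the "no prior inbreeding" values of 1/2, 1/4 and 1/8 for selfing, full sibs and half sibs.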