Supplemental data:

An S-PLUS (2000) function, called simulSEL, was developed to reproduce the steps that produce a typical breeding population. The principle of the simulations is as follows (Figure 3).

Marker-genotypes construction: Before the start of the breeding process, we considered only a few founder lines, in full linkage disequilibrium across their whole genome, hence between all the markers and the QTLs. We also imposed that, at this generation, no founder line shared any allele with another. Thus, line 1 carried only markers and QTLs coded 1, line 2 only markers and QTLs coded 2, and the last founder line (the NPth) carried only markers and QTLs coded NP all along its genome. Markers were evenly spaced every d centiMorgans along the chromosomes. On chromosome 1, we simulated a single QTL between two markers. In addition to this QTL of interest, we simulated on the other chromosomes Npoly randomly located QTLs (which might therefore be linked to one another or not) with random effects, to represent the polygenic contribution to the trait values.

First generation of crosses: Starting from the founder-line generation, circular crosses, i.e. 1*2, 2*3, … , (NP-1)*NP, NP*1, were performed. We then derived lines by self-pollination to obtain new fixed lines (an initial cross followed by numerous self-pollinations to return to fixed lines is called a breeding cycle). NP mixed sub-populations of the same size (same contribution for all founders for this first generation) were derived, giving the “G0” generation.

Quantitative trait: The parameters for the creation of the quantitative trait are the QTL heritability (h²QTL) and the heritability of the polygenes (h²poly). The heritability of the QTL is the variance of the QTL divided by the total variance, while the heritability of the polygenes is the variance of the whole set of polygenes divided by the total variance.
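As a rough illustration of how these two heritabilities fix the variance budget of the trait, the following sketch follows the procedure described in this section: QTL allele effects drawn from N(0, 1), polygene effects scaled by VarQTL*(h²poly/h²QTL)/Npoly, and residual noise making up the remainder of the total variance. It is written in Python rather than S-PLUS and is not the simulSEL code. The function name `simulate_trait`, its arguments, the use of population variances, and the clamping of a negative noise variance to zero are all assumptions made for the example.

```python
import random
import statistics


def simulate_trait(qtl_alleles, poly_alleles, n_founders, h2_qtl, h2_poly, rng):
    """Simulate phenotypes for fixed lines (hypothetical helper, not simulSEL).

    qtl_alleles  : founder code (1..n_founders) carried by each line at the QTL
    poly_alleles : one list per polygene, giving each line's founder code
    Returns (phenotypes, var_qtl, var_poly, var_e).
    """
    n_lines = len(qtl_alleles)
    n_poly = len(poly_alleles)

    # One effect per founder allele at the QTL, drawn from N(0, 1).
    qtl_effects = [rng.gauss(0.0, 1.0) for _ in range(n_founders)]
    g_qtl = [qtl_effects[a - 1] for a in qtl_alleles]

    # Realized QTL variance in the population (assumption: population variance).
    var_qtl = statistics.pvariance(g_qtl)

    # Each polygene gets effects with variance VarQTL * (h2_poly / h2_qtl) / Npoly.
    poly_sd = (var_qtl * (h2_poly / h2_qtl) / n_poly) ** 0.5
    g_poly = [0.0] * n_lines
    for locus in poly_alleles:
        effects = [rng.gauss(0.0, poly_sd) for _ in range(n_founders)]
        for i, a in enumerate(locus):
            g_poly[i] += effects[a - 1]

    # Realized polygenic variance, then the noise variance that completes
    # the total VarQTL / h2_qtl.
    var_poly = statistics.pvariance(g_poly)
    var_e = var_qtl * (1.0 / h2_qtl - 1.0) - var_poly
    var_e = max(var_e, 0.0)  # assumption: clamp if sampling makes it negative

    noise_sd = var_e ** 0.5
    phenotypes = [q + p + rng.gauss(0.0, noise_sd) for q, p in zip(g_qtl, g_poly)]
    return phenotypes, var_qtl, var_poly, var_e
```

Because VarPoly is recomputed from the realized genotypes, the noise variance absorbs the sampling deviation of the polygenic part, so the expected total variance stays at VarQTL/h²QTL.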
We created, at the founder-line generation, the NP allele effects for the QTL and the effects of the Npoly polygenes. The NP possible effects of the QTL (for example, 20 founder lines imply 20 different alleles at the QTL and thus 20 different effects) were drawn from a normal distribution with mean 0 and variance 1. The QTL variance (VarQTL) was then calculated at the true QTL position (from the QTL allele carried by each line and the corresponding effect), and the NP effects for each of the Npoly polygenes were drawn from a normal distribution with mean 0 and variance [VarQTL*(h²poly/h²QTL)]/Npoly. Finally, the true variance accounted for by the polygenes (VarPoly) was computed, and normally distributed random noise with variance σe² = VarQTL*(1/h²QTL - 1) - VarPoly was added to simulate the phenotypic values of the trait. This choice makes the total variance equal to VarQTL/h²QTL, as the definition of h²QTL requires.

Overlapping generations and matrix of crosses: After the genotypes and phenotypes of the lines of the “G0” generation were obtained, virtual breeding programs were conducted. These consisted of choosing a certain number of parents, crossing them according to a “matrix of crosses”, then deriving a certain number of progenies by self-pollination from each two-parent cross. Thus, each fixed progeny had only two fixed parents, and could share a half-sib or full-sib relationship, or none, with other individuals of its generation. Moreover, two features common to most plant breeding schemes were implemented:
- the “overlapping” choice of the parents.
  Not all parents were necessarily drawn from the last generation only; a proportion of them (set by a parameter) could originate from older ones
- the influence of a matrix of crosses on the structure of the resulting progeny of a breeding program, which in turn influenced the effective population size

The design of crosses at the beginning of a breeding program can be seen as a geometric series, since the representation of parents in the selected progeny is uneven, L-shaped rather than random. For example, if a given line, say X, is recognized as the most elite line at a given period (with the best agronomic performance in a range of environments), X will typically be crossed to many other lines to fully exploit its genetic value. After self-pollination and selection, a certain number of lines derived from this parent X will still remain at the end of the breeding cycle, and will form one of the largest half-sib families (possibly containing some full-sibs when a specific cross is particularly outstanding). In contrast, a non-elite plant with a very specific trait of interest but low agronomic performance may also be used to initiate crosses, but on a smaller scale. Some of its offspring might also be selected, but to a much smaller extent.

A matrix of crosses was implemented to reproduce the formation of half-sib and full-sib families during a breeding cycle, hence taking into account the unbalanced contributions of parents to the final population. The parents were sorted according to their phenotype, i.e. their “breeding interest”, from the best to the worst, in the first row and first column of the matrix of crosses. For example, for a breeding scenario with crosses between 100 parents, 100*99/2 = 4950 crosses are possible. If we want to create only 500 progenies, only some of these crosses can be made.
As we want to obtain particular relationships through the crosses, we rank the parents by their interest and make more crosses with the most interesting parents. The number of progenies derived from the cross between parent i and parent j varied from 0 (the most common situation) to 1, 2, 3 or 4 (among the crosses that were made, a single progeny was the most common outcome in advanced breeding generations). The “overlapping” option drew 80% of the parents from the most recent generation of breeding, and 20% from all the older generations (accounting for 10% of the resulting progeny).

Performing n breeding cycles: A loop over successive breeding cycles was performed, and the parents of the crosses, the resulting progenies and the phenotypic data were stored. All available generations were used to build the next cycle. The last breeding cycle, NG, was used for QTL detection. It should be noted that all the allele frequencies were equal at the beginning, which was no longer the case after many generations because of genetic drift, non-panmictic conditions and/or selection. NP alleles with different effects were possible at each QTL and at each marker at “G0”, but their number was also reduced after NG cycles of breeding. Finally, all the markers and QTLs were in full linkage disequilibrium at G0 but were no longer so after many recombinations (5 generations of self-pollination per cycle, and NG cycles before the mapping generation).

The simulSEL function is freely available upon request from the authors.
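The overlapping choice of parents and the ranked matrix of crosses described above can be sketched as follows. This is a Python illustration under stated assumptions, not the simulSEL implementation. The text does not give the exact allocation rule, so the geometric weighting by rank (`ratio`), the random sampling of parents within generations, and the names `choose_parents` and `cross_matrix` are hypothetical. Only the 80%/20% split, the ranking from best to worst, and the 0 to 4 progenies per cross come from the text.

```python
import random


def choose_parents(generations, n_parents, recent_share=0.8, rng=None):
    """Pick parents with overlapping generations (hypothetical helper).

    generations : list of generations, oldest first; each is a list of
                  (line_id, phenotype) tuples.
    Returns the chosen parents sorted from best to worst phenotype.
    """
    if rng is None:
        rng = random.Random()
    recent = generations[-1]
    older = [line for gen in generations[:-1] for line in gen]
    # 80% from the most recent generation, the rest from all older ones.
    n_recent = round(n_parents * recent_share) if older else n_parents
    n_recent = min(n_recent, len(recent))
    n_old = min(n_parents - n_recent, len(older))
    # Assumption: parents are sampled at random within each pool.
    parents = rng.sample(recent, n_recent) + rng.sample(older, n_old)
    return sorted(parents, key=lambda lp: lp[1], reverse=True)


def cross_matrix(n_parents, n_progeny, ratio=0.95, max_per_cross=4, rng=None):
    """Allocate progenies to crosses between ranked parents (rank 0 = best).

    Returns a dict {(i, j): count} with i < j and 1 <= count <= max_per_cross;
    crosses that are absent from the dict produced no progeny.
    """
    if rng is None:
        rng = random.Random()
    n_crosses = n_parents * (n_parents - 1) // 2
    if n_progeny > max_per_cross * n_crosses:
        raise ValueError("too many progenies for this number of parents")
    pairs = [(i, j) for i in range(n_parents) for j in range(i + 1, n_parents)]
    # Assumption: geometric decay of a parent's weight with its rank, so
    # top-ranked parents end up in many more crosses (L-shaped usage).
    weights = [ratio ** (i + j) for i, j in pairs]
    counts = {}
    total = 0
    while total < n_progeny:
        pair = rng.choices(pairs, weights=weights, k=1)[0]
        if counts.get(pair, 0) < max_per_cross:
            counts[pair] = counts.get(pair, 0) + 1
            total += 1
    return counts
```

For the example in the text, `cross_matrix(100, 500)` distributes 500 progenies over a subset of the 4950 possible crosses. Counting how often each rank appears in the returned keys gives the size of each parent's half-sib family, and any count above 1 marks a full-sib family.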
