International Biometric Society
SELECTION OF SUGARCANE FAMILIES USING SYNTHETIC DATA AND ARTIFICIAL NEURAL NETWORK
Luiz A. Peternelli1,2, Ethel F.O. Peternelli1, Édimo F.A. Moreira1, Moysés Nascimento1
1 Department of Statistics, Federal University of Viçosa, Viçosa, MG, Brazil.
2 Sugarcane breeding program, RIDESA-UFV, Brazil.
Sugarcane is recognized worldwide as an important alternative source of fuel. In addition to
the increased demand for sugar, the higher demand for ethanol creates a great need either to
increase the area planted with sugarcane or, in a smarter and more environmentally friendly way,
to increase its productivity. Sugarcane breeding programs worldwide
have invested in ways to speed up the identification, selection, and therefore release of
new, increasingly productive varieties. To this end, a large number of genotypes should be
evaluated in field experiments. The selection process preferably occurs in two stages: first,
the breeder identifies the best families (usually half-sib or full-sib families) in the experiment.
Subsequently, the breeder seeks, within these top families, promising individuals or clones,
which are passed on to the next stages of the breeding program. These clones will be
evaluated in controlled experiments. For families to be evaluated with respect to yield, it is
necessary to weigh the plot corresponding to each family. If the family has a higher average
than the overall mean of the experiment, it is considered a promising family. Alternatively,
the yield may be obtained as a function of other traits, called yield components, which
are easily collected at the plot level: number of stalks (NS), average diameter of stalks (DS)
and average height of stalks (HS). This study aimed to evaluate the use of synthetic data
and artificial neural networks (ANN) as a way to facilitate the identification of the best
families without weighing all the material in the field. We evaluated real data from five family
selection experiments (totaling 110 families), provided by the sugarcane breeding program
at the Federal University of Viçosa, MG - Brazil. From each plot we collected the mass of
stalks (MS, later converted to tons of stalks per hectare, TSH), and the yield components
NS, DS and HS. Based on the actual TSH, we classified each family as "selected" or "not
selected". NS, DS and HS were used as the input variables. A single-hidden-layer backpropagation
ANN was considered in the analyses. We evaluated two main scenarios: (i)
data from one experiment were used as the training population while the rest were used as the
testing population, and (ii) the same as (i), but with the training population augmented by
synthetic data obtained by simulation from the mean vector and covariance matrix
corresponding to variables TSH, NS, DS and HS of the original training set. Other derived
sub-scenarios were also considered. Comparisons were based on the percentage of
misclassification obtained from each analysis. Results showed that selection of families
based only on NS, DS and HS under an ANN approach, after considering synthetic data,
seems to be potentially useful and could be used as an alternative to ease family selection
procedures under field conditions, mainly if a large number of families is to be considered.
(FAPEMIG, CAPES, CNPq)
International Biometric Conference, Florence, ITALY, 6 – 11 July 2014
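
The two scenarios described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plot data are made up (the toy relation between TSH and NS, DS, HS is an assumption used only to generate numbers), the multivariate normal distribution used for simulation is an assumption (the abstract states only that the mean vector and covariance matrix were used), and the network size, learning rate, number of epochs, and the choice to label synthetic families with the real training set's mean TSH are all hypothetical. The function names are illustrative.

```python
import numpy as np

def toy_experiment(n_families, rng):
    """Made-up plot data, NOT the study's data. Columns: TSH, NS, DS, HS."""
    ns = rng.normal(80.0, 15.0, n_families)   # number of stalks
    ds = rng.normal(2.5, 0.2, n_families)     # average stalk diameter
    hs = rng.normal(2.8, 0.3, n_families)     # average stalk height
    # Assumed toy relation between yield and its components, plus noise.
    tsh = 0.05 * ns * ds**2 * hs + rng.normal(0.0, 5.0, n_families)
    return np.column_stack([tsh, ns, ds, hs])

def augment(train, n_synth, rng):
    """Append synthetic rows simulated from the training mean vector and
    covariance matrix of (TSH, NS, DS, HS); normality is an assumption."""
    mean = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    synth = rng.multivariate_normal(mean, cov, size=n_synth)
    return np.vstack([train, synth])

def split_xy(data, tsh_cutoff):
    """Inputs are NS, DS, HS; label is 1 ("selected") if TSH exceeds the cutoff."""
    y = (data[:, 0] > tsh_cutoff).astype(float)
    return data[:, 1:], y

def train_ann(X, y, n_hidden=4, lr=0.5, epochs=3000, seed=0):
    """Single-hidden-layer network trained by full-batch backpropagation."""
    rng = np.random.default_rng(seed)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd                          # standardize the inputs
    W1 = rng.normal(0.0, 0.5, (Z.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)               # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))  # sigmoid output
        d_out = (p - y) / len(y)               # mean cross-entropy gradient
        d_hid = np.outer(d_out, W2) * (1.0 - H**2)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (Z.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)

    def predict(X_new):
        H_new = np.tanh(((X_new - mu) / sd) @ W1 + b1)
        p_new = 1.0 / (1.0 + np.exp(-(H_new @ W2 + b2)))
        return (p_new > 0.5).astype(float)

    return predict

def misclassification(train, test, train_cutoff):
    """Fraction of test families whose predicted class differs from the
    class given by their actual TSH (cutoff: the test set's own mean)."""
    X, y = split_xy(train, train_cutoff)
    predict = train_ann(X, y)
    X_test, y_test = split_xy(test, test[:, 0].mean())
    return float(np.mean(predict(X_test) != y_test))

rng = np.random.default_rng(42)
train = toy_experiment(22, rng)                # one "experiment" for training
test = toy_experiment(88, rng)                 # the rest, used for testing
cutoff = train[:, 0].mean()                    # overall mean TSH of the training set
augmented = augment(train, 200, rng)

err_real = misclassification(train, test, cutoff)      # scenario (i)
err_aug = misclassification(augmented, test, cutoff)   # scenario (ii)
print(f"misclassification, real data only: {err_real:.3f}")
print(f"misclassification, with synthetic data: {err_aug:.3f}")
```

Because the data here are simulated, the printed error rates say nothing about the study's results; the sketch only shows how the two training sets differ and how the misclassification percentage could be computed.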