International Biometric Society
APPLICATIONS OF (GENOMIC) RELATIONSHIPS TO LIVESTOCK GENETICS
Andres Legarra
andres.legarra@toulouse.inra.fr, INRA, UMR 1388 GenPhySe, CS 52627, 31326 Castanet
Tolosan, France
Two of the most important tasks in livestock genetics are the localisation of causal genes
and genetic evaluation (for selection: choosing the best parents of future animals so that
future generations will be more productive, healthy, and profitable). The traits studied have
complex, polygenic determinism, such as fertility, resistance to disease, or growth rate. The
current paradigm of livestock quantitative geneticists is partly parametric, partly empirical:
the theory is parametric but is checked against studies on real data. Multivariate normality
is commonly assumed. Also (as shown by Fisher), additivity largely holds because individual
alleles, and not genotypes, are passed from parent to offspring, so that the effect of an
animal's genetic background on the performance of its progeny is additive, even if true gene
action is highly complex and epistatic by nature. These assumptions are not only convenient
but also quite accurate, as reflected by cross-validation studies in many data sets.
Genetic values of individuals are modelled as random variables whose covariance depends on
the true sharing of alleles among individuals at the genes of interest. Scaled covariances
among individuals are called relationships. However, these genes (or QTLs) are unknown, so
we use proxies for them. Twentieth-century methods used pedigrees. This implicitly assumes
an infinite number of unlinked genes, each of infinitely small effect, and also that the
founders of the pedigree are “unrelated”, a notion that has only a loose biological meaning
and results in a covariance of 0 among founders. However, genes are transmitted in chunks of
DNA, and these are linked and finite in number (although possibly very numerous). A finer
estimate of relationships therefore uses DNA information, now (2014) in the form of SNP
genetic markers, simple bits of DNA code with no particular meaning. With these SNPs, many
different “genomic” relationship matrices can be constructed, the most popular ones being
based on quadratic forms of the type ZDZ’. These can be shown to be method-of-moments
estimators of “true” relationships, but they can also be interpreted as arising from a model
in which each marker has an unknown, random effect. Genomic relationship matrices can be
extended to non-genotyped individuals through so-called single-step methods.
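
As an illustration of the quadratic form ZDZ’, the sketch below builds one commonly used
genomic relationship matrix. It is not taken from the abstract: it assumes genotypes coded as
0/1/2 allele counts, Z obtained by centring each marker by twice its allele frequency, and
D = I / (2 Σ p(1 − p)). Allele frequencies are estimated from the sample, and the function
name and toy genotypes are made up for the example.

```python
import numpy as np

def genomic_relationship(genotypes):
    """Return G = Z D Z' with D = I / (2 * sum(p * (1 - p))).

    genotypes: (n_animals, n_markers) array of allele counts coded 0/1/2.
    Allele frequencies are estimated from the data itself (an assumption;
    base-population frequencies would be used if they were known).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0             # observed allele frequency per marker
    Z = M - 2.0 * p                      # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))  # scalar that defines D
    return Z @ Z.T / scale

# Toy data (made up): 4 animals genotyped at 5 SNP markers.
toy_genotypes = np.array([
    [0, 1, 2, 1, 0],
    [1, 1, 2, 0, 0],
    [2, 0, 0, 1, 1],
    [1, 2, 1, 2, 1],
])
G = genomic_relationship(toy_genotypes)
```

Because the markers are centred with frequencies estimated from the same animals, each row
of this G sums to zero and the matrix is singular; a matrix built this way would need some
adjustment before it could be inverted.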
Relationships are used in several ways. The two most relevant are genetic evaluation, the
estimation (or prediction) of the genetic values of individuals (so as to pick the best), and
association analysis, which focuses on finding genes while accounting for the structure of
the data. In both cases, a linear model is constructed under the assumption of multivariate
normality.
Genetic evaluation takes phenotypic data as the response variable, and environmental
effects (such as herd or age) and (millions of) genetic values as unknowns, with a priori
covariance matrices for the genetic values based on pedigree or genomic relationships.
Estimates of the unknowns are obtained by solving the normal equations, usually by iterative
techniques. Genetic evaluations based on markers (genomic evaluations, or GBLUP in this
specific model) are more accurate than those based on pedigree, and are thus of great
economic interest.
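
A minimal sketch of such an evaluation follows, under assumptions not stated in the abstract:
the model is written as y = Xb + Zu + e with var(u) = G σu² and var(e) = I σe², the
equations are set up in Henderson's mixed-model form with a known variance ratio
lam = σe² / σu², and a plain conjugate gradient stands in for the iterative technique. The
function names, the relationship matrix, the records, and the value of lam are all made up
for illustration.

```python
import numpy as np

def mixed_model_equations(X, Z, G, y, lam):
    """Build Henderson's mixed model equations for y = Xb + Zu + e.

    G is a (pedigree or genomic) relationship matrix, assumed invertible;
    lam = var_e / var_u is assumed known.
    """
    G_inv = np.linalg.inv(G)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * G_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    return lhs, rhs

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Plain conjugate gradient for a symmetric positive-definite system."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual
    d = r.copy()             # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ad = A @ d
        alpha = rs_old / (d @ Ad)
        x = x + alpha * d
        r = r - alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs_old) * d
        rs_old = rs_new
    return x

# Toy example (made-up numbers): 3 animals, one record each, overall mean only.
G = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.25],
              [0.0, 0.25, 1.0]])
X = np.ones((3, 1))          # incidence matrix for the overall mean
Z = np.eye(3)                # each record belongs to one animal
y = np.array([10.0, 12.0, 9.0])
lhs, rhs = mixed_model_equations(X, Z, G, y, lam=2.0)
solution = conjugate_gradient(lhs, rhs)
mean_estimate, genetic_values = solution[0], solution[1:]
```

For a toy system a direct solver would do equally well; the iterative solver is shown only
because direct factorisation becomes impractical with millions of unknowns.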
Association analysis with relationships proceeds by testing the effect of a marker of interest
in a linear model (genome-wide association analysis, GWAS) that includes the marker plus
the genetic values of individuals, the latter modelled with covariances based on pedigree or,
even better, genomic relationships. This protects against spurious localisations caused by
hidden relationships or structure. Principal component analysis is a simplification of this
procedure, which assumes that only a few large hidden structures exist in the data.
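
The marker test described above can be sketched as a generalised least squares fit. This is
one possible formulation, not necessarily the one used by the author: it assumes the model
y = 1μ + xβ + u + e with var(y) = G σu² + I σe², treats both variance components as known
(in practice they would be estimated first), and tests β with a one-degree-of-freedom Wald
statistic. The function name and all numbers are hypothetical.

```python
import math
import numpy as np

def marker_wald_test(y, marker, G, var_u, var_e):
    """Test one marker by generalised least squares under V = var_u*G + var_e*I.

    Returns the estimated marker effect, the Wald statistic and its p-value
    from a chi-square distribution with 1 degree of freedom.
    """
    n = len(y)
    V_inv = np.linalg.inv(var_u * G + var_e * np.eye(n))
    W = np.column_stack([np.ones(n), marker])   # overall mean + marker covariate
    WtVinv = W.T @ V_inv
    cov = np.linalg.inv(WtVinv @ W)             # covariance of the GLS estimates
    estimates = cov @ (WtVinv @ y)
    beta = float(estimates[1])
    wald = beta ** 2 / float(cov[1, 1])
    p_value = math.erfc(math.sqrt(wald / 2.0))  # upper tail of chi-square, 1 df
    return beta, wald, p_value

# Toy data (made up): two families of sibs, one marker coded 0/1/2.
G = np.array([[1.0, 0.5, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5, 0.5],
              [0.0, 0.0, 0.5, 1.0, 0.5],
              [0.0, 0.0, 0.5, 0.5, 1.0]])
y = np.array([10.0, 12.0, 9.0, 11.0, 13.0])
marker = np.array([0.0, 1.0, 0.0, 2.0, 2.0])
beta, wald, p_value = marker_wald_test(y, marker, G, var_u=1.0, var_e=1.0)
```

When G is replaced by an identity matrix the estimate reduces to ordinary least squares,
which is the analysis that ignores relationships and is exposed to spurious signals.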
International Biometric Conference, Florence, ITALY, 6 – 11 July 2014