International Biometric Society

APPLICATIONS OF (GENOMIC) RELATIONSHIPS TO LIVESTOCK GENETICS

Andres Legarra
andres.legarra@toulouse.inra.fr, INRA, UMR 1388 GenPhySe, CS 52627, 31326 Castanet Tolosan, France

Two of the most important tasks in livestock genetics are the localisation of causal genes and genetic evaluation (for selection: choosing the best parents of future animals so that future generations will be more productive, healthy, and profitable). The traits studied have complex, polygenic determinism, such as fertility, resistance to disease, or growth rate. The current paradigm of livestock quantitative geneticists is partly parametric, partly empirical: the theory is parametric but is checked against studies of real data. Multivariate normality is commonly assumed. Also (as shown by Fisher), additivity largely holds because individual alleles, and not genotypes, are passed from parent to offspring, so that the effect of an animal's genetic background on the performance of its progeny is additive, even if true gene action is highly complex and epistatic by nature. These assumptions are not only handy but also quite accurate, as reflected by cross-validation studies in many data sets. Genetic values of individuals are modelled as random variables, whose covariance depends on true allele sharing among individuals at the genes of interest. Scaled covariances among individuals are called relationships. However, these genes (or QTLs) are unknown, and we use proxies for them. Twentieth-century technologies used pedigrees. This implicitly assumes infinitely many unlinked genes of infinitesimally small effects, and also that founders of the pedigree are “unrelated”, which has a loose biological sense and results in a covariance of 0. However, genes are transmitted in DNA chunks, and these are linked and finite (although possibly very many). Therefore a finer estimate of relationships uses DNA information, now (2014) in the form of SNP genetic markers, simple bits of DNA code with no particular meaning.
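As an illustration of the pedigree-based relationships described above, the following minimal sketch builds an additive relationship matrix with the tabular method, in which unknown parents are treated as unrelated founders (covariance 0). The function name, the index-based pedigree coding, and the toy pedigree are assumptions made for this example and are not taken from the abstract.

```python
def pedigree_relationships(pedigree):
    """Additive relationship matrix by the tabular method.

    `pedigree` is a list of (sire, dam) index pairs, one per animal, ordered
    so that parents come before their offspring. None marks an unknown
    parent, which is treated as an unrelated founder (assumption).
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Relationship with each earlier animal j: average of the
        # relationships between j and the two parents of i.
        for j in range(i):
            a = 0.0
            if sire is not None:
                a += 0.5 * A[j][sire]
            if dam is not None:
                a += 0.5 * A[j][dam]
            A[i][j] = A[j][i] = a
        # Diagonal: 1 + inbreeding, where inbreeding is half the
        # relationship between the two parents (0 if a parent is unknown).
        inbreeding = 0.0
        if sire is not None and dam is not None:
            inbreeding = 0.5 * A[sire][dam]
        A[i][i] = 1.0 + inbreeding
    return A


# Hypothetical toy pedigree: animals 0 and 1 are founders, 2 and 3 are
# full sibs, and 4 is the offspring of the two full sibs.
ped = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
A = pedigree_relationships(ped)
```

In this toy case the two founders have relationship 0, the full sibs have relationship 0.5, and the offspring of the full-sib mating has a diagonal element of 1.25, that is, an inbreeding coefficient of 0.25.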
With these SNPs, many different “genomic” relationship matrices can be constructed, the most popular ones being based on quadratic forms of the type ZDZ’. These can be shown to be method-of-moments estimators of “true” relationships, but they can also be interpreted as arising from a model in which each marker has an unknown, random effect. Genomic relationship matrices can be extended to non-genotyped individuals through so-called single-step methods. Relationships are used in several ways. The two most relevant are genetic evaluation, the estimation (or prediction) of genetic values of individuals (so as to pick the best), and association analysis, which focuses on finding genes while accounting for the structure of the data. In both cases, a linear model is constructed under the assumption of multivariate normality. Genetic evaluation takes phenotypic data as the response variable, and environmental effects (such as herd or age) and (millions of) genetic values as unknowns, with a priori covariance matrices for the genetic values based on pedigree or genomic relationships. Estimates of the unknowns are obtained by solving the normal equations, usually by iterative techniques. Genetic evaluations based on markers (genomic evaluations, or GBLUP in this specific model) are more accurate than those based on pedigree, and thus of great economic interest. Association analysis with relationships proceeds by testing the effect of a marker of interest in a linear model (genome-wide association analysis) that includes the marker plus the genetic values of individuals, with these genetic values modelled with covariances based on pedigree or, even better, genomic relationships. This protects against spurious localisations caused by hidden relationships or structure. Principal component analysis is a simplification of this procedure, which assumes that only a few large hidden structures exist in the data.

International Biometric Conference, Florence, ITALY, 6 – 11 July 2014
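As an illustration of a genomic relationship matrix of the ZDZ’ type mentioned in the abstract, the following minimal sketch uses one commonly used choice: Z holds 0/1/2 allele counts centred by twice the allele frequency, and D is the identity divided by 2Σp(1−p). The function name, the toy genotypes, and the use of sample allele frequencies are assumptions made for this example; the abstract does not specify a particular D.

```python
def genomic_relationships(genotypes):
    """Genomic relationship matrix G = Z D Z'.

    `genotypes` is a list of rows, one per animal, of 0/1/2 allele counts.
    Z centres each marker by twice its allele frequency, and
    D = I / (2 * sum_k p_k * (1 - p_k)). Allele frequencies are estimated
    from the sample itself (assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each marker, estimated from the sample.
    p = [sum(row[k] for row in genotypes) / (2.0 * n) for k in range(m)]
    # Centred genotype matrix Z.
    Z = [[row[k] - 2.0 * p[k] for k in range(m)] for row in genotypes]
    # Scaling so that relationships are on a scale comparable to pedigree ones.
    scale = 2.0 * sum(pk * (1.0 - pk) for pk in p)
    return [[sum(Z[i][k] * Z[j][k] for k in range(m)) / scale
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: 3 animals genotyped at 4 SNPs.
M = [[0, 1, 2, 1],
     [1, 1, 0, 2],
     [2, 0, 1, 0]]
G = genomic_relationships(M)
```

Because the allele frequencies are taken from the same sample, each column of Z sums to zero, so the rows of G also sum to zero. Relationships are therefore expressed relative to the genotyped sample rather than to unrelated pedigree founders.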