EXCELMEAT Workshop Biosensing Pork Quality
Lleida, 25 October 2012

Comparing methods for estimating gene effects: least squares vs. genomic selection

Hernández-Sánchez J1, Pong-Wong R2, Freyer G3, Vagenas D4

1 Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Animal Breeding and Genetics, Spain.
2 The Roslin Institute and The R(D)SVS, Genetics and Genomics, University of Edinburgh, UK.
3 Leibniz Institute for Farm Animal Biology (FBN), Unit Genetics and Biometry, Germany.
4 Institute of Health and Biomedical Innovation, Research Methods Group, Queensland University of Technology, Australia.

Complex traits arise through direct gene effects, multiple interactions between genes, and interactions between genes and the environment. Genome-wide association studies (GWAS) test thousands of markers, one at a time, to locate those explaining phenotypic variation. We call this single-marker model LSR1. Usually, markers are not the causal mutations but may be in linkage disequilibrium (LD) with them. If many causal mutations are in LD with a marker, the LSR1 estimate for that marker may be biased. The magnitude of that bias is $a_i' = \sum_{j=1}^{M} a_j D_{ij} / (p_i q_i)$, where i and j index the M loci, $a_j$ is the unbiased additive effect of locus j, $p_i = 1 - q_i$ is the allele frequency at locus i, and $D_{ij}$ is the LD between loci i and j. The bias would disappear if all causal mutations were included simultaneously in a single model. We call this model LSRM. However, M is not known in practice. Moreover, LSRM cannot handle overparameterised models in which M exceeds the sample size, and it may suffer from collinearity problems. We explored two alternatives: 1) sampling a fixed number of random markers and selecting the best model via AIC (AICLSRM), and 2) genomic selection (GS) methods that treat marker effects as random and use prior distributions with a high density of null effects. Three GS models were tested: Ridge Regression, Bayes-C and Bayesian Lasso.
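The LD formula for single-marker estimates can be checked numerically. The sketch below is illustrative only and is not the authors' simulation: it assumes haploid 0/1 allele indicators, a toy LD model in which each locus copies its neighbour with probability `rho`, and hypothetical function names. With noise-free phenotypes, the single-marker slopes match $\sum_j a_j D_{ij} / (p_i q_i)$ computed from sample LD, while the joint fit recovers the true effects.

```python
import numpy as np


def simulate_haplotypes(n, m, rho, rng):
    """Toy LD model (an assumption): each locus copies the previous locus
    with probability rho, otherwise draws a fresh allele with frequency 0.5."""
    x = np.empty((n, m), dtype=float)
    x[:, 0] = rng.random(n) < 0.5
    for j in range(1, m):
        copy = rng.random(n) < rho
        fresh = rng.random(n) < 0.5
        x[:, j] = np.where(copy, x[:, j - 1], fresh)
    return x


def ld_matrix(x):
    """Sample LD D_ij = freq(both alleles present) - p_i * p_j, plus frequencies p."""
    p = x.mean(axis=0)
    d = (x.T @ x) / x.shape[0] - np.outer(p, p)
    return d, p


def predicted_single_marker_effects(a, d, p):
    """a_i' = sum_j a_j D_ij / (p_i q_i) for every locus i."""
    return (d @ a) / (p * (1.0 - p))


def lsr1(x, y):
    """One regression per marker: slope = cov(x_i, y) / var(x_i)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    return (xc.T @ yc) / (xc ** 2).sum(axis=0)


def lsrm(x, y):
    """All markers fitted jointly by ordinary least squares with an intercept."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = simulate_haplotypes(2000, 5, 0.8, rng)
    a = np.array([0.5, 0.0, 0.3, 0.0, -0.2])  # hypothetical true effects
    y = x @ a                                  # noise-free phenotype
    d, p = ld_matrix(x)
    print("true effects      ", a)
    print("formula prediction", predicted_single_marker_effects(a, d, p))
    print("LSR1 estimates    ", lsr1(x, y))
    print("LSRM estimates    ", lsrm(x, y))
```

Adding residual noise to `y` makes the agreement approximate rather than exact, since the sample covariance between the noise and each marker is no longer zero.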
We compared all models in terms of accuracy (bias) and precision (standard error) of the estimated gene effects via simulations. As expected, LSR1 was the most precise but least accurate method. LSRM was the most accurate method. GS methods ranked between LSR1 and LSRM in terms of both accuracy and precision. AICLSRM yielded the most parsimonious model, in which 25% of markers explained most of the genetic variation. More realistic and complex situations are being investigated, for example, testing SNPs rather than genes.
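A comparison of this kind can be sketched as a small Monte Carlo experiment. The code below is a hedged illustration, not the simulation design used in the study: the toy LD model, sample size, effect sizes, the fixed ridge penalty `lam`, and all function names are assumptions. Bayes-C and the Bayesian Lasso are not covered. It reports, per locus, the bias (mean estimate minus true effect) and the empirical standard error across replicates.

```python
import numpy as np


def simulate(n, a, rho, sigma_e, rng):
    """Haploid 0/1 markers with neighbour LD (toy model) and a noisy additive phenotype."""
    m = len(a)
    x = np.empty((n, m))
    x[:, 0] = rng.random(n) < 0.5
    for j in range(1, m):
        copy = rng.random(n) < rho
        x[:, j] = np.where(copy, x[:, j - 1], rng.random(n) < 0.5)
    y = x @ a + rng.normal(0.0, sigma_e, n)
    return x, y


def fit_all(x, y, lam):
    """Return LSR1, LSRM and ridge estimates computed from centred data."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    xtx = xc.T @ xc
    xty = xc.T @ yc
    return {
        "LSR1": xty / np.diag(xtx),                  # one marker at a time
        "LSRM": np.linalg.solve(xtx, xty),           # all markers jointly
        "ridge": np.linalg.solve(xtx + lam * np.eye(x.shape[1]), xty),
    }


def bias_and_se(a, n=300, rho=0.8, sigma_e=1.0, lam=10.0, reps=200, seed=1):
    """Per-locus bias and empirical standard error for each method."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    draws = {"LSR1": [], "LSRM": [], "ridge": []}
    for _ in range(reps):
        x, y = simulate(n, a, rho, sigma_e, rng)
        for name, est in fit_all(x, y, lam).items():
            draws[name].append(est)
    out = {}
    for name, est in draws.items():
        est = np.array(est)
        out[name] = (est.mean(axis=0) - a, est.std(axis=0, ddof=1))
    return out


if __name__ == "__main__":
    results = bias_and_se([0.3] * 5)
    for name, (bias, se) in results.items():
        print(name, "bias:", np.round(bias, 3), "SE:", np.round(se, 3))
```

Where ridge falls relative to the two least-squares fits depends on `lam` and on the LD pattern, so this sketch should not be read as reproducing the reported GS ranking.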