GS_exercise - Purdue Genomics Wiki

Genomic-selection exercise for the Evolutionary Quantitative Genetics course
Purdue University
Genomic selection using Ridge Regression – BLUP
Exercise taken from Course Notes ‘QTL Mapping, MAS, and Genomic Selection’, March 10-14, 2008, Dr. Ben Hayes,
Animal Genetics and Genomics, Department of Primary Industries Research Victoria Attwood (Melbourne), Australia
as presented by Andres Gordillo (AgReliant Genetics) in Purdue’s Fall 2011 Advanced Plant Breeding course.
The dataset for this exercise consists of two populations. The first population, termed the
training population, consists of 325 bulls with daughter yield deviations (DYDs) for milk
protein percent. This is a highly heritable trait which is correlated with genotype. Each bull
was genotyped at 10 SNPs associated with the trait. Genotypes and phenotypes for the training
population are located in the file “data_DYD.csv”.
The second population, termed the test population, consists of genotypic data from 31 male
calves. The data for this population are also located in the file “data_DYD.csv”. The best calves
will be selected for further breeding.
1. Based on the 10 SNPs, predict the best 5 calves from this population using GEBV from
ridge regression BLUPs (genomic selection).
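One way to approach question 1 is to fit the model y = 1μ + Xg + e, where X holds the SNP genotypes and the SNP effects g are shrunk by a ridge parameter λ = σ²e / σ²g. The GEBV of a calf is then the sum of its genotypes times the estimated SNP effects. The sketch below is a minimal illustration in Python with NumPy, not the method prescribed by the course notes. The function name `rr_blup`, the 0/1/2 genotype coding, the value `lam = 1.0`, and the simulated data are all assumptions made for illustration; the real genotypes and DYDs would be read from “data_DYD.csv”, whose column layout is not described here.

```python
import numpy as np

def rr_blup(X, y, lam):
    """Return (mean, SNP effects) by solving the ridge regression mixed model equations."""
    n, m = X.shape
    W = np.hstack([np.ones((n, 1)), X])   # design matrix [1 | X]
    penalty = lam * np.eye(m + 1)
    penalty[0, 0] = 0.0                   # the overall mean is a fixed effect, so it is not shrunk
    solution = np.linalg.solve(W.T @ W + penalty, W.T @ y)
    return solution[0], solution[1:]

# Simulated stand-in data (NOT the course data): genotypes coded 0/1/2.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 3, size=(325, 10)).astype(float)   # 325 bulls, 10 SNPs
X_test = rng.integers(0, 3, size=(31, 10)).astype(float)     # 31 calves, 10 SNPs
true_effects = rng.normal(0.0, 0.05, size=10)
y_train = X_train @ true_effects + rng.normal(0.0, 0.1, size=325)

lam = 1.0  # assumed ratio of residual to SNP-effect variance; choose or estimate your own
mu_hat, g_hat = rr_blup(X_train, y_train, lam)

gebv_test = X_test @ g_hat                 # GEBV: genotypes times estimated SNP effects
top5 = np.argsort(gebv_test)[::-1][:5]     # row indices of the 5 highest GEBVs
print("Top 5 calves (row index):", top5)
```

The mean is left unpenalized because it is treated as a fixed effect; only the SNP effects are shrunk toward zero. Adding `mu_hat` to every GEBV would not change the ranking of the calves.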
Several years later, mean DYD for milk protein percent was collected from the progeny of the
31 calves in the test population. These data can be found in the file “DYD_observed_values.csv”.
2. Plot the predicted vs. observed values for individuals in the training population. What is the
correlation between these two values?
3. Plot the predicted vs. observed values for individuals in the test population. What is the
correlation between these two values?
4. Why are there differences in predicted vs. observed correlations between the training and
test populations?
5. Was it beneficial to use genomic selection to predict the best bulls? Why or why not?
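For questions 2 and 3, the agreement between predicted and observed values is commonly summarized as the Pearson correlation. The sketch below shows one way to compute it in Python with NumPy. The function name `prediction_accuracy` and the short example vectors are hypothetical and are not taken from the course files; in the exercise, the same function would be applied once to the training population and once to the test population.

```python
import numpy as np

def prediction_accuracy(predicted, observed):
    """Pearson correlation between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.corrcoef(predicted, observed)[0, 1]

# Hypothetical values for illustration only (not taken from the course files).
gebv = [0.12, -0.05, 0.30, 0.01, -0.20, 0.18]
dyd = [0.10, 0.02, 0.25, -0.04, -0.15, 0.05]

r = prediction_accuracy(gebv, dyd)
print(f"r = {r:.3f}")
```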