Partitioning breeding values in animal models

Gerrit J. Kistemaker

The Canadian test day model (CTDM) has been used since February 1999. It is a 12-trait model covering milk, fat, protein and SCS in each of the first three lactations. Breeding values within each trait are described using the Wilmink curve. With three curve parameters per trait, the model has 36 genetic effects for each animal. These genetic effects are used to calculate lactation and combined breeding values for milk, fat, protein, SCS and persistency. Observations made in first lactation affect breeding values in later lactations, but the weight put on an animal's own records relative to the parent average and/or progeny contributions has not been calculated. The purpose of this paper is to partition an animal's breeding value with respect to the three sources of information that contribute to it (observations, parent average and progeny) and then to calculate the weight on the data relative to the pedigree contributions. This paper uses the CTDM as an example, but the derivation applies to all animal models.

Partitioning breeding values

The MME for the CTDM, as used at CDN, can be written as:

$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z & X'R^{-1}W \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1} \otimes G^{-1} & Z'R^{-1}W \\
W'R^{-1}X & W'R^{-1}Z & W'R^{-1}W + I \otimes P^{-1}
\end{bmatrix}
\begin{bmatrix} b \\ a \\ p \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \\ W'R^{-1}y \end{bmatrix}
$$

In order to derive the partition of breeding values, animals are sorted by birth date.
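To partition the breeding values of animal i, divide all animals into three groups:

1: All animals older than animal i (this includes the parents of animal i)
2: Animal i
3: All animals younger than animal i

These three groups of animals are used to partition Z, a and $A^{-1}$:

$$
Z = \begin{bmatrix} Z_1 & Z_2 & Z_3 \end{bmatrix}, \qquad
a' = \begin{bmatrix} a_1' & a_2' & a_3' \end{bmatrix}, \qquad
A^{-1} = \begin{bmatrix} A^{11} & A^{12} & A^{13} \\ A^{21} & A^{22} & A^{23} \\ A^{31} & A^{32} & A^{33} \end{bmatrix}
$$

Using this partitioning the MME can be written as:

$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z_1 & X'R^{-1}Z_2 & X'R^{-1}Z_3 & X'R^{-1}W \\
Z_1'R^{-1}X & Z_1'R^{-1}Z_1 + A^{11} \otimes G^{-1} & Z_1'R^{-1}Z_2 + A^{12} \otimes G^{-1} & Z_1'R^{-1}Z_3 + A^{13} \otimes G^{-1} & Z_1'R^{-1}W \\
Z_2'R^{-1}X & Z_2'R^{-1}Z_1 + A^{21} \otimes G^{-1} & Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1} & Z_2'R^{-1}Z_3 + A^{23} \otimes G^{-1} & Z_2'R^{-1}W \\
Z_3'R^{-1}X & Z_3'R^{-1}Z_1 + A^{31} \otimes G^{-1} & Z_3'R^{-1}Z_2 + A^{32} \otimes G^{-1} & Z_3'R^{-1}Z_3 + A^{33} \otimes G^{-1} & Z_3'R^{-1}W \\
W'R^{-1}X & W'R^{-1}Z_1 & W'R^{-1}Z_2 & W'R^{-1}Z_3 & W'R^{-1}W + I \otimes P^{-1}
\end{bmatrix}
\begin{bmatrix} b \\ a_1 \\ a_2 \\ a_3 \\ p \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z_1'R^{-1}y \\ Z_2'R^{-1}y \\ Z_3'R^{-1}y \\ W'R^{-1}y \end{bmatrix}
$$

We want to partition $a_2$, and therefore the part of interest is the row for animal i:

$$
\begin{bmatrix}
Z_2'R^{-1}X & Z_2'R^{-1}Z_1 + A^{21} \otimes G^{-1} & Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1} & Z_2'R^{-1}Z_3 + A^{23} \otimes G^{-1} & Z_2'R^{-1}W
\end{bmatrix}
\begin{bmatrix} b \\ a_1 \\ a_2 \\ a_3 \\ p \end{bmatrix}
= Z_2'R^{-1}y
$$

The Z matrices relate each observation to one animal, and therefore $Z_2'R^{-1}Z_1$ and $Z_2'R^{-1}Z_3$ are null matrices, which simplifies the equation to:

$$
Z_2'R^{-1}Xb + (A^{21} \otimes G^{-1})\,a_1 + (Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1})\,a_2 + (A^{23} \otimes G^{-1})\,a_3 + Z_2'R^{-1}Wp = Z_2'R^{-1}y
$$

This can be rewritten as:

$$
(Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1})\,a_2 = Z_2'R^{-1}(y - Xb - Wp) - (A^{21} \otimes G^{-1})\,a_1 - (A^{23} \otimes G^{-1})\,a_3
$$

The matrix on the left-hand side is the diagonal block of the animal model equations for the animal whose proof we want to partition. If we use:

$$
D = Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1}
$$

then we can calculate $a_2$ as:

$$
a_2 = D^{-1}Z_2'R^{-1}(y - Xb - Wp) - D^{-1}(A^{21} \otimes G^{-1})\,a_1 - D^{-1}(A^{23} \otimes G^{-1})\,a_3
$$

In this calculation:

$(y - Xb - Wp)$ are the TD records adjusted for all non-genetic effects
$D^{-1}Z_2'R^{-1}$ are the contributions from TD records
$-D^{-1}(A^{21} \otimes G^{-1})$ are the contributions from breeding values in $a_1$
$-D^{-1}(A^{23} \otimes G^{-1})$ are the contributions from breeding values in $a_3$

The three contribution matrices are large, but most of their elements are 0. The data contribution matrix has non-zero values only for TD records on animal i.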
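The partition of $a_2$ derived above can be checked numerically. The following is a minimal sketch, not the CTDM itself: it assumes a single-trait model with one overall mean, no permanent environment effect and no inbreeding, so that $G^{-1}$ reduces to a scalar variance ratio. The pedigree, the records, the ratio `lam` and the helper names `solve` and `a_inverse` are all hypothetical and chosen only for illustration.

```python
# Toy single-trait animal model: y = mu + a + e (hypothetical data throughout).
# The MME are multiplied through by the residual variance, so R^-1 becomes 1
# and G^-1 becomes lam = residual variance / genetic variance.

def solve(lhs, rhs):
    """Solve lhs * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(lhs)
    aug = [row[:] + [r] for row, r in zip(lhs, rhs)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * p for x, p in zip(aug[r], aug[c])]
    return [aug[r][n] / aug[r][r] for r in range(n)]


def a_inverse(pedigree):
    """Henderson's rules for A^-1 without inbreeding; None is an unknown parent."""
    n = len(pedigree)
    ainv = [[0.0] * n for _ in range(n)]
    for animal, parents in enumerate(pedigree):
        known = [p for p in parents if p is not None]
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(known)]
        ainv[animal][animal] += alpha
        for p in known:
            ainv[animal][p] -= alpha / 2
            ainv[p][animal] -= alpha / 2
            for q in known:
                ainv[p][q] += alpha / 4
    return ainv


# Animals in birth order: 0 sire, 1 dam, 2 animal i, 3 mate, 4 progeny of 2 and 3.
pedigree = [(None, None), (None, None), (0, 1), (None, None), (2, 3)]
records = [(1, 9.0), (2, 11.0), (2, 12.5), (4, 10.0)]  # (animal, observation)
lam = 2.0  # assumed variance ratio
i = 2      # animal whose breeding value is partitioned

n = len(pedigree)
ainv = a_inverse(pedigree)

# Build the MME for the unknowns [mu, a_0, ..., a_(n-1)].
lhs = [[0.0] * (n + 1) for _ in range(n + 1)]
rhs = [0.0] * (n + 1)
for animal, y in records:
    lhs[0][0] += 1.0
    lhs[0][animal + 1] += 1.0
    lhs[animal + 1][0] += 1.0
    lhs[animal + 1][animal + 1] += 1.0
    rhs[0] += y
    rhs[animal + 1] += y
for r in range(n):
    for c in range(n):
        lhs[r + 1][c + 1] += lam * ainv[r][c]

solution = solve(lhs, rhs)
mu, a = solution[0], solution[1:]

# Partition the solution for animal i into its three sources.
own = [y for animal, y in records if animal == i]
d = len(own) + lam * ainv[i][i]                 # scalar version of D
from_data = sum(y - mu for y in own) / d        # records adjusted for the mean
from_older = -lam * sum(ainv[i][j] * a[j] for j in range(i)) / d
from_younger = -lam * sum(ainv[i][j] * a[j] for j in range(i + 1, n)) / d

print(a[i], from_data + from_older + from_younger)
```

The two printed values should agree up to rounding, which shows that the three contributions add up to the solution obtained from the full equations. In the multi-trait case the scalars `d` and `lam` become the matrices $D$ and $G^{-1}$, but the structure of the calculation is unchanged.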
$A^{21}$ has non-zero values only for the parents of animal i, and $A^{23}$ has non-zero values only for the progeny and mates of animal i. If both parents and all mates are known, the sum of the elements in $-A^{21}$ and $-A^{23}$ is equal to $A^{22}$, which means that the weight on parent average vs. progeny is the same as in a single trait animal model. If, in addition, all animals have the same level of inbreeding, the ratio of the weight on PA to the weight on progeny contributions is 1 to (number of progeny)/4. The total weight on breeding values from other animals can be rewritten as $D^{-1}(A^{22} \otimes G^{-1})$ (this would also account for Mendelian sampling influences from unknown parents/mates). Pedigree contributions can be partitioned further based on the elements in $-A^{21}$ and $-A^{23}$.

Calculating weights for animals with test day records

In a single trait animal model the weights on the data and the weights on the pedigree contributions are proportional to the genetic standard deviation of the contribution from each source. Therefore, the weights on pedigree and data contributions in a multi trait model were calculated from the genetic variance of each source that contributes to the solution.

The genetic (co)variance matrix of the data records for animal i is:

$$
Z_2 G Z_2'
$$

The genetic (co)variance of the data contributions to each parameter is:

$$
G_{dat} = D^{-1} Z_2' R^{-1} Z_2 \, G \, Z_2' R^{-1} Z_2 D^{-1}
$$

The genetic (co)variance of the total pedigree contributions to each parameter is:

$$
G_{ped} = D^{-1} (A^{22} \otimes G^{-1}) \, G \, (A^{22} \otimes G^{-1}) D^{-1}
$$

These matrices can be pre- and post-multiplied by a weight vector to get the contributions to the variance for traits of interest. For example, for first lactation milk yield in the CTDM use the weight vector: W = [ 305 46665 19.5 0 ....
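0 ]'

The standard deviations of the contributions are:

Data contribution: $S_{dat} = \sqrt{W' G_{dat} W}$
Pedigree contribution: $S_{ped} = \sqrt{W' G_{ped} W}$

The weight on data is: $W_{dat} = S_{dat} / (S_{dat} + S_{ped})$
The weight on pedigree is: $W_{ped} = S_{ped} / (S_{dat} + S_{ped})$

Calculating weights for progeny test day records

For bulls the main interest is the weight put on the progeny test day records relative to all other sources that contribute to the bull's proof. This weight is affected by the weight on the progeny breeding values relative to the parent average and by the weight the test day records received when the progeny breeding values were calculated. Both of these weights were calculated in the previous section. We want to partition the contribution from the progeny breeding values to the breeding value of a specific bull into three parts: the contribution from the parents (bull and mate), with weight $W_{pa}$; the contribution from the daughters' test day records (DATA), with weight $W_{data}$; and contributions from any other source (OTHER, which would be the grand progeny of the bull), with weight $W_{other} = 1 - W_{pa} - W_{data}$. The progeny contribution to the bull's breeding value can be written as:

$$
\begin{aligned}
&\frac{1}{n}\sum_{i=1}^{n}\left(2\,\mathrm{EBV}_{prog} - \mathrm{EBV}_{mate}\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left[2\left(W_{pa} \cdot 0.5 \cdot (\mathrm{EBV}_{bull} + \mathrm{EBV}_{mate}) + W_{data} \cdot \mathrm{DATA} + (1 - W_{pa} - W_{data}) \cdot \mathrm{OTHER}\right) - \mathrm{EBV}_{mate}\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\left[W_{pa} \cdot \mathrm{EBV}_{bull} + (W_{pa} - 1) \cdot \mathrm{EBV}_{mate} + W_{data} \cdot 2 \cdot \mathrm{DATA} + (1 - W_{pa} - W_{data}) \cdot 2 \cdot \mathrm{OTHER}\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\left[W_{pa} \cdot \mathrm{EBV}_{bull} + W_{data} \cdot (2 \cdot \mathrm{DATA} - \mathrm{EBV}_{mate}) + (1 - W_{pa} - W_{data}) \cdot (2 \cdot \mathrm{OTHER} - \mathrm{EBV}_{mate})\right] \\
&= W_{pa} \cdot \mathrm{EBV}_{bull} + W_{data} \cdot \mathrm{PD} + (1 - W_{pa} - W_{data}) \cdot \mathrm{PO}
\end{aligned}
$$

where the sums run over the n daughters of the bull and

$$
\mathrm{PD} = \frac{1}{n}\sum_{i=1}^{n}\left(2 \cdot \mathrm{DATA} - \mathrm{EBV}_{mate}\right), \qquad
\mathrm{PO} = \frac{1}{n}\sum_{i=1}^{n}\left(2 \cdot \mathrm{OTHER} - \mathrm{EBV}_{mate}\right)
$$

These contributions are multiplied by the weight on the progeny contributions ($W_{prog}$) in the bull's breeding value.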
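The algebra in the progeny contribution above can be verified with a few lines of arithmetic. The sketch below uses made-up weights and values for three daughters (none of these numbers come from the CTDM) and assumes that PD and PO are averages over the daughters. The function name `ebv_prog` is hypothetical.

```python
# Hypothetical weights and values; none of these numbers come from the CTDM.
w_pa, w_data = 0.4, 0.5
w_other = 1.0 - w_pa - w_data
ebv_bull = 1.2
# One tuple per daughter: (EBV of mate, DATA, OTHER)
daughters = [(0.3, 1.0, 0.8), (-0.5, 0.2, 0.1), (0.9, 1.6, 1.1)]
n = len(daughters)


def ebv_prog(mate, data, other):
    """Daughter EBV as a weighted sum of parent average, DATA and OTHER."""
    return w_pa * 0.5 * (ebv_bull + mate) + w_data * data + w_other * other


# First line of the derivation: average of 2 * EBV_prog - EBV_mate.
direct = sum(2 * ebv_prog(m, d, o) - m for m, d, o in daughters) / n

# Last line: W_pa * EBV_bull + W_data * PD + (1 - W_pa - W_data) * PO.
pd = sum(2 * d - m for m, d, o in daughters) / n
po = sum(2 * o - m for m, d, o in daughters) / n
partitioned = w_pa * ebv_bull + w_data * pd + w_other * po

print(direct, partitioned)
```

Both printed values should be equal up to rounding, whatever weights and daughter values are substituted, because the mate terms cancel exactly as in the derivation.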
The bull's breeding value can then be calculated as:

$$
\mathrm{EBV}_{bull} = (1 - W_{prog}) \cdot 0.5 \cdot (\mathrm{EBV}_{sire} + \mathrm{EBV}_{dam}) + W_{prog} \cdot \left(W_{pa} \cdot \mathrm{EBV}_{bull} + W_{data} \cdot \mathrm{PD} + (1 - W_{pa} - W_{data}) \cdot \mathrm{PO}\right)
$$

Because $\mathrm{EBV}_{bull}$ appears on both sides, its terms are collected on the left-hand side and the equation is divided by $(1 - W_{prog} W_{pa})$. The weight on the test day records of progeny is then:

$$
W_{progdata} = \frac{W_{prog} \, W_{data}}{1 - W_{prog} \, W_{pa}}
$$

Results

I have included an Excel file with graphs for weights on milk yield, for combined milk, persistency and SCS, and for weights in the lactation model. These graphs show the results for one cow that has no progeny and whose parents are both known and not inbred. This cow starts with only one TD record at 20 DIM and adds one TD record every 30 days. All TD records are from 24 hour recording with M, F, P and SCS observed. The weight put on the TD record(s) for estimating each breeding value was calculated after each TD record was added.

When looking at the graph with weights on milk yield:

When the cow has her first TD record at 20 DIM, this TD record receives a weight of .25 when estimating the 1st lactation proof, a weight of .13 when estimating the 3rd lactation proof and a weight of .19 when estimating the combined proof.

When the cow adds her second TD record at 50 DIM, there is a total weight of .48 on these two test day records when estimating 1st lactation yield, a weight of .38 when estimating combined yield and a weight of .26 when estimating 3rd lactation milk yield.

When the cow has her first TD record in second lactation at 20 DIM, she also has 10 TD records in first lactation. There is a total weight of .51 on TD records when estimating 2nd lactation milk yield, a weight of .43 when estimating 3rd lactation milk yield and a weight of .63 when estimating combined milk yield. This is a total weight on all TD records; weights on the 10 1st lactation records and on the single 2nd lactation record are not calculated separately.

Weights on the data for fat and protein yields show a similar trend to the weights on data for milk yield but are lower.
Weights on the data are higher than in the lactation model, which is due to a combination of the difference in models (test day records vs. lactation records) and a larger ratio of genetic to residual variance in the test day model.

Weights on daughter records were calculated for a bull with both parents and all mates known and not inbred. Weights on daughter data records when calculating combined milk yield were estimated for some scenarios similar to those used by Liu et al. (Interbull bulletin no. 22, p81-87). Liu et al. used a different method to calculate weights and used the model and parameters of the German test day model, which is not a random regression model.

No. of daughters                                         50    100    100    100
No. of tests per lactation
  First                                                   3      8     10     10
  Second                                                  0      0     10     10
  Third                                                   0      0      0     10
Weight on data when estimating combined milk yield     .846   .938   .951   .958

Liu et al. results (8 tests for a complete lactation), Best Case / Worst Case: .56 .58 .59 .60 .72 .74 .95 .97

This table shows that the method described in this document estimates weights on the data that are much higher than those reported by Liu et al. I think this is due to the difference in the way the weights are calculated, but differences in (co)variances and in the models used could also account for (part of) the differences.