Partitioning breeding values in animal models
Gerrit J. Kistemaker
The Canadian test day model (CTDM) has been used since February 1999. This model is a 12-trait
model with milk, fat, protein and SCS in each of the first three lactations. Breeding values
within each trait are described using the Wilmink curve. As a result this model has 36 genetic
effects for each animal (three curve parameters for each of the 12 traits). These genetic effects
are used to calculate lactation and combined breeding values for milk, fat, protein, SCS and
persistency. Observations made in first lactation will affect breeding values in later lactations,
but the weight put on an animal's own records relative to the parent average and/or progeny
contributions has not been calculated. The purpose of this paper is to partition an animal's
breeding value with respect to the three sources of information (observations, parent average and
progeny) which contribute to it, and then to calculate the weight on the data relative to the
pedigree contributions. This paper uses the CTDM as an example but the derivation applies to all
animal models.
Partitioning breeding values
The MME for the CTDM which is used at CDN can be written as:

$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z & X'R^{-1}W \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1} \otimes G^{-1} & Z'R^{-1}W \\
W'R^{-1}X & W'R^{-1}Z & W'R^{-1}W + I^{-1} \otimes P^{-1}
\end{bmatrix}
\begin{bmatrix} b \\ a \\ p \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \\ W'R^{-1}y \end{bmatrix}
$$

In order to derive the partition of breeding values, animals are sorted by birth date. To partition
the breeding values of animal i, divide all animals into three groups:
1: All animals older than animal i (this includes the parents of animal i)
2: Animal i
3: All animals younger than animal i
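These three groups of animals are used to partition Z, a and A^{-1}:

$$
Z = \begin{bmatrix} Z_1 & Z_2 & Z_3 \end{bmatrix}, \qquad
a' = \begin{bmatrix} a_1' & a_2' & a_3' \end{bmatrix}, \qquad
A^{-1} =
\begin{bmatrix}
A^{11} & A^{12} & A^{13} \\
A^{21} & A^{22} & A^{23} \\
A^{31} & A^{32} & A^{33}
\end{bmatrix}
$$
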
Using this partitioning the MME can be written as:

$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z_1 & X'R^{-1}Z_2 & X'R^{-1}Z_3 & X'R^{-1}W \\
Z_1'R^{-1}X & Z_1'R^{-1}Z_1 + A^{11} \otimes G^{-1} & Z_1'R^{-1}Z_2 + A^{12} \otimes G^{-1} & Z_1'R^{-1}Z_3 + A^{13} \otimes G^{-1} & Z_1'R^{-1}W \\
Z_2'R^{-1}X & Z_2'R^{-1}Z_1 + A^{21} \otimes G^{-1} & Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1} & Z_2'R^{-1}Z_3 + A^{23} \otimes G^{-1} & Z_2'R^{-1}W \\
Z_3'R^{-1}X & Z_3'R^{-1}Z_1 + A^{31} \otimes G^{-1} & Z_3'R^{-1}Z_2 + A^{32} \otimes G^{-1} & Z_3'R^{-1}Z_3 + A^{33} \otimes G^{-1} & Z_3'R^{-1}W \\
W'R^{-1}X & W'R^{-1}Z_1 & W'R^{-1}Z_2 & W'R^{-1}Z_3 & W'R^{-1}W + I^{-1} \otimes P^{-1}
\end{bmatrix}
\begin{bmatrix} b \\ a_1 \\ a_2 \\ a_3 \\ p \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z_1'R^{-1}y \\ Z_2'R^{-1}y \\ Z_3'R^{-1}y \\ W'R^{-1}y \end{bmatrix}
$$

We want to partition $a_2$ and therefore the part of interest is:

$$
\begin{bmatrix}
Z_2'R^{-1}X & Z_2'R^{-1}Z_1 + A^{21} \otimes G^{-1} & Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1} & Z_2'R^{-1}Z_3 + A^{23} \otimes G^{-1} & Z_2'R^{-1}W
\end{bmatrix}
\begin{bmatrix} b \\ a_1 \\ a_2 \\ a_3 \\ p \end{bmatrix}
= Z_2'R^{-1}y
$$

The Z matrices relate each observation to one animal and therefore $Z_2'R^{-1}Z_1$ and
$Z_2'R^{-1}Z_3$ are null matrices, which simplifies the equation to:

$$
Z_2'R^{-1}Xb + (A^{21} \otimes G^{-1})a_1 + (Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1})a_2 + (A^{23} \otimes G^{-1})a_3 + Z_2'R^{-1}Wp = Z_2'R^{-1}y
$$

This can be rewritten as:

$$
(Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1})a_2 = Z_2'R^{-1}(y - Xb - Wp) - (A^{21} \otimes G^{-1})a_1 - (A^{23} \otimes G^{-1})a_3
$$

The matrix on the left-hand side is the diagonal block for the animal whose proofs we want to
partition, as it is used in the animal model. If we define:

$$
D = Z_2'R^{-1}Z_2 + A^{22} \otimes G^{-1}
$$

then we can calculate $a_2$ as:

$$
a_2 = D^{-1}Z_2'R^{-1}(y - Xb - Wp) - D^{-1}(A^{21} \otimes G^{-1})a_1 - D^{-1}(A^{23} \otimes G^{-1})a_3
$$

In this calculation:
$D^{-1}Z_2'R^{-1}$ are the contributions from TD records
$(y - Xb - Wp)$ are TD records adjusted for all non-genetic effects
$-D^{-1}(A^{21} \otimes G^{-1})$ are the contributions from breeding values in $a_1$
$-D^{-1}(A^{23} \otimes G^{-1})$ are the contributions from breeding values in $a_3$
The three contribution matrices are large but most values are 0. There are only non-zero values
for TD records on animal i. $A^{21}$ has non-zero values only for the parents of animal i, and
$A^{23}$ has non-zero values only for the progeny and mates of animal i. If both parents and all
mates are known then the sum of the elements in $-A^{21}$ and $-A^{23}$ is equal to $A^{22}$,
which means that the weight on parent average vs. progeny is the same as in a single trait animal
model. If both parents and all mates are known and all animals have the same level of inbreeding
then we get a ratio of 1 on PA and #progeny/4 on progeny contributions. For non-inbred animals,
for example, the elements of $-A^{21}$ sum to 2 and those of $-A^{23}$ sum to #progeny/2. The
total weight on breeding values from other animals can be rewritten as
$D^{-1}(A^{22} \otimes G^{-1})$ (this would also account for Mendelian sampling influences from
unknown parents/mates). Pedigree contributions can be partitioned further based on the elements
in $-A^{21}$ and $-A^{23}$.
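
The partition can be checked numerically on a small example. The sketch below is an illustration only, not the CTDM: it assumes a single-trait animal model (so G and R reduce to the scalars `sigma_a2` and `sigma_e2`), drops the permanent environment effect p, ignores inbreeding when building A^{-1}, and uses a made-up pedigree, records and variances. The function `a_inverse` and all variable names are hypothetical.

```python
import numpy as np

def a_inverse(pedigree):
    """Build A^{-1} with Henderson's rules, ignoring inbreeding.

    pedigree: list of (sire, dam) indices, None for an unknown parent;
    parents must appear before their progeny.
    """
    n = len(pedigree)
    ainv = np.zeros((n, n))
    for i, (sire, dam) in enumerate(pedigree):
        parents = [p for p in (sire, dam) if p is not None]
        alpha = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}[len(parents)]
        ainv[i, i] += alpha
        for p in parents:
            ainv[i, p] -= alpha / 2
            ainv[p, i] -= alpha / 2
            for q in parents:
                ainv[p, q] += alpha / 4
    return ainv

# Birth order: 0 and 1 are the parents of animal 2, 3 is its mate, 4 its progeny.
pedigree = [(None, None), (None, None), (0, 1), (None, None), (2, 3)]
ainv = a_inverse(pedigree)
sigma_a2, sigma_e2 = 0.3, 0.7          # hypothetical variances

rec_animal = np.array([0, 2, 2, 4])    # animal index of each record
y = np.array([9.1, 10.2, 9.8, 11.5])   # hypothetical records
X = np.ones((y.size, 1))               # overall mean as the only fixed effect
Z = np.zeros((y.size, len(pedigree)))
Z[np.arange(y.size), rec_animal] = 1.0

# Mixed model equations without the permanent environment effect.
lhs = np.block([
    [X.T @ X / sigma_e2, X.T @ Z / sigma_e2],
    [Z.T @ X / sigma_e2, Z.T @ Z / sigma_e2 + ainv / sigma_a2],
])
rhs = np.concatenate([X.T @ y, Z.T @ y]) / sigma_e2
sol = np.linalg.solve(lhs, rhs)
b, a = sol[:1], sol[1:]

# Partition the solution of animal 2 (group 2) into its three sources.
i, older, younger = 2, [0, 1], [3, 4]
z2 = Z[:, i]
D = z2 @ z2 / sigma_e2 + ainv[i, i] / sigma_a2
from_data = z2 @ (y - X @ b) / sigma_e2 / D
from_parents = -(ainv[i, older] @ a[older]) / sigma_a2 / D
from_progeny = -(ainv[i, younger] @ a[younger]) / sigma_a2 / D

print(from_data, from_parents, from_progeny, a[i])
```

In this toy case the three parts add up to the solution for animal 2, and the elements of -A^{21} and -A^{23} (2 and 0.5) add up to A^{22} (2.5), as described above.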


Calculating weights for animals with test day records
In a single trait animal model the weights on the data and the weights on the pedigree
contributions are proportional to the genetic standard deviation of the contribution from each
source. Therefore, the weights on pedigree and data contributions in a multi trait model were
calculated from the genetic variance of each source which contributes to the solution.
The genetic (co)variance matrix of the data records for animal i is:

$$
Z_2 G Z_2'
$$

The genetic (co)variance of the data contributions to each parameter is:

$$
G_{dat} = D^{-1} Z_2' R^{-1} Z_2 G Z_2' R^{-1} Z_2 D^{-1}
$$

The genetic (co)variance of the total pedigree contributions to each parameter is:

$$
G_{ped} = D^{-1} (A^{22} \otimes G^{-1}) G (A^{22} \otimes G^{-1}) D^{-1}
$$

These matrices can be pre- and post-multiplied by a weight vector to get the contributions to the
variance for traits of interest. For example, for first lactation milk yield in the CTDM use the
weight vector:
W = [ 305 46665 19.5 0 ... 0 ]'
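
The first three entries of this vector are consistent with summing the Wilmink curve covariates over a 305-day lactation. The sketch below assumes covariates 1, t and exp(-0.05 t) summed over days in milk 1 to 305; the decay constant 0.05 and the day range are assumptions, since the text gives only the resulting vector.

```python
import numpy as np

days = np.arange(1, 306)  # days in milk 1..305 (assumed range)

# Sum each assumed Wilmink covariate over the lactation.
w_first_lactation_milk = np.array([
    days.size,                    # intercept covariate: 305
    days.sum(),                   # linear covariate: 46665
    np.exp(-0.05 * days).sum(),   # exponential covariate: about 19.5
])

print(w_first_lactation_milk)
```

The remaining 33 entries are zero because the other genetic effects do not enter first lactation milk yield.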
The standard deviations of the contributions are:
Data contribution: $S_{dat} = \sqrt{W' G_{dat} W}$
Pedigree contribution: $S_{ped} = \sqrt{W' G_{ped} W}$
The weight on data is: $W_{dat} = S_{dat} / (S_{dat} + S_{ped})$
The weight on pedigree is: $W_{ped} = S_{ped} / (S_{dat} + S_{ped})$
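
As a rough illustration of these steps, the sketch below computes the two weights for one trait with three curve parameters. It is not the CTDM: the genetic matrix `G`, the residual variance `sigma_e2`, the Wilmink covariates with decay constant 0.05, and the function name are all assumptions made for the example, and `a22 = 2` corresponds to a non-inbred cow with both parents known and no progeny.

```python
import numpy as np

def data_and_pedigree_weights(dim, G, sigma_e2, a22, w):
    """Return (weight on data, weight on pedigree) for one animal.

    dim: days in milk of the animal's test day records
    G: genetic (co)variance matrix of the curve parameters
    sigma_e2: residual variance (R is assumed to be sigma_e2 * I)
    a22: diagonal element of A^{-1} for the animal
    w: weight vector defining the trait of interest
    """
    # Rows of Z2 hold the assumed Wilmink covariates for each test day.
    Z2 = np.column_stack([np.ones_like(dim), dim, np.exp(-0.05 * dim)])
    ZRZ = Z2.T @ Z2 / sigma_e2
    G_inv = np.linalg.inv(G)
    D_inv = np.linalg.inv(ZRZ + a22 * G_inv)

    G_dat = D_inv @ ZRZ @ G @ ZRZ @ D_inv
    G_ped = D_inv @ (a22 * G_inv) @ G @ (a22 * G_inv) @ D_inv

    s_dat = np.sqrt(w @ G_dat @ w)
    s_ped = np.sqrt(w @ G_ped @ w)
    return s_dat / (s_dat + s_ped), s_ped / (s_dat + s_ped)

G = np.diag([10.0, 1e-4, 4.0])        # hypothetical genetic (co)variances
w = np.array([305.0, 46665.0, 19.5])  # 305-day yield weights
dim = np.array([20.0, 50.0, 80.0])    # three test days

w_dat, w_ped = data_and_pedigree_weights(dim, G, sigma_e2=9.0, a22=2.0, w=w)
print(w_dat, w_ped)
```

By construction the two weights sum to 1, so only their ratio depends on the parameters chosen.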
Calculating weights for progeny test day records
For bulls the main interest is the weight put on the progeny test day records relative to all
other sources which contribute to the bull's proof. This weight is affected by the weight on the
progeny breeding values relative to the parent average and by the weight which the test day
records received when the progeny breeding values were calculated. Both these weights were
calculated in the previous section.
We want to partition the contribution from the progeny breeding values to the breeding value of
a specific bull into three parts: the contribution from the parents Wpa (bull and mate), the
contribution from the daughters' test day records Wdata (DATA), and contributions from any other
source Wother (this would be the grand progeny of the bull). The progeny contribution to the
bull's breeding value can be written as:


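$$
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n}\left(2\,EBV_{prog} - EBV_{mate}\right)
&= \frac{1}{n}\sum_{i=1}^{n}\Big[2\big(W_{pa} \cdot 0.5 \cdot (EBV_{bull} + EBV_{mate}) + W_{data} \cdot DATA + (1 - W_{pa} - W_{data}) \cdot OTHER\big) - EBV_{mate}\Big] \\
&= \frac{1}{n}\sum_{i=1}^{n}\Big[W_{pa} \cdot EBV_{bull} + (W_{pa} - 1) \cdot EBV_{mate} + W_{data} \cdot 2 \cdot DATA + (1 - W_{pa} - W_{data}) \cdot 2 \cdot OTHER\Big] \\
&= \frac{1}{n}\sum_{i=1}^{n}\Big[W_{pa} \cdot EBV_{bull} + W_{data} \cdot (2 \cdot DATA - EBV_{mate}) + (1 - W_{pa} - W_{data}) \cdot (2 \cdot OTHER - EBV_{mate})\Big] \\
&= W_{pa} \cdot EBV_{bull} + W_{data} \cdot PD + (1 - W_{pa} - W_{data}) \cdot PO
\end{aligned}
$$

where $PD = (2 \cdot DATA - EBV_{mate})$ and $PO = (2 \cdot OTHER - EBV_{mate})$, averaged over the
n progeny.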
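These contributions are multiplied by the weight on the progeny contributions (Wprog) in the
bull's breeding value. The bull's breeding value can then be calculated as:

$$
EBV_{bull} = (1 - W_{prog}) \cdot 0.5 \cdot (EBV_{sire} + EBV_{dam}) + W_{prog} \cdot \big(W_{pa} \cdot EBV_{bull} + W_{data} \cdot PD + (1 - W_{pa} - W_{data}) \cdot PO\big)
$$

Collecting the $EBV_{bull}$ terms on the left-hand side and dividing by $(1 - W_{prog} W_{pa})$
gives the weight on the test day records of progeny:

$$
W_{progdata} = \frac{W_{prog} W_{data}}{1 - W_{prog} W_{pa}}
$$
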
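
The rearrangement can be sketched in a few lines. The function name and the input weights below are hypothetical; the weights on parent average and on other sources are not stated in the text but follow from the same rearrangement of the equation for the bull's breeding value.

```python
def bull_proof_weights(w_prog, w_pa, w_data):
    """Final weights on each source in the bull's breeding value.

    w_prog: weight on progeny contributions in the bull's breeding value
    w_pa:   weight on parent average in each progeny breeding value
    w_data: weight on test day records in each progeny breeding value
    """
    # Part of the progeny contribution is the bull's own EBV, so move it to
    # the left-hand side and rescale the remaining terms.
    denom = 1.0 - w_prog * w_pa
    return {
        "parent_average": (1.0 - w_prog) / denom,
        "progeny_data": w_prog * w_data / denom,
        "other": w_prog * (1.0 - w_pa - w_data) / denom,
    }

# Hypothetical input weights, for illustration only.
weights = bull_proof_weights(w_prog=0.9, w_pa=0.4, w_data=0.5)
print(weights)
```

With these inputs the weight on progeny test day records is 0.45 / 0.64, about 0.70, and the three final weights sum to 1.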
Results
I have included an Excel file with graphs for weights on milk yield and for combined milk,
persistency and SCS, and for weights in the lactation model. These graphs show the results for
one cow which has no progeny and whose parents are both known and not inbred. This cow starts
with only one TD record at 20 DIM and adds one TD record every 30 days; all TD records are from
24-hour recording with M, F, P and SCS observed. The weight put on the TD record(s) for
estimating each breeding value was calculated after each TD record was added. When looking at
the graph with weights on milk yield:

When the cow has her first TD record at 20 DIM this TD record receives a weight of .25
when estimating the 1st lactation proof, a weight of .13 when estimating the 3rd lactation proof
and a weight of .19 when estimating the combined proof.

When the cow adds her second TD record at 50 DIM there is a total weight of .48 on these
two test day records when estimating 1st lactation yield, a weight of .38 when estimating
combined yield and a weight of .26 when estimating 3rd lactation milk yield.

When the cow has her first TD record in second lactation at 20 DIM she also has 10 TD
records in first lactation. There is a total weight of .51 on TD records when estimating 2nd
lactation milk yield, a weight of .43 when estimating 3rd lactation milk yield and a weight of
.63 when estimating combined milk yield. This is a total weight on all TD records; weights on
the 10 1st lactation records and the single 2nd lactation record are not calculated separately.

Weights on the data for fat and protein yields show similar trends to the weights on the data
for milk yield but are lower. Weights on the data are higher than in the lactation model, which
is due to a combination of the difference in models (test day records vs. lactation records) and
a larger ratio of genetic to residual variance in the test day model.
Weights on daughter records were calculated for a bull with both parents and all mates known
and not inbred. Weights on daughter data records when calculating combined milk yield were
estimated for some scenarios which were similar to those used by Liu et al. (Interbull bulletin no.
22, p81-87). Liu et al. used a different method to calculate weights and used the model and
parameters as they were used in the German test day model which is not a random regression
model.
No. of      No. of tests per lactation    Weight on data when estimating
daughters   First   Second   Third        combined milk yield
50          3       0        0            .846
100         8       0        0            .938
100         10      10       0            .951
100         10      10       10           .958

Liu et al. results (8 tests for a complete lactation), Best Case and Worst Case, as listed:
.56  .58  .59  .60  .72  .74  .95  .97
This table shows that the method described in this document estimates weights on the data which
are much higher than those reported by Liu et al. I think this is due to the difference in the
way the weights are calculated, but differences in (co)variances and in the models used could
also account for (part of) the differences.