APPENDIX B: Details of the simulation study
In this appendix, we give the details of the simulations carried out in this paper.
We consider two unlinked QTLs whose genotypes are recoded additively as (-1, 0, 1).
We denote by p1 the minor allele frequency (MAF) of QTL 1 and by p2 the MAF of QTL 2.
Scenario 1: No population stratification
We start each simulation by generating genotypic data for the QTLs according to the
specified MAFs and the Mendelian laws for the required number of subjects. Then we
need to simulate the trait values according to the different settings.
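The genotype step can be sketched as follows for one QTL in a nuclear family. This is only an illustration under stated assumptions: founders are drawn under Hardy-Weinberg proportions, the code +1 denotes two copies of the minor allele, and the function names, seed and MAF value are hypothetical. Because the two QTLs are unlinked, the same routine would simply be called once per QTL.

```python
import random

def draw_founder_genotype(maf, rng):
    """Founder genotype coded (-1, 0, 1): number of minor alleles minus 1.

    Assumption: the two alleles are drawn independently (Hardy-Weinberg).
    """
    return sum(rng.random() < maf for _ in range(2)) - 1

def transmit_allele(genotype, rng):
    """Return 1 if the parent transmits the minor allele, else 0."""
    minor_count = genotype + 1           # 0, 1 or 2 minor alleles
    if minor_count == 1:                 # heterozygote: each allele with probability 1/2
        return int(rng.random() < 0.5)
    return minor_count // 2              # homozygote: transmission is deterministic

def simulate_nuclear_family(maf, n_offspring, rng):
    """Genotypes at one QTL for [father, mother, offspring...]."""
    father = draw_founder_genotype(maf, rng)
    mother = draw_founder_genotype(maf, rng)
    offspring = [transmit_allele(father, rng) + transmit_allele(mother, rng) - 1
                 for _ in range(n_offspring)]
    return [father, mother] + offspring

rng = random.Random(2024)                # hypothetical seed
family = simulate_nuclear_family(maf=0.3, n_offspring=2, rng=rng)
```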
When no population stratification is present, the trait values are simulated based on splitting
up the (total) variance V of the trait. We suppose that
V=Va1+Va2+Vaa+Ve+Vp
where Vai is the additive major gene variance explained by QTL i, Vaa is the additive
genetic variance explained by the gene-gene interaction between the two QTLs, Ve is the
(non-shared) environmental variance and Vp is the polygenic variance. To simulate the trait
values, we start by drawing a value from the univariate normal distribution with mean zero and
variance Vp for each founder (N(0, Vp)). For each offspring, we independently draw a value
from a univariate normal distribution with mean equal to the mean of the two parental
(polygenic) values and variance equal to Vp/2. Generating the polygenic component this way
yields a polygenic variance of Vp for every family member and a covariance of φ Vp between
the polygenic values of two family members, where φ equals twice their kinship coefficient.
We can show this as follows.
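Assume X = (X1, X2, X3, X4)^T is multivariate normally distributed, with parameters
\[ \mu_X = (0, 0, 0, 0)^T \]
and
\[ \Sigma_X = \operatorname{diag}\left(V_p,\; V_p,\; V_p/2,\; V_p/2\right). \]
We define
\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1/2 & 1/2 & 1 & 0 \\ 1/2 & 1/2 & 0 & 1 \end{pmatrix}. \]
Drawing from AX will correspond to simulating the polygenic component for a family with
2 parents and 2 offspring in the way we described above.
By a general property of the multivariate normal distribution it follows that AX is
multivariate normally distributed with mean
\[ A\mu_X = (0, 0, 0, 0)^T \]
and variance
\[ A\Sigma_X A^T = V_p \begin{pmatrix} 1 & 0 & 1/2 & 1/2 \\ 0 & 1 & 1/2 & 1/2 \\ 1/2 & 1/2 & 1 & 1/2 \\ 1/2 & 1/2 & 1/2 & 1 \end{pmatrix}. \]
The correlation matrix is thus twice the kinship matrix as stated before.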
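The claim can be checked with exact arithmetic. The sketch below assumes one parametrization consistent with the description: X holds the two founder draws (variance Vp) and two independent offspring deviations (variance Vp/2), and A adds the parental mean to each offspring deviation. Variances are expressed in units of Vp, and all names are illustrative.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

half, quarter = F(1, 2), F(1, 4)

# Covariance of X in units of Vp: founders 1, offspring deviations 1/2 (assumed layout).
sigma_x = [[F(1), 0, 0, 0],
           [0, F(1), 0, 0],
           [0, 0, half, 0],
           [0, 0, 0, half]]

# Rows: father, mother, offspring 1, offspring 2.
A = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [half, half, 1, 0],
     [half, half, 0, 1]]

cov_ax = matmul(matmul(A, sigma_x), transpose(A))

# Kinship matrix of a nuclear family with unrelated, non-inbred parents.
kinship = [[half, 0, quarter, quarter],
           [0, half, quarter, quarter],
           [quarter, quarter, half, quarter],
           [quarter, quarter, quarter, half]]
twice_kinship = [[2 * x for x in row] for row in kinship]
```

Comparing `cov_ax` with `twice_kinship` entry by entry confirms that every family member has variance Vp and that parent-offspring and sib-sib covariances equal Vp/2.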
To further simulate the trait values, we add to the simulated value of each subject an
independent draw from the univariate normal distribution with mean zero and variance Ve
(N(0, Ve)).
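A minimal sketch of the polygenic and environmental steps for a family with two parents, using only the standard library. The variance values (Vp = 0.4, Ve = 0.6), the seed, the number of families and the function names are hypothetical choices for illustration.

```python
import random

def simulate_background(v_p, v_e, n_offspring, rng):
    """Polygenic plus environmental component for [father, mother, offspring...]."""
    founders = [rng.gauss(0.0, v_p ** 0.5) for _ in range(2)]
    parental_mean = sum(founders) / 2
    # Offspring: N(mean of parental polygenic values, Vp / 2), drawn independently.
    offspring = [rng.gauss(parental_mean, (v_p / 2) ** 0.5)
                 for _ in range(n_offspring)]
    # Non-shared environment: an independent N(0, Ve) draw for every subject.
    return [g + rng.gauss(0.0, v_e ** 0.5) for g in founders + offspring]

def sample_cov(x, y):
    """Unbiased sample covariance of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

rng = random.Random(1)                    # hypothetical seed
families = [simulate_background(v_p=0.4, v_e=0.6, n_offspring=2, rng=rng)
            for _ in range(20000)]
sib1 = [f[2] for f in families]
sib2 = [f[3] for f in families]
var_sib = sample_cov(sib1, sib1)          # expected near Vp + Ve = 1.0
cov_sibs = sample_cov(sib1, sib2)         # expected near Vp / 2 = 0.2
```

With this many families the empirical values should lie close to Vp + Ve and Vp/2, matching the covariance structure derived above.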
In a final step, we add values based on the following model:
\[ E[Y_i \mid G_{1i}, G_{2i}] = \mu + a_1 G_{1i} + a_2 G_{2i} + a_{12} G_{1i} G_{2i}. \tag{1} \]
We can calculate a1, a2 and a12 from the given additive genetic variances Va1, Va2 and Vaa.
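The following formulae come from [1]:
\[
\begin{aligned}
V_{a1} &= 2p_1(1-p_1)\,\bigl[a_1 + (p_2 - (1-p_2))\,a_{12}\bigr]^2 \\
V_{a2} &= 2p_2(1-p_2)\,\bigl[a_2 + (p_1 - (1-p_1))\,a_{12}\bigr]^2 \\
V_{aa} &= 4p_1(1-p_1)\,p_2(1-p_2)\,a_{12}^2
\end{aligned}
\]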
From those, we can derive that:
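One way to invert the three variance formulae is sketched below. It assumes that the positive square roots are taken; the variances alone do not determine the signs of the effects, so this choice is an assumption rather than something stated in the text.

```latex
\begin{aligned}
a_{12} &= \sqrt{\frac{V_{aa}}{4p_1(1-p_1)\,p_2(1-p_2)}} \\
a_1 &= \sqrt{\frac{V_{a1}}{2p_1(1-p_1)}} - \bigl(p_2 - (1-p_2)\bigr)\,a_{12} \\
a_2 &= \sqrt{\frac{V_{a2}}{2p_2(1-p_2)}} - \bigl(p_1 - (1-p_1)\bigr)\,a_{12}
\end{aligned}
```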
In model (1), we specify values a1, a2 and a12 according to these formulae and set µ=0.
We calculate E[Yi | G1i , G2i ] based on model (1) for each subject and add these values to the
already simulated trait values based on Vp and Ve. This part of the trait value
adds Va1+Va2+Vaa to the variance of the trait values and π1Va1 + π2Va2 + (π1 ∘ π2)Vaa to the
covariance of the trait values of two individuals in the family, where π1 is the proportion of
alleles shared IBD at locus 1 (analogously for π2) and ∘ is the Hadamard product
(element-wise multiplication) of two matrices.
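The variance statement can be verified numerically. The sketch below solves the variance formulae for the effects (taking positive square roots, which is an assumption), then enumerates the nine two-locus genotypes under Hardy-Weinberg proportions, with +1 coding two minor alleles, to obtain the exact variance of E[Y | G1, G2]. The variance and MAF values are hypothetical.

```python
from itertools import product
from math import sqrt

def effects_from_variances(v_a1, v_a2, v_aa, p1, p2):
    """Solve the variance formulae for (a1, a2, a12); positive roots are an assumption."""
    a12 = sqrt(v_aa / (4 * p1 * (1 - p1) * p2 * (1 - p2)))
    a1 = sqrt(v_a1 / (2 * p1 * (1 - p1))) - (2 * p2 - 1) * a12
    a2 = sqrt(v_a2 / (2 * p2 * (1 - p2))) - (2 * p1 - 1) * a12
    return a1, a2, a12

def genotype_probs(p):
    """Hardy-Weinberg probabilities for genotypes coded (-1, 0, 1)."""
    return {-1: (1 - p) ** 2, 0: 2 * p * (1 - p), 1: p ** 2}

def genetic_variance(a1, a2, a12, p1, p2, mu=0.0):
    """Exact variance of E[Y | G1, G2] under model (1) for two independent loci."""
    mean = second = 0.0
    for (g1, w1), (g2, w2) in product(genotype_probs(p1).items(),
                                      genotype_probs(p2).items()):
        y = mu + a1 * g1 + a2 * g2 + a12 * g1 * g2
        mean += w1 * w2 * y
        second += w1 * w2 * y * y
    return second - mean ** 2

# Hypothetical variance components and MAFs.
a1, a2, a12 = effects_from_variances(0.10, 0.05, 0.05, p1=0.3, p2=0.2)
total = genetic_variance(a1, a2, a12, p1=0.3, p2=0.2)
```

Up to floating-point error, `total` equals Va1 + Va2 + Vaa, because the main-effect and interaction terms are uncorrelated once the genotypes are centred.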
We conclude that simulating the trait values as described above results in the following
variance-covariance matrix for the trait values of individuals j and k of family i:
\[
\Omega_{ijk} =
\begin{cases}
V_{a1} + V_{a2} + V_{aa} + V_p + V_e & (j = k) \\
\pi^1_{ijk} V_{a1} + \pi^2_{ijk} V_{a2} + \pi^1_{ijk}\pi^2_{ijk} V_{aa} + \varphi_{ijk} V_p & (j \neq k)
\end{cases}
\]
as is described in the paper.
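As an illustration, the matrix can be assembled from IBD-sharing and kinship inputs. The function name, the example sib pair and all numerical values below are hypothetical; the matrices are plain lists of lists, and the element-wise product of the two IBD matrices is formed entry by entry.

```python
def trait_covariance_matrix(pi1, pi2, phi, v_a1, v_a2, v_aa, v_p, v_e):
    """Variance-covariance matrix of the trait values within one family.

    pi1, pi2: proportions of alleles shared IBD at QTL 1 and QTL 2 (n x n).
    phi:      twice the kinship coefficient (n x n).
    """
    n = len(phi)
    total = v_a1 + v_a2 + v_aa + v_p + v_e
    return [[total if j == k else
             pi1[j][k] * v_a1 + pi2[j][k] * v_a2
             + pi1[j][k] * pi2[j][k] * v_aa + phi[j][k] * v_p
             for k in range(n)] for j in range(n)]

# Hypothetical sib pair: one allele shared IBD at QTL 1, two at QTL 2.
pi1 = [[1.0, 0.5], [0.5, 1.0]]
pi2 = [[1.0, 1.0], [1.0, 1.0]]
phi = [[1.0, 0.5], [0.5, 1.0]]
omega = trait_covariance_matrix(pi1, pi2, phi,
                                v_a1=0.10, v_a2=0.05, v_aa=0.05,
                                v_p=0.40, v_e=0.40)
```

Here the diagonal entries equal the total variance and the off-diagonal entry is 0.5(0.10) + 1(0.05) + 0.5(0.05) + 0.5(0.40) = 0.325.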
We choose to simulate the data this way for comparison with the QTDT paper [2].
Scenario 2: Population Stratification
When simulating data in the presence of population stratification, we also start by
simulating genotypic data according to the specified MAFs of the QTLs and the
Mendelian laws for all subjects. The difference is that the MAFs of the QTLs now differ
by stratum. We simulate two strata and divide the families equally
over them. We do not take admixture into account. The specified MAFs are
0.1/0.3/0.4 for stratum 1 and 0.5 for stratum 2.
To simulate the trait values, we specify the effects a1, a2 and a12 in model (1) instead of the
variance decomposition of the trait. This way, we can measure the bias in the estimated
coefficients of the epiQTDT method. We keep these parameters fixed for the two strata, but
specify µ=1 for stratum 1 and µ=10 for stratum 2. We also specify the
variances Ve and Vp.
The simulation of the trait values is analogous to the case without population stratification,
based on the specified a1, a2, a12, Ve and Vp. We note that the additive genetic variances
Va1, Va2 and Vaa will differ between the two strata, since the MAFs of both QTLs differ.
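The last remark can be illustrated by evaluating the variance formulae of [1] with the same effects in both strata. The effect sizes below are hypothetical, and the sketch assumes that both QTLs share the same MAF within a stratum (0.1 in stratum 1, 0.5 in stratum 2); the text does not spell out this mapping.

```python
def variance_components(a1, a2, a12, p1, p2):
    """Additive and additive-by-additive variances implied by fixed effects."""
    v_a1 = 2 * p1 * (1 - p1) * (a1 + (2 * p2 - 1) * a12) ** 2
    v_a2 = 2 * p2 * (1 - p2) * (a2 + (2 * p1 - 1) * a12) ** 2
    v_aa = 4 * p1 * (1 - p1) * p2 * (1 - p2) * a12 ** 2
    return v_a1, v_a2, v_aa

effects = dict(a1=0.5, a2=0.3, a12=0.2)      # hypothetical effect sizes

# Assumption: both QTLs have the stratum's MAF.
stratum1 = variance_components(p1=0.1, p2=0.1, **effects)
stratum2 = variance_components(p1=0.5, p2=0.5, **effects)
```

With these inputs the components are roughly (0.0208, 0.0035, 0.0013) in stratum 1 and (0.125, 0.045, 0.01) in stratum 2, so identical effects imply quite different variance decompositions.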
1. Tiwari HK: Deriving components of genetic variance for multilocus models. Genet Epidemiol 1997; 14: 1131-1136.
2. Abecasis GR, Cardon LR, Cookson WO: A general test of association for quantitative traits in nuclear families. Am J Hum Genet 2000; 66: 279-292.