Link between non-centrality parameter and effect size for the
region-based score test
Adopting the notation of the article, let us consider the elements of the score vector

$$U_j = \sum_{i=1}^{N} (y_i - \bar{y})(g_{ij} - \bar{g}_j), \quad j = 1, \dots, L.$$

Denote the number of cases, the number of controls and the total sample size as $N_A$, $N_U$ and $N = N_A + N_U$. After algebraic transformations it can be shown that:

$$U_j = 2 N_A N_U (\hat{p}_j^{+} - \hat{p}_j^{-}) / N, \tag{1}$$

where $\hat{p}_j^{+}$ and $\hat{p}_j^{-}$ are the observed MAF in cases and controls, respectively, for the $j$th SNP.
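The algebraic transformations behind (1) can be sketched as follows. This is a reconstruction under assumptions that the text does not state explicitly: the phenotype is coded $y_i = 1$ for cases and $y_i = 0$ for controls (so that $\bar{y} = N_A / N$), and $g_{ij}$ counts minor alleles (0, 1 or 2), so that the observed MAF in cases is $\hat{p}_j^{+} = \sum_{i \in \text{cases}} g_{ij} / (2 N_A)$ and similarly for controls.

```latex
\begin{align*}
U_j &= \sum_{i=1}^{N} (y_i - \bar{y})(g_{ij} - \bar{g}_j)
     = \sum_{i=1}^{N} (y_i - \bar{y})\, g_{ij}
     && \text{since } \sum_{i=1}^{N} (y_i - \bar{y}) = 0 \\
    &= \sum_{i \in \text{cases}} g_{ij} - \frac{N_A}{N} \sum_{i=1}^{N} g_{ij} \\
    &= 2 N_A \hat{p}_j^{+}
       - \frac{N_A}{N}\left(2 N_A \hat{p}_j^{+} + 2 N_U \hat{p}_j^{-}\right) \\
    &= \frac{2 N_A}{N}\left[(N - N_A)\,\hat{p}_j^{+} - N_U\,\hat{p}_j^{-}\right]
     = \frac{2 N_A N_U\,(\hat{p}_j^{+} - \hat{p}_j^{-})}{N}.
\end{align*}
```

The last step uses $N - N_A = N_U$.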
Then, the vector $Z = CU$ is asymptotically distributed as a multivariate normal random vector with unit covariance matrix and mean $C\,E(U)$, where $E(U)$ is the mathematical expectation of the score vector, which can be written as follows:

$$E(U) = E(\{U_j\}_{j=1}^{L}) = \{2 N_A N_U (p_j^{+} - p_j^{-}) / N\}_{j=1}^{L}, \tag{2}$$

where $p_j^{+}$ and $p_j^{-}$ are the population MAF in cases and controls of the $j$th variant. If we denote as $R_j$ the relative risk of the $j$th SNP and assume low prevalence of the disease, it follows [1]:

$$p_j^{+} = \frac{R_j\, p_j^{-}}{1 + (R_j - 1)\, p_j^{-}}. \tag{3}$$
The score test statistic is the sum of squares of the elements of the vector $Z$. If we define the vector $\mu = \{\mu_j\}_{j=1}^{L} = E(Z) = C\,E(U)$, then the non-centrality parameter (NCP) of the score test statistic under the alternative hypothesis is:

$$\lambda = \sum_{j=1}^{L} \mu_j^{2}. \tag{4}$$
Under the null hypothesis of no variant being associated with the phenotype, which is equivalent to $R_j = 1$, $j = 1, \dots, L$, it follows from (3) that $p_j^{+} = p_j^{-}$, which by (2) implies $E(U_j) = 0$, $j = 1, \dots, L$, and $\mu = C\,E(U) = \{0\}_{j=1}^{L}$; thus, $\lambda = 0$.
Description of the assumptions for illustration of connection
between non-centrality parameter and effect size
From the considerations above, it can be seen that the NCP $\lambda$ is a function of the number of cases $N_A$ and controls $N_U$ in the study; the relative risk of each variant $R_j$, $j = 1, \dots, L$; the population MAF in controls $p_j^{-}$, $j = 1, \dots, L$; and the covariance matrix $V$ of the score test statistic (since the matrix $C = (A^{T})^{-1}$, where $V = A^{T} A$). To illustrate the dependence between the NCP and relative risk, let us assume independence of variants within the region, which implies that the matrix $V$ is diagonal. Thus, $V = \mathrm{diag}(\{v_j\}_{j=1}^{L})$, where $v_j = \mathrm{var}(g_j)$ is the variance of the $j$th SNP in our sample. It follows that $\mathrm{var}(g_j) = 2 N_U (1 - p_j^{-})\, p_j^{-} + 2 N_A (1 - p_j^{+})\, p_j^{+}$, which is the variance of the sum of two independent binomial random variables with the number of draws $2 N_U$ and $2 N_A$ and the probability of success $p_j^{-}$ and $p_j^{+}$, respectively. It follows:

$$C = \mathrm{diag}\left(1 \Big/ \sqrt{2 N_A (1 - p_j^{+})\, p_j^{+} + 2 N_U (1 - p_j^{-})\, p_j^{-}},\; j = 1, \dots, L\right). \tag{5}$$
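The relation between $V$, $A$ and $C$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `whitening_matrix` and the example matrices are hypothetical and are not taken from the article. It relies on the fact that `numpy.linalg.cholesky` returns a lower-triangular factor, so that $A$ is its transpose and $C = (A^{T})^{-1}$ is the inverse of that factor.

```python
import numpy as np

def whitening_matrix(v):
    """Return C = (A^T)^{-1} for a covariance matrix V = A^T A."""
    # numpy returns a lower-triangular factor with V = lower @ lower.T,
    # so A = lower.T and A^T = lower.
    lower = np.linalg.cholesky(v)
    return np.linalg.inv(lower)

# Diagonal V (independent variants): C reduces to diag(1 / sqrt(v_j)).
v_diag = np.diag([29.3, 14.8, 7.4])   # hypothetical variances
c_diag = whitening_matrix(v_diag)

# Non-diagonal V (correlated variants): C V C^T is still the identity.
v_corr = np.array([[2.0, 0.5],
                   [0.5, 1.0]])       # hypothetical covariance matrix
c_corr = whitening_matrix(v_corr)
identity_check = c_corr @ v_corr @ c_corr.T
```

In the diagonal case this reproduces the form of (5); the non-diagonal case is included only to show that the same construction gives unit covariance when variants are correlated.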
So, given the population MAF of causal variants in controls $p_j^{-}$, the relative risk of causal variants $R_j$, and the number of cases $N_A$ and controls $N_U$, we can calculate the corresponding NCP $\lambda$ according to the following algorithm:
1. calculate $p_j^{+}$ – population MAF in cases from (3)
2. calculate $E(U)$ – the expectation of the score vector $U$ from (2)
3. obtain matrix $C$ from (5)
4. calculate vector $\mu = \{\mu_j\}_{j=1}^{L} = C\,E(U)$
5. obtain NCP $\lambda$ from (4).
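The five steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: it assumes NumPy, independent variants (so that the matrix from (5) is diagonal and can be stored as a vector), and the function and argument names are hypothetical.

```python
import numpy as np

def ncp_score_test(maf_controls, relative_risk, n_cases, n_controls):
    """Non-centrality parameter of the score test under independence of variants.

    maf_controls  : population MAF in controls, one value per variant
    relative_risk : relative risk, one value per variant (1.0 = not causal)
    """
    p_minus = np.asarray(maf_controls, dtype=float)
    rr = np.asarray(relative_risk, dtype=float)
    n_total = n_cases + n_controls

    # Step 1: population MAF in cases, equation (3).
    p_plus = rr * p_minus / (1.0 + (rr - 1.0) * p_minus)

    # Step 2: expectation of the score vector, equation (2).
    expected_u = 2.0 * n_cases * n_controls * (p_plus - p_minus) / n_total

    # Step 3: diagonal of the matrix C, equation (5).
    variance = (2.0 * n_cases * (1.0 - p_plus) * p_plus
                + 2.0 * n_controls * (1.0 - p_minus) * p_minus)
    c_diag = 1.0 / np.sqrt(variance)

    # Step 4: mean of the standardized score vector.
    mu = c_diag * expected_u

    # Step 5: NCP as the sum of squared means, equation (4).
    return float(np.sum(mu ** 2))

# Example call: 10 causal variants, MAF 1% in controls, relative risk 2,
# 500 cases and 500 controls.
example_ncp = ncp_score_test([0.01] * 10, [2.0] * 10, 500, 500)
```

Variants with relative risk equal to 1 contribute zero to the sum, so only causal variants affect the result.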
For the purpose of illustration, let us assume $N_A = N_U = 500$ and that the population MAF and relative risk are the same for all causal variants. Additional File 7 depicts the non-centrality parameter (vertical
axis) as a function of relative risk (horizontal axis) and the number of causal variants (lines
within each panel). Population MAF of causal variants in controls was the following: Panel 1 –
1%, Panel 2 – 0.5%, Panel 3 – 0.25%, Panel 4 – 0.125%. As can be seen, the non-centrality
parameter monotonically increases with increasing relative risk, population MAF in controls and
the number of causal variants within a region.
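Under the assumptions of this illustration, the algorithm collapses to a single expression, sketched below. The symbols are introduced here for convenience and are assumptions of this sketch: $m$ is the number of causal variants, $p^{-}$ and $R$ are their common population MAF in controls and relative risk, and non-causal variants have relative risk 1 and therefore contribute nothing to the sum in (4).

```latex
\begin{align*}
p^{+} &= \frac{R\,p^{-}}{1 + (R - 1)\,p^{-}}, \\
\lambda &= m\,\frac{\left[\,2 N_A N_U (p^{+} - p^{-}) / N\,\right]^{2}}
                   {2 N_A (1 - p^{+})\,p^{+} + 2 N_U (1 - p^{-})\,p^{-}}, \\
\lambda\,\big|_{N_A = N_U = 500} &=
  \frac{250\,m\,(p^{+} - p^{-})^{2}}
       {(1 - p^{+})\,p^{+} + (1 - p^{-})\,p^{-}}.
\end{align*}
```

In this form the NCP is linear in the number of causal variants $m$ and vanishes at $R = 1$, where $p^{+} = p^{-}$.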
References
1. Sul JH, Han B, He D, Eskin E: An optimal weighted aggregated association test for
identification of rare variants involved in common diseases. Genetics 2011,
188(1):181-188.