Methods - Diabetes

Appendix
Methods
NPL regression analysis
The NPL regression approach is a conditional logistic regression analysis in which the
family-specific NPL statistics (e.g. NPLpairs) at one or more loci are the predictor
variables. Consider a sample of m independent pedigrees and a chromosomal region with
one or more markers and a locus of interest. Let $\delta_i$ be the pedigree-specific contribution
to the NPL statistic at the locus of interest. The likelihood function for a conditional
logistic regression with $\delta_i$ as a predictor is
$$\mathrm{Lik}(\beta; y_i, \delta_i) = \prod_{i=1}^{m} \frac{\exp\{y_i \delta_i \beta\}}{1 + \exp\{\delta_i \beta\}}.$$
Here, $y_i = 1$ for all $i$ and $\beta$ is the conditional logistic regression parameter. It can be
shown that the score test from this likelihood is asymptotically equivalent to Whittemore
and Halpern’s (Whittemore AS, Halpern J: A class of tests for linkage using affected
pedigree members. Biometrics 50:118-127, 1994) class of tests (23,24). Although
unaffected individuals can be used to help estimate the possible inheritance vectors for
a pedigree, an NPL regression analysis is an “affecteds only” analysis. The primary
advantage of the NPL regression approach is that it allows us to evaluate simultaneously,
either by joint or conditional hypothesis tests, the effects of multiple loci (i.e.
heterogeneity) and test for interactions among sets of loci (e.g. epistasis). In addition, the
NPL regression approach allows for tests of whether the magnitude of sharing at a locus
varies by environmental or other phenotypic factors (gene-phenotype interactions) by
testing interactions between the degree of sharing (IBD) at a locus and the environmental
or other phenotypic characteristics using a single measure for each pedigree (e.g. mean
BMI). For each pedigree, the model includes the pedigree’s NPL statistic at that locus, the mean
age at T2DM diagnosis or the mean BMI, and their statistical interaction.
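As an illustration of the single-locus case, the sketch below evaluates the log-likelihood with every outcome set to 1 and forms the score statistic at a regression parameter of zero. Differentiating the log-likelihood gives a score of one half the sum of the NPL contributions and an information of one quarter the sum of their squares, so the statistic reduces to the sum of the contributions divided by the square root of the sum of their squares. The function names and the input values are hypothetical, and this is only a minimal sketch of the algebra, not the software used for the analyses reported here.

```python
import math


def npl_log_likelihood(beta, npl):
    """Log-likelihood of the one-predictor model with y_i = 1 for every pedigree.

    `npl` holds one NPL contribution per pedigree (hypothetical input).
    """
    return sum(z * beta - math.log1p(math.exp(z * beta)) for z in npl)


def npl_score_statistic(npl):
    """Score statistic for testing beta = 0.

    Score at zero:       U(0) = (1/2) * sum(z_i)
    Information at zero: I(0) = (1/4) * sum(z_i ** 2)
    Statistic:           U(0) / sqrt(I(0)) = sum(z_i) / sqrt(sum(z_i ** 2))
    """
    score = 0.5 * sum(npl)
    info = 0.25 * sum(z * z for z in npl)
    return score / math.sqrt(info)


if __name__ == "__main__":
    example_npl = [1.0, 2.0, 2.0]  # made-up values, for illustration only
    print(npl_score_statistic(example_npl))
```

Large positive values of the statistic indicate excess allele sharing among affected relatives across pedigrees.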
Ordered subsets linkage analysis
If a subset of pedigrees that are phenotypically more homogeneous can be identified, it
might be possible to improve the power of our linkage analysis. Age of onset of T2DM
and BMI are two primary traits that may define phenotypically more homogeneous
subgroups of African Americans with T2DM. A series of ordered subset analyses (OSA)
(27-29) were computed to investigate the influence of a pedigree’s mean age at T2DM
diagnosis and mean BMI on linkage analyses. OSA ranks each
family by the family-level value of a covariate of interest (e.g. mean BMI) and identifies
the contiguous subset of families that maximizes the evidence for linkage. For example,
consider BMI. In the OSA the mean BMI values for each pedigree were ranked from
smallest to largest. The family with the smallest mean BMI entered into the analysis first and
the corresponding LOD score was computed on the target chromosome (e.g. chromosome
7) for that family. Next, a second linkage analysis on the target chromosome was
computed combining the two families with the two smallest mean BMI values. The ith
OSA step proceeds by computing a linkage analysis on the target chromosome using
the subset of families with the i smallest mean BMI values. This process is repeated until all
families have been added to the linkage analysis. The largest LOD score on the target
chromosome obtained across these subsets is taken as the LOD score of interest. Note
that the location that maximizes the LOD score on a chromosome will vary as the subset
of families analyzed changes. The statistical significance of the change in the LOD score
was evaluated by a permutation test under the null hypothesis that the ranking of the
covariate is independent of the family’s LOD score on the target chromosome. Thus, the
families were randomly permuted with respect to the covariate ranking and an analysis
proceeded as above for each permutation of these data. The resulting empirical
distribution of the change in the LOD scores yielded a chromosome-specific p-value
(p). In this example the family-level means were ranked in ascending order; however,
we also repeated the analysis ranking in descending order. The chromosome-wide p-value
(p) is for the specific analysis conducted, i.e. families ranked in increasing or decreasing
order, but not both.
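The ordered subset procedure and its permutation test can be sketched as follows. This is a simplified illustration under a stated assumption: each family's LOD score is available at a common grid of positions on the target chromosome, and LOD scores add across families at each position. All function names and inputs are hypothetical, and the sketch is not the OSA software used for the analyses.

```python
import random


def _best_prefix(family_lods, order):
    """Add families in the given order; return (largest peak LOD, subset size)."""
    running = [0.0] * len(family_lods[0])
    best_lod, best_k = float("-inf"), 0
    for k, fam in enumerate(order, start=1):
        # Assumption: per-family LOD scores are additive at each position.
        running = [r + x for r, x in zip(running, family_lods[fam])]
        peak = max(running)  # the peak position may move as families are added
        if peak > best_lod:
            best_lod, best_k = peak, k
    return best_lod, best_k


def osa_max_lod(family_lods, covariate, descending=False):
    """Rank families by the covariate and find the best ordered subset."""
    order = sorted(range(len(covariate)), key=lambda i: covariate[i],
                   reverse=descending)
    return _best_prefix(family_lods, order)


def osa_permutation_p(family_lods, covariate, n_perm=1000,
                      descending=False, seed=0):
    """Empirical p-value for the gain in LOD under random family orderings.

    The all-families LOD is the same in every permutation, so comparing the
    maximum subset LOD is equivalent to comparing the change in LOD.
    """
    rng = random.Random(seed)
    observed, _ = osa_max_lod(family_lods, covariate, descending)
    order = list(range(len(covariate)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        perm_lod, _ = _best_prefix(family_lods, order)
        if perm_lod >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up data: three families, two map positions, mean BMI per family.
    lods = [[1.0, 0.5], [0.5, 1.0], [-2.0, -1.0]]
    mean_bmi = [22.0, 25.0, 31.0]
    print(osa_max_lod(lods, mean_bmi))
    print(osa_permutation_p(lods, mean_bmi, n_perm=200))
```

Calling `osa_max_lod` with `descending=True` gives the analysis with families ranked in decreasing order; as in the text, each direction yields its own p-value.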