Supplemental Methods for the article:
CHEK2*1100delC homozygosity in the Netherlands –
prevalence and risk of breast and lung cancer
Petra E.A. Huijts1, Antoinette Hollestelle2, Brunilda Balliu3, Jeanine J.
Houwing-Duistermaat3, Caro M. Meijers1, Jannet C. Blom2, Bahar Ozturk2,
Elly M.M. Krol-Warmerdam4, Juul Wijnen5, Els M.J.J. Berns2, John W.M.
Martens2, Caroline Seynaeve2, Lambertus A. Kiemeney6, Henricus F. van
der Heijden7, Rob A.E.M. Tollenaar4, Peter Devilee1, Christi J. van
Asperen5.
1 Department of Human Genetics, Leiden University Medical Center, Leiden
2 Department of Medical Oncology, Erasmus University Medical Center, Rotterdam
3 Department of Medical Statistics and Bioinformatics, Leiden University Medical
Center, Leiden
4 Department of Surgery, Leiden University Medical Center, Leiden
5 Department of Clinical Genetics, Leiden University Medical Center, Leiden
6 Department of Epidemiology, Biostatistics & HTA, Radboud University Medical
Center, Nijmegen, The Netherlands
7 Department of Pulmonary Diseases, Radboud University Medical Center, Nijmegen,
The Netherlands
Corresponding author: C.J. van Asperen, MD, PhD
Postal address: Department of Clinical Genetics, Postbus 9600, 2300 RC Leiden, the
Netherlands.
E-mail address: asperen@lumc.nl
Telephone: 003171 526 6090
Fax: 003171 526 6749
Statistics
The program R 2.14.1 was used to conduct the statistical analysis.
To use the information from all available studies, we used a combined
approach [1]. We used a joint likelihood (Lc) to model the cases (the sporadic
breast cancer cohort) and the controls from the blood bank, meaning that we jointly
model the disease status (Y) and genotypes (Gc) of individuals. Lc consists of two parts:
the disease probability of an individual conditional on their genotype, P(Y=1|Gc), and the
distribution of the genotypes, P(Gc). Thus Lc = P(Y=1|Gc) × P(Gc).
The probability of disease P(Y=1|Gc) for each individual is modeled using logistic
regression. From this part the ORs for homozygous and heterozygous individuals are
estimated. The genotype distribution P(Gc) is modeled assuming Hardy-Weinberg
equilibrium. To protect against deviations from Hardy-Weinberg equilibrium, we use the
disease prevalence p=5% to put different weights on the genotype distributions of cases
and controls. To test the robustness of our method to the assumed prevalence, we
repeated the analysis for prevalence values between 1% and 10% and obtained similar
results.
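The two parts of Lc described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis (which was run in R): the function names, the coding of genotypes as 0, 1 or 2 copies of the variant allele, and the parameter names beta0, beta_het and beta_hom are all hypothetical. The sketch also writes the disease term for both outcomes (Y=0 and Y=1) and leaves out the prevalence-based weighting, because the text does not give its exact form.

```python
import math

def hwe_genotype_probs(q):
    """P(G = 0, 1, 2 variant alleles) under Hardy-Weinberg equilibrium
    for a variant allele frequency q."""
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]

def disease_prob(g, beta0, beta_het, beta_hom):
    """Logistic model for P(Y=1 | G=g) with separate log odds ratios
    for heterozygotes (g == 1) and homozygotes (g == 2)."""
    eta = beta0
    if g == 1:
        eta += beta_het
    elif g == 2:
        eta += beta_hom
    return 1.0 / (1.0 + math.exp(-eta))

def joint_prob(y, g, q, beta0, beta_het, beta_hom):
    """Joint probability P(Y=y, G=g) = P(Y=y | G=g) * P(G=g)."""
    p_disease = disease_prob(g, beta0, beta_het, beta_hom)
    p_y = p_disease if y == 1 else 1.0 - p_disease
    return p_y * hwe_genotype_probs(q)[g]
```

In this parameterization exp(beta_het) and exp(beta_hom) are the odds ratios for heterozygotes and homozygotes relative to non-carriers, and the joint probabilities over all six (Y, G) combinations sum to one.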
To obtain efficient estimates of the allele frequency and of the genotypic relative risks,
we also included in a second model (Lu) the individuals with unknown disease status:
individuals from the CF cohort and individuals from families with a hereditary disease
unrelated to cancer who were not affected by the disease running in their family. Because
these individuals were collected as controls for studies other than breast cancer, we have
no exact information about their breast cancer status. We therefore cannot treat them as
controls or use them to estimate the OR, but we can use the information they carry about
the allele frequency. This model consists only of the genotype distribution P(Gu), where
Gu are the genotypes of these individuals, and we model it assuming Hardy-Weinberg
equilibrium.
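As a minimal sketch of a genotype-only likelihood such as Lu, the Hardy-Weinberg log-likelihood for a set of genotype counts can be written as follows. The function names and the genotype counts in the usage note are hypothetical and chosen only for illustration; the constant multinomial coefficient is omitted because it does not depend on the allele frequency.

```python
import math

def hwe_log_likelihood(q, counts):
    """Log-likelihood of genotype counts (n0, n1, n2) for 0, 1 and 2 copies
    of the variant allele, assuming Hardy-Weinberg equilibrium with
    allele frequency q (0 < q < 1)."""
    n0, n1, n2 = counts
    p = 1.0 - q
    return (n0 * math.log(p * p)
            + n1 * math.log(2.0 * p * q)
            + n2 * math.log(q * q))

def allele_freq_mle(counts):
    """Closed-form maximum likelihood estimate of q under Hardy-Weinberg
    equilibrium: variant alleles divided by total alleles."""
    n0, n1, n2 = counts
    return (n1 + 2.0 * n2) / (2.0 * (n0 + n1 + n2))
```

For example, with hypothetical counts of 985 non-carriers, 14 heterozygotes and 1 homozygote, allele_freq_mle((985, 14, 1)) gives 16 variant alleles out of 2000, and hwe_log_likelihood is maximized at that value. In the combined model described in the text the allele frequency is shared with Lc rather than estimated from these individuals alone.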
The two models (Lc and Lu) are then combined to obtain a common model for all the
data sets together. This method yields efficient estimates of all parameters and also
enables estimation of the OR for homozygotes by modeling the number of homozygotes
through the allele frequency.
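To illustrate why the allele frequency carries information about homozygotes, the sketch below computes the number of homozygotes expected under Hardy-Weinberg equilibrium. The function name and the numbers in the usage note are hypothetical and are not taken from the study data.

```python
def expected_homozygotes(n_individuals, q):
    """Expected number of homozygous carriers among n_individuals under
    Hardy-Weinberg equilibrium with variant allele frequency q."""
    return n_individuals * q * q
```

With a hypothetical allele frequency of 0.005, a sample of 10,000 individuals would be expected to contain about 0.25 homozygotes. Homozygotes are therefore too rare to count reliably in the controls, while the expected count follows directly from a well-estimated allele frequency.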
Reference List
1. Brunilda Balliu, Roula Tsonaka, Diane van der Woude, Stefan Boehringer, Jeanine J.
Houwing-Duistermaat. Combining Family and Twin Data in Association Studies to
Estimate the Noninherited Maternal Antigens Effect. Genet. Epidemiol. 2012. DOI: 10.1002/gepi.21667