
Dear Prof Jörg Epplen,

We would like to thank you and the reviewers for the constructive criticism of our manuscript. We have implemented changes according to the reviewers’ comments and feel that our manuscript has improved. Below we provide a detailed description of the changes we have made in accordance with the reviewers’ suggestions. We have also highlighted all major changes in the manuscript.

Reviewer 1:

We would like to thank the reviewer for their suggestions. We have made changes to our manuscript in accordance with their suggestions. More specifically:

1) There are a few scattered typos: We have corrected the typos.

2) Please add information on the controls including chronic diseases. Table 1 is missing: We apologize for not including Table 1. We have included it with the revised manuscript. Table 1a includes demographic information on cases and controls. Table 1b includes the genotype characteristics of cases and controls. Cases and controls were not matched for the presence of chronic diseases. We have added this in our methods section. Furthermore, in our discussion section we have added this statement: “Information on chronic diseases was not recorded for cases and controls, with the possibility of discrepancy between the two groups. However, FTO has only been shown to play a role in obesity, and we controlled for BMI. Therefore, such a discrepancy would likely not affect our findings.”

3) Elaborate on how you think these SNPs increase breast cancer risk via a demethylation process: Recent data have shown that increased FTO expression results in increased food intake, leading to increased adiposity. Thus, a gain-of-function effect is suggested for the implicated human allele. Overexpression of Fto caused a dose-dependent increase in body weight and fat mass [1]. This has been included in the discussion section of our manuscript.

Reviewer 2:

1) Table 1 is absent: We have added Table 1. We apologize for the omission.

2) Provide the LD patterns among the four SNPs: We calculated LD among the four SNPs and also extracted LD values from HapMap. The results are shown in Table S1. The LD values obtained from our data were comparable with those from the HapMap project. We found that these SNPs were in strong LD for Caucasian and Asian samples but had reduced LD for Black samples. Due to the strong LD among the SNPs, traditional multiple logistic regression models cannot be used. Thus, we employed our Bayesian hierarchical logistic models, which can accommodate multiple SNPs with strong LD in the analysis.

3) The manuscript does not present standard association analyses for the four SNPs individually: We performed traditional logistic regression for each SNP separately and presented the corresponding results in Table 2b. The analysis confirmed that rs1477196 was significantly associated with the risk of breast cancer through the additive effect after Bonferroni correction for multiple testing.

4) The results shown in Table 3 are not convincing without further information on the correlation patterns among the predictor variables in logistic regression analyses: This comment is related to the issue of identifiability for logistic (and other generalized linear) models. There are several reasons that a classical logistic regression can be non-identifiable (that is, have parameters that cannot be estimated from the available data, or estimates that are unstable): (i) collinearity among predictors, (ii) separation, which arises when a predictor or a linear combination of predictors is completely aligned with the outcome, (iii) many predictors, and (iv) low frequencies for categorical predictors.
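
The separation problem named above can be illustrated with a minimal sketch. It is not the model fitted in the manuscript: the toy data, the function names, and the Cauchy prior with scale 2.5 are assumptions chosen only for illustration, in the spirit of the weakly informative priors discussed in [3]. With a predictor that perfectly separates cases from controls, the log-likelihood keeps increasing as the coefficient grows, so the maximum likelihood estimate is not finite; adding a log-prior term gives a maximum at a finite value.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total

def log_cauchy_prior(beta, scale=2.5):
    """Log density of a Cauchy(0, scale) prior, up to an additive constant.
    The scale of 2.5 is an illustrative assumption."""
    return -math.log1p((beta / scale) ** 2)

# Hypothetical toy data: the predictor perfectly separates cases (1) from controls (0).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Candidate coefficient values from 0.00 to 10.00.
grid = [i / 100 for i in range(0, 1001)]

# Without a prior, the best value is always the largest one offered.
beta_ml = max(grid, key=lambda b: log_likelihood(b, xs, ys))

# With the prior, the maximizer lies in the interior of the grid.
beta_map = max(grid, key=lambda b: log_likelihood(b, xs, ys) + log_cauchy_prior(b))

print("grid maximizer of the likelihood:", beta_ml)
print("grid maximizer of the penalized likelihood:", beta_map)
```

Extending the grid moves the unpenalized maximizer to the new upper bound, whereas the penalized maximizer stays where it is; this is the sense in which the prior makes the estimate well defined.
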
These problems can occur in multiple-SNP and interaction analyses in genetic association studies, because a) SNPs are usually in linkage disequilibrium, introducing correlated variables; b) SNP data often include genotypes with low frequencies that create predictors with little variation, especially for interaction terms; and c) because SNP data are discrete, separation can be a serious problem in case-control association studies. Standard methods for overcoming these problems are penalized likelihood regression and Bayesian modeling. The key to these approaches is the use of a penalty on model complexity (in the penalized likelihood framework) or continuous prior distributions for genetic effects (in the Bayesian framework) that constrain coefficients to lie in a reasonable range [2,3]. The Bayesian hierarchical models (or penalized regressions) are identified, and thus the resulting estimate is well defined and has finite variance, even if the original data have problems that would result in non-identifiability of the maximum likelihood estimate. The Bayesian method of Gelman et al. [3] was developed specifically to deal with the above problem and can be used in routine data analysis [2]. In this manuscript, we used the Bayesian method described in Yi and Banerjee [4] and Yi et al. [5] to perform multiple-SNP and interaction analyses. The method of Yi and Banerjee [4] and Yi et al. [5] is very similar to that of Gelman et al. [3], and thus results in well-identified models and produces stable estimates of coefficients. By contrast, we found that for our data the multiple-SNP epistatic models are non-identifiable under classical logistic regression.

5) Please explain the ROC results here in the context of other studies: In the paper by Wacholder et al. [6], a genetic model that included 10 SNPs found in GWAS to be associated with breast cancer risk produced an AUC of 58.9%. In that study the SNP-only model predicted risk slightly better than the Gail model.
Another study by Mealiffe et al. [7] also confirmed that a SNP-only model produced a higher AUC (58.7%) compared with the Gail model (55.7%). The AUCs in these studies are somewhat lower than the AUC obtained in our study. The major reasons for this improvement in prediction are: 1) we used a Bayesian hierarchical model to fit the data and to predict the disease risk, and, as described in Gelman et al. [3], a Bayesian hierarchical model can improve prediction accuracy; 2) the previous studies did not include interactions in the predictive models. If genetic interactions are present, adding these interactions to a predictive model can increase the accuracy of prediction [8-11]. However, we do recognize that this is a single study and we would need to validate our results before we can be certain of the magnitude of our findings. The above has been added in the discussion section of our manuscript.

6) Please discuss the control group response rate of 25%: Healthy volunteers were approached in outpatient clinics. It is not uncommon for most of the individuals approached to decline participation in a genetic study. However, the controls used in this study were selected from a pool of over 5000 individuals. This increases the validity of our study.

7) Address the multiple testing issues from Table 2: We calculated the adjusted p-values in Table 2 using the Bonferroni correction for multiple testing. For both the multiple-SNP and single-SNP analyses using logistic regression, we showed that rs1477196 was significantly associated with the risk of breast cancer through the additive effect after the Bonferroni correction at the nominal level of 0.05.

Reference List

1. Church C, Moir L, McMurray F, Girard C, Banks GT, Teboul L et al.: Overexpression of Fto leads to increased food intake and results in obesity. Nat Genet 2010, 42: 1086-1092.
2. Gelman A, Hill J: Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, New York; 2007.
3. Gelman A, Jakulin A, Pittau MG, Su YS: A weakly informative default prior distribution for logistic and other regression models. Annals of Applied Statistics 2008, 2: 1360-1383.
4. Yi N, Banerjee S: Hierarchical generalized linear models for multiple quantitative trait locus mapping. Genetics 2009, 181: 1101-1113.
5. Yi N, Kaklamani VG, Pasche B: Bayesian analysis of genetic interactions in case-control studies, with application to adiponectin genes and colorectal cancer risk. Annals of Human Genetics 2010 (in press).
6. Wacholder S, Hartge P, Prentice R, Garcia-Closas M, Feigelson HS, Diver WR et al.: Performance of common genetic variants in breast-cancer risk models. N Engl J Med 2010, 362: 986-993.
7. Mealiffe ME, Stokowski RP, Rhees BK, Prentice RL, Pettinger M, Hinds DA: Assessment of clinical validity of a breast cancer risk model combining genetic and clinical information. J Natl Cancer Inst 2010, 102: 1618-1627.
8. Yi N, Kaklamani VG, Pasche B: Bayesian analysis of genetic interactions in case-control studies, with application to adiponectin genes and colorectal cancer risk. Ann Hum Genet 2011, 75: 90-104.
9. Clark AG: Limits to prediction of phenotype from knowledge of genotypes. In Limits to knowledge in evolutionary genetics. Edited by M Clegg et al. Kluwer Academic/Plenum Publishers, New York; 2000:205-224.
10. Moore JH, Williams SM: Epistasis and its implications for personal genetics. Am J Hum Genet 2009, 85: 309-320.
11. Yi N: Statistical analysis of genetic interactions. Genetics Research 2011 (in press).
