Association of FGFR2 gene polymorphisms with the risk of breast cancer in population of West Siberia

Uljana A. Boyarskikh¹, Natalja A. Zarubina², Julia A. Biltueva¹, Tatjana V. Sinkina², Elena N. Voronina¹, Aleksander F. Lazarev², Valentina D. Petrova², Yurii S. Aulchenko³, Maxim L. Filipenko¹*

¹ Institute of Chemical Biology and Fundamental Medicine, 630090 Novosibirsk, Russia
² Altai Branch of Russian N.N. Blokhin Cancer Research Centre, 656049 Barnaul, Russia
³ Erasmus MC Rotterdam, 3000 CA Rotterdam, The Netherlands

Supplementary Material and Methods

Study participants

The case group included 766 women and the control group included 665 women. All the women signed an informed consent form to participate in the study. The study protocol was approved by the Medical Ethics Committee of the Altai Oncological Centre. The case group included patients with a histologically verified diagnosis of breast cancer. As we focused on sporadic breast cancer, the following cases were not included in the study: those who had more than two first-degree relatives with breast and/or ovarian cancer; those with age at onset before 40 years; those with both breast and ovarian cancer; those with bilateral breast cancer; and those with the 5382insC, 300A/C, 4153delA or 185delAG mutations in the BRCA1 gene. The control group included women randomly selected from a list of blood bank donors. The case and control groups were age-matched, with an average age of 54.7 ± 8.5 and 54 ± 12.5 years, respectively. All the women were ethnic Russians residing in Altai Krai, the Russian Federation. The groups were formed as part of the epidemiological studies carried out by the Altai Branch of Russian N.N. Blokhin Cancer Research Centre and the Altai Oncological Centre.

SNP selection

Our choice of SNPs was based on the results of two independent studies [1,2] that reported genome-wide screening of polymorphic loci associated with breast cancer. We chose seven SNPs.
Three of these SNPs (rs3135718, rs2981582 and rs7895676) were in strong linkage disequilibrium and demonstrated association with breast cancer in the study of Easton et al. [1]. To study whether the LD block in the Russian population is the same as in other European populations, the other four SNPs were chosen to flank the European LD block at the 5’- and 3’-ends. In the 5’-region, we selected rs4647913, which is located in intron 1 and is likely to reside outside of the LD block of intron 2, and rs3135715, which is located in intron 1 and is in linkage disequilibrium with rs2981582 (Supplementary Table 1B). In the 3’-region, we selected rs2981428 and rs10510097. These SNPs are located in intron 2 and are not contained in the LD block. Another reason to choose rs10510097 was that a statistically significant association with breast cancer was found for this SNP at one of the stages of the study conducted by Easton and co-workers [1].

DNA isolation and genotyping

Venous blood samples were taken from all study participants. DNA was prepared from whole blood samples using a standard method, which included separation and lysis of the nuclear cells, treatment with proteinase K, extraction with phenol-chloroform and ethanol precipitation. DNA concentrations were determined using a Fluorescent DNA Quantitation Kit (Bio-Rad, USA). The DNA samples were genotyped for SNPs in introns 1 and 2 of the FGFR2 gene (rs4647913, rs3135715, rs3135718, rs2981582, rs7895676, rs2981428, rs10510097).

Two SNPs, rs7895676 and rs2981428, were genotyped in part by allele-specific real-time PCR and in part by direct sequencing. Allele-specific amplification was performed in a volume of 25 μL; the PCR mix included 300 nM primers, 10 mM Tris-HCl (pH 8.9), 55 mM KCl, 2.5 mM MgCl2, 0.05% Tween-20, 0.2 mM dNTP, SYBR Green I (1:25000), 20-100 ng of DNA and 0.5 activity units of Taq DNA polymerase (hot start, no 5’-nuclease activity, Biosan, ICBFM). Each sample was amplified with two pairs of primers.
Each pair contained one common primer and one primer complementary to the sequence of the corresponding allele. Genotypes were called from the ΔCt value between the amplification reactions with the two primer pairs. DNA sequencing was performed using a BigDye® Terminator Kit v1.19 (Applied Biosystems, USA) according to the manufacturer's protocol.

The remaining SNPs in the FGFR2 gene (rs4647913, rs3135715, rs3135718, rs2981582, rs10510097) were genotyped by real-time PCR using TaqMan probes corresponding to the polymorphic DNA sequence. Amplification was performed in a volume of 25 μL; the PCR mix included 300 nM primers, 100 nM TaqMan probes, 65 mM Tris-HCl (pH 8.9), 16 mM (NH4)2SO4, 2.5 mM MgCl2, 0.05% Tween-20, 0.2 mM dNTP, 20-100 ng of DNA and 0.5 activity units of Taq DNA polymerase (hot start, Biosan, ICBFM). The sequences of the oligonucleotide primers and TaqMan probes used are provided in Supplementary Table 4. To check the SNP typing results for reproducibility, about 25% of the DNA samples were assayed twice, with no discrepancy found at any point. The percentage of samples for which PCR failed varied from 0.9% to 4.1% depending on the SNP.

Statistical analysis

All statistical analyses were performed using the free R software, version 2.6.0 (http://www.r-project.org), and the associated libraries GenABEL [3], haplo.stats [4], genetics and rmeta. Deviation of the genotypic distribution from Hardy-Weinberg equilibrium was tested with an exact test [5] in the total population of cases and controls. Linkage disequilibrium (Lewontin’s D’ and conventional r²) was estimated using the “genetics” library for R. Odds ratios (OR) and 95% confidence intervals for the association between the FGFR2 SNPs and breast cancer were estimated using logistic regression. To determine the inheritance model for SNPs rs3135718, rs2981582 and rs7895676, the additive, dominant and recessive models were tested against the general genotypic model.
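The inheritance models named above differ only in how a genotype enters the regression as a predictor. The following is a minimal Python sketch of that coding, not the study's R code; the function name `encode_genotype` and the C/T allele letters are illustrative assumptions.

```python
def encode_genotype(genotype: str, risk_allele: str) -> dict:
    """Code one genotype (e.g. 'CT') under four inheritance models.

    Hypothetical helper for illustration; alleles are single letters.
    """
    if len(genotype) != 2:
        raise ValueError("genotype must be two allele letters, e.g. 'CT'")
    n_risk = genotype.count(risk_allele)  # 0, 1 or 2 copies of the risk allele
    return {
        # additive: effect grows linearly with the number of risk alleles
        "additive": n_risk,
        # dominant: one copy of the risk allele is enough
        "dominant": int(n_risk >= 1),
        # recessive: two copies are required
        "recessive": int(n_risk == 2),
        # general (genotypic): separate indicators for heterozygotes
        # and risk-allele homozygotes, i.e. two free parameters
        "general": (int(n_risk == 1), int(n_risk == 2)),
    }


if __name__ == "__main__":
    for g in ("CC", "CT", "TT"):
        print(g, encode_genotype(g, "T"))
```

Under this coding the general model estimates two genotype effects and each restricted model estimates one, so each comparison of a restricted model against the general model has one degree of freedom.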
Testing was done using the likelihood ratio test (LRT): twice the difference between the maximum log-likelihoods of the general model and a nested model follows a chi-square distribution with the number of degrees of freedom equal to the difference in the number of parameters estimated under the general and the nested model. Additionally, to choose the model providing the best fit to the data, we applied Akaike’s information criterion, defined as AIC = 2(k - ln L), where k is the number of parameters estimated in the model and L is the maximum likelihood. The model with the minimal AIC value provides the best fit.

We tested whether all three SNPs (rs3135718, rs2981582, rs7895676) contribute to breast cancer susceptibility. For that purpose we used a multivariable logistic regression model including the genotypes of the three SNPs as independent variables. Models including single SNPs or pairs of SNPs were contrasted with the general model including all three SNPs. Significance testing was performed using the LRT, and the best-fitting model was selected using AIC.

Haplotype frequencies and their ORs were estimated using the haplo.score and haplo.glm functions of the haplo.stats package for R [4].

Meta-analysis of log(OR) was performed using the inverse variance method, with study weights set to the reciprocal of the square of the standard error. The significance of differences between log(OR) values was estimated using a Z-test. We tested whether the odds ratio for rs2981582 [T] observed in the population of West Siberia deviates significantly from the estimates obtained elsewhere. Easton and colleagues [1] report OR = 1.27 (p = 8 × 10^-16) from the meta-analysis of European populations and OR = 1.19 (p = 2 × 10^-6) from the meta-analysis of Asian populations; additionally, Liang and colleagues [6] have recently reported OR = 1.25 (p = 0.0009) in a Chinese population. Combining the results of Easton and Liang on Asian samples yields an estimate of OR = 1.2 (p = 8 × 10^-9).
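The inverse variance pooling and the Z-test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the standard errors are not given in the text, so they are back-calculated here from the rounded ORs and two-sided p-values, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_or_and_p(odds_ratio: float, p_two_sided: float) -> float:
    """Assumption: recover SE(log OR) from a reported OR and two-sided p."""
    z = -NormalDist().inv_cdf(p_two_sided / 2.0)  # positive z for this p
    return abs(math.log(odds_ratio)) / z


def two_sided_p(z: float) -> float:
    """Two-sided normal p-value; erfc keeps precision for large |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_variance_pool(estimates):
    """Pool (log_or, se) pairs with weights 1 / se**2."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def z_test_difference(b1, se1, b2, se2):
    """Z-test for the difference between two independent log(OR)s."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return z, two_sided_p(z)


# Asian estimates quoted in the text: (OR, two-sided p)
asian = [(1.19, 2e-6), (1.25, 0.0009)]
estimates = [(math.log(o), se_from_or_and_p(o, p)) for o, p in asian]

pooled_log_or, pooled_se = inverse_variance_pool(estimates)
pooled_or = math.exp(pooled_log_or)
pooled_p = two_sided_p(pooled_log_or / pooled_se)
print(f"pooled OR = {pooled_or:.2f}, p = {pooled_p:.1e}")

# Illustration only: compare the two Asian estimates with each other
z, p_diff = z_test_difference(*estimates[0], *estimates[1])
print(f"difference between the two Asian log(OR)s: z = {z:.2f}, p = {p_diff:.2f}")
```

With these back-calculated standard errors the pooled estimate comes out close to the combined Asian value quoted above; small deviations are expected because the inputs are rounded.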
The OR obtained in this study is significantly (p = 0.02) higher than that observed in the Asian populations, while it is not significantly different from that obtained in the European samples (p = 0.1).

References

1. Easton DF, Pooley KA, Dunning AM et al: Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 2007; 447: 1087-1093.
2. Hunter DJ, Kraft P, Jacobs KB et al: A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet 2007; 39: 870-874.
3. Aulchenko YS, Ripke S, Isaacs A, van Duijn CM: GenABEL, an R library for genome-wide association analysis. Bioinformatics 2007; 23(10): 1294-1296.
4. Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA: Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet 2002; 70(2): 425-434.
5. Wigginton JE, Cutler DJ, Abecasis GR: A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet 2005; 76(5): 887-893.
6. Liang J, Chen P, Hu Z et al: Genetic variants in fibroblast growth factor receptor 2 (FGFR2) contribute to susceptibility of breast cancer in Chinese women. Carcinogenesis 2008; 20(12): 2341-2346.