Supplementary Methods (online data)
1. Criteria or definitions of COPD and lung cancer
Lung function was tested with an EasyOne Spirometer (ndd Medizintechnik AG, Switzerland).
Subjects with a ratio of forced expiratory volume in one second (FEV1) to forced vital capacity
(FVC) of <70% after inhalation of 400 μg salbutamol and with chronic airway symptoms, such as
chronic cough, dyspnea, sputum production or wheezing, were identified as COPD cases. Lung
cancer was diagnosed according to standard clinical criteria with pathologic confirmation. Main
analyses included all primary lung cancer cases regardless of histological type. Lung cancer
patients or control subjects in whom a physician had diagnosed COPD by spirometry at least one
year before lung cancer diagnosis or control recruitment were defined as having potential
pre-existing COPD.
2. Pulmonary function test
Pulmonary function tests were performed by trained technicians with an EasyOne Spirometer
(ndd Medizintechnik AG, Switzerland) according to the criteria recommended by the American
Thoracic Society [1] and the European Respiratory Society [2]. For each individual, the largest
pre-bronchodilator FEV1 and FVC were chosen from at least three acceptable and two
reproducible measurements that met the criteria. For participants whose pre-bronchodilator
FEV1/FVC was <70%, post-bronchodilator spirometry was performed 20 minutes after inhalation
of 400 μg of salbutamol (Ventolin, GlaxoSmithKline) via a 500 ml spacer. Furthermore, the
predicted values of FEV1 were estimated as previously reported [3].
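The spirometric rules in sections 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function names, the use of litres as the unit and the example readings are assumptions, and the acceptability and reproducibility checks of the ATS/ERS criteria are not modelled.

```python
def best_effort(values):
    """Return the largest reading among the acceptable manoeuvres (litres)."""
    if len(values) < 3:
        raise ValueError("at least three acceptable measurements are required")
    return max(values)


def needs_post_bronchodilator_test(pre_fev1_trials, pre_fvc_trials, threshold=0.70):
    """True when the pre-bronchodilator FEV1/FVC ratio is below 70%."""
    ratio = best_effort(pre_fev1_trials) / best_effort(pre_fvc_trials)
    return ratio < threshold


def is_copd_case(post_fev1, post_fvc, has_chronic_symptoms, threshold=0.70):
    """COPD case: post-bronchodilator FEV1/FVC <70% plus chronic airway symptoms."""
    return (post_fev1 / post_fvc) < threshold and has_chronic_symptoms


# Hypothetical readings for one participant (litres)
if needs_post_bronchodilator_test([2.0, 2.1, 2.05], [3.2, 3.3, 3.25]):
    print(is_copd_case(post_fev1=2.2, post_fvc=3.3, has_chronic_symptoms=True))
```

Note that the largest FEV1 and the largest FVC are selected independently, so they may come from different manoeuvres.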
3. COPD sample collection
For the COPD study, 697 southern Chinese COPD cases were recruited from three communities
(Liwan, Xicun and Zhanqian) in Guangzhou city through cross-sectional surveys of COPD
conducted between February 2002 and June 2008, with a response rate of about 81%; 328 cases
were recruited from municipal hospitals in Guangzhou (the Guangzhou Chest Hospital, the Third
Affiliated Hospital of Guangzhou Medical University and the Third Affiliated Hospital of Sun
Yat-sen University) between December 2007 and June 2010, with a 91% response rate. The 1061
controls with normal lung function (pre-FEV1/FVC >70%) were randomly selected from 6000
individuals participating in the community cross-sectional survey of COPD. The 766 eastern
Chinese COPD cases were recruited from the Second Affiliated Hospital of Soochow University
(Suzhou) from July 2009 to June 2011, with a response rate of 83%. The 879 controls with normal
pulmonary function were randomly selected from a database of 3,500 individuals who had
undergone a physical examination, with an 81% response rate. All individuals were Han Chinese,
and none had received a blood transfusion in the preceding six months. The lung function of
subjects recruited from the Guangzhou communities has been followed up for years, and 427
subjects completed the full four-year follow-up with annual spirometry.
4. Lung cancer sample collection
The lung cancer cases and controls have been described previously [4-6]. In brief, 1056
histopathologically confirmed cases of primary lung cancer were recruited at four urban
hospitals and one suburban hospital in Guangzhou city from March 2007 to March 2009, with an
overall response rate of about 95%. A total of 1056 cancer-free controls, frequency-matched to
the cases on age (±1 year) and sex, were randomly selected from about 10,000 participants in
health check-up programs conducted in Guangzhou city during the same period, with a response
rate of 84%. An eastern Chinese population was used as a validation set. It comprised 1016
newly diagnosed lung cancer patients who were consecutively recruited from March 2008 to May
2012 at the First Affiliated Hospital of Soochow University (Suzhou), with a response rate of
87%, and 1021 controls, frequency-matched on age (±5 years) and sex, who were selected from a
database of 3500 individuals who had undergone a physical examination, with an 81% response
rate. All lung cancer patients were followed up every 3 months by telephone until the end of the
current study. The date of death or the survival status was obtained from medical records or
from patients’ families. After exclusion of subjects lost to follow-up, 570 patients from the
southern Chinese population and 569 from the eastern Chinese population with complete survival
data were used for the analysis of lung cancer survival.
5. Definitions of co-variables
Participants who had smoked <100 cigarettes in their lifetime were defined as never smokers;
all others were classified as ever smokers. Pack-years smoked were classified into three
categories: 0, >0-20 and >20. Biomass fuel use was defined as belonging to a family that mainly
used firewood for cooking and heating.
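The covariate definitions above translate directly into code. The following is a minimal sketch with hypothetical function names; the pack-year formula (packs per day multiplied by years smoked) is the conventional definition and is an assumption here, since the source does not spell it out.

```python
def smoking_status(lifetime_cigarettes):
    """Never smoker: fewer than 100 cigarettes in a lifetime; otherwise ever smoker."""
    return "never" if lifetime_cigarettes < 100 else "ever"


def pack_years(packs_per_day, years_smoked):
    """Conventional pack-year definition (assumed, not stated in the source)."""
    return packs_per_day * years_smoked


def pack_year_category(value):
    """Map pack-years onto the three categories used in the analysis."""
    if value == 0:
        return "0"
    if value <= 20:
        return ">0-20"
    return ">20"


# Hypothetical participant: one pack per day for 25 years
print(smoking_status(lifetime_cigarettes=182500), pack_year_category(pack_years(1, 25)))
```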
6. TaqMan qPCR method
We genotyped CHRNA7 CNV-3956 in all 7880 subjects using the TaqMan qPCR method according to
the instructions of Applied Biosystems: in 10 µl reaction systems, triplicate 10-ng DNA samples
from each individual were evaluated using TaqMan PCR Master Mix (Applied Biosystems, Foster
City, CA, USA) along with the control RNase P probe (VIC labeled, Applied Biosystems) and the
experimental probe (FAM labeled, cat# Hs00059157, specific for CNV-3956, Applied Biosystems).
Wells containing TaqMan PCR Master Mix and probes but no genomic DNA were defined as zero copy
number. Three genomic DNA samples with two copies per diploid genome at the CNV-3956 locus, as
determined by the Accucopy assay, were selected as standard samples. Samples were run on an ABI
7900HT Fast Real-Time PCR System (Applied Biosystems) using Sequence Detection Software
(SDS, version 2.3, Applied Biosystems). Reactions were held at 95 °C for 10 min and then cycled
40 times through 95 °C for 15 s and 60 °C for 1 min. The genotypes were automatically
determined by the software CopyCaller 2.0 (Applied Biosystems) (Supplementary Figure S2A).
For quality control, triplicates were accepted when the square of the Pearson correlation
coefficient (R²) for copy number was >95%. Case and control samples were mixed and run together
on each plate. A 10% random sample of cases or controls was tested twice by different
investigators.
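The source does not describe how CopyCaller converts the fluorescence data into a copy number. As a rough illustration only, the comparative Ct (ΔΔCt) calculation commonly used for duplex TaqMan copy number assays can be sketched as follows. The function names and Ct values are hypothetical, the calibrator stands in for the two-copy standard samples, and the statistical refinements applied by the actual software are not reproduced.

```python
from statistics import mean


def delta_ct(fam_cts, vic_cts):
    """Mean target (FAM) Ct minus mean reference (VIC, RNase P) Ct over replicates."""
    return mean(fam_cts) - mean(vic_cts)


def estimate_copy_number(sample_fam, sample_vic, calib_fam, calib_vic, calibrator_copies=2):
    """Relative copy number: calibrator copies multiplied by 2^(-ddCt)."""
    ddct = delta_ct(sample_fam, sample_vic) - delta_ct(calib_fam, calib_vic)
    return calibrator_copies * 2 ** (-ddct)


# Hypothetical triplicate Ct values; the calibrator is a known two-copy standard
calib_fam, calib_vic = [26.0, 26.1, 25.9], [25.0, 25.1, 24.9]
sample_fam, sample_vic = [25.0, 25.0, 25.0], [25.0, 25.0, 25.0]
print(round(estimate_copy_number(sample_fam, sample_vic, calib_fam, calib_vic)))
```

In this sketch a sample whose target amplifies one cycle earlier than the calibrator, relative to RNase P, is assigned twice the calibrator's copy number.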
7. Accucopy assay
The Accucopy assay (a multiplex competitive real-time PCR) was used to calculate the copy
number of CNV-3956 in 200 randomly selected samples by a commercial company (Genesky
Bio-Tech Co., Ltd., Shanghai, China). Briefly, the CNV fragment was measured with a patented
Multiplex AccuCopy™ Kit (Genesky Bio-Tech Co., Ltd., Shanghai, China), in which three
reference genes (POP1, RPP14 and TBX15) were used for normalization. A 20 μl PCR reaction
was prepared for each sample, containing 1x AccuCopy™ PCR Master Mix, 1x Fluorescence
Primer Mix, ~10 ng sample DNA and 1x Competitive DNA mix (each competitive DNA sequence is
almost identical to its human homolog except for an insertion or deletion of several base pairs).
The PCR program was as follows: 95 °C for 10 min; 11 cycles of 94 °C for 20 s, 65 °C
(decreasing by 0.5 °C per cycle) for 40 s and 72 °C for 1.5 min; 24 cycles of 94 °C for 20 s,
59 °C for 30 s and 72 °C for 1.5 min; 60 °C for 60 min; and a hold at 4 °C. PCR products were
diluted 20-fold before being loaded on an ABI 3730XL sequencer (Applied Biosystems). Data were
analyzed with GeneMapper 4.0: the sample/competitive (S/C) peak ratio was calculated for the
two CNV fragments and the three reference genes (POP1, RPP14 and TBX15); the S/C ratio for
each target fragment was then normalized to each of the three reference genes. The three
normalized S/C ratios were further normalized to the median value across all samples for the
corresponding reference gene and then averaged. If one of the three normalized S/C ratios
deviated by more than 25% from the average of the other two, it was excluded from further
analysis. The copy number of each of the two CNV fragments was calculated as the average S/C
ratio multiplied by 2 (Supplementary Figure S2B).
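The normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the vendor's implementation: the function names and the input values are hypothetical, and the exclusion rule is interpreted as dropping at most one ratio per sample (the one that deviates most from the mean of the other two, if that deviation exceeds 25%).

```python
from statistics import mean, median


def average_without_outlier(values, tolerance=0.25):
    """Average three ratios, dropping the most deviant one if it is >25% off the other two."""
    deviations = []
    for i, value in enumerate(values):
        other_mean = mean(v for j, v in enumerate(values) if j != i)
        deviations.append(abs(value - other_mean) / other_mean)
    worst = max(range(len(values)), key=lambda i: deviations[i])
    if deviations[worst] > tolerance:
        return mean(v for j, v in enumerate(values) if j != worst)
    return mean(values)


def accucopy_copy_numbers(target_sc, reference_sc):
    """target_sc: S/C ratio of the target per sample; reference_sc: gene -> S/C ratios."""
    per_reference = []
    for ref_values in reference_sc.values():
        # Step 1: normalize the target S/C ratio to this reference gene
        ratios = [t / r for t, r in zip(target_sc, ref_values)]
        # Step 2: normalize to the median across all samples for this reference gene
        centre = median(ratios)
        per_reference.append([x / centre for x in ratios])
    # Step 3: average the three normalized ratios per sample and multiply by 2
    return [2 * average_without_outlier(list(vals)) for vals in zip(*per_reference)]


# Hypothetical S/C ratios for five samples
target = [1.0, 1.0, 1.5, 1.0, 2.0]
refs = {"POP1": [1.0] * 5, "RPP14": [1.0] * 5, "TBX15": [1.0] * 5}
print(accucopy_copy_numbers(target, refs))
```

Median normalization implicitly assumes that the most common genotype in the sample set carries two copies, which is why the averaged ratio is multiplied by 2.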
8. CHRNA7 expression detection
The mRNA level of CHRNA7 was quantified relative to β-actin using the 2^(-ΔCT) method
with self-designed primers: 5’-TGG GTC CTG GTC TTA CGG-3’ (forward) and 5’-CAC TAG
GTC CCA TTC TCC ATT-3’ (reverse) for CHRNA7, as well as primers: 5’-GGC GGC ACC
ACC ATG TAC CCT-3’ (forward) and 5’-AGG GGC CGG ACT CGT CAT ACT-3’ (reverse)
for β-actin.
9. Heritability calculation
A restricted maximum likelihood (REML) model was used to assess the heritability explained by
the above genetic variants [7]. A linear mixed-effects model for unrelated individuals was
applied to estimate the genetic variance attributable to CNV-3956 under a recessive model
(i.e., the proportion of disease heritability explained by the variant, defined as h_g^2).
h_g^2 was estimated as h_g^2 = var_g/(var_g + var_e), where var_g and var_e are the genetic and
residual variance components estimated by the REML model using unrelated individuals.
10. Test of mediation effect
The statistical protocol and parameters were calculated with the Excel procedure
‘Spreadsheet’ according to the online instructions, and the Sobel test was used to test the
significance of the mediation effect using an online interactive calculation tool. First, using
an unconditional logistic regression model adjusted for five covariates (i.e., age, sex,
pack-years smoked, biomass fuel use and study center), two regression coefficients (defined as
b and c’) were obtained by regressing the outcome (i.e., lung cancer) on the mediator (i.e.,
COPD) and the risk factor (i.e., CNV-3956); no interaction term was included because no
significant interaction was observed between CNV-3956 and pre-existing COPD on lung cancer
risk. Second, using an extended inverse-propensity-weighted (IPW) regression approach as
suggested by Richardson et al. and Wang et al. [8,9], one coefficient (defined as a) was
calculated by regressing COPD on CNV-3956. Specifically, when the cases were weighted as 1,
the weight for controls was (N1 × (1 − fD))/(N0 × fD), where N1 and N0 are the numbers of
cases and controls in the current case-control study, respectively, and fD is the prevalence of
lung cancer in the total population; in the current study, the controls were weighted as
1559 × (1 − 0.0005357)/(1679 × 0.0005357) ≈ 1732. Third, all the above coefficients were made
comparable (defined as α, β and τ′) by multiplying each coefficient by the standard
deviation (SD) of the predictor variable and then dividing by the SD of the outcome variable, as
suggested by MacKinnon and Dwyer [10]. That is, α = a × SD(CNV-3956)/SD(COPD); β = b ×
SD(COPD)/SD(lung cancer); and τ′ = c’ × SD(CNV-3956)/SD(lung cancer). Finally, the direct
effect of the risk factor on lung cancer risk was equal to τ′, and the indirect effect of the
risk factor on lung cancer risk through COPD was equal to αβ. The total effect was τ = τ′ + αβ,
and the proportion mediated was αβ/τ. In addition, the Sobel test was performed with the
parameters α and β and their standard errors using the online interactive calculation tool.
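The arithmetic of the mediation analysis can be sketched as below. This is a minimal illustration rather than the spreadsheet or online tool used in the study: the function names are hypothetical, the coefficient values in the usage lines are invented for demonstration, and the Sobel statistic uses the standard first-order standard error, which is an assumption about what the online tool computes. Only the case and control counts and the prevalence come from the text.

```python
import math


def control_weight(n_cases, n_controls, prevalence):
    """Weight for controls when cases are weighted as 1: N1(1 - fD) / (N0 * fD)."""
    return (n_cases * (1 - prevalence)) / (n_controls * prevalence)


def standardize(coef, sd_predictor, sd_outcome):
    """Rescale a coefficient by SD(predictor)/SD(outcome) to make effects comparable."""
    return coef * sd_predictor / sd_outcome


def mediation_summary(alpha, beta, tau_prime):
    """Direct, indirect and total effects plus the proportion mediated."""
    indirect = alpha * beta
    total = tau_prime + indirect
    return {"direct": tau_prime, "indirect": indirect, "total": total,
            "proportion_mediated": indirect / total}


def sobel_test(alpha, se_alpha, beta, se_beta):
    """First-order Sobel z statistic and two-sided p-value for the indirect effect."""
    se_indirect = math.sqrt(beta ** 2 * se_alpha ** 2 + alpha ** 2 * se_beta ** 2)
    z = alpha * beta / se_indirect
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values from the text: 1559 cases, 1679 controls, prevalence 0.0005357
print(round(control_weight(1559, 1679, 0.0005357)))  # 1732

# Hypothetical standardized coefficients, for illustration only
print(mediation_summary(alpha=0.2, beta=0.5, tau_prime=0.3))
print(sobel_test(alpha=0.5, se_alpha=0.1, beta=0.4, se_beta=0.1))
```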
11. Multiplicative interaction analysis
A multiplicative interaction was suggested when OR11 > OR10 × OR01, in which OR11 = the OR
when both factors were present, OR01 = the OR when only factor 1 was present and OR10 = the OR
when only factor 2 was present. In the current study, factor 1 refers to the selected variable
(such as age, sex, smoking, pack-years smoked or biomass fuel use) and factor 2 refers to the
copy number of CNV-3956.
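The criterion above amounts to a single comparison. The sketch below uses hypothetical function names and invented odds ratios; it checks the point estimates only and does not replace a formal test of the interaction term in a regression model.

```python
def multiplicative_interaction_ratio(or11, or10, or01):
    """Ratio of the joint OR to the product of the single-factor ORs."""
    return or11 / (or10 * or01)


def suggests_multiplicative_interaction(or11, or10, or01):
    """True when OR11 > OR10 x OR01."""
    return or11 > or10 * or01


# Hypothetical odds ratios for illustration
print(suggests_multiplicative_interaction(or11=3.0, or10=1.2, or01=1.5))
```

A ratio above 1 indicates that the joint effect exceeds what the two separate effects would predict on the multiplicative scale.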
References
1. Miller MR, Hankinson J, Brusasco V et al: Standardization of Spirometry, 1994 Update.
American Thoracic Society. Am J Respir Crit Care Med 1995; 152: 1107-1136.
2. Pellegrino R, Viegi G, Brusasco V et al: Interpretative strategies for lung function tests.
Eur Respir J 2005; 26: 948-968.
3. Zheng J, Zhong N: Normative values of pulmonary function testing in Chinese adults.
Chin Med J (Engl) 2002; 115: 50-54.
4. Lu J, Yang L, Zhao H et al: The polymorphism and haplotypes of PIN1 gene are
associated with the risk of lung cancer in Southern and Eastern Chinese populations. Hum
Mutat 2011; 32: 1299-1308.
5. Liu B, Yang L, Huang B et al: A functional copy-number variation in MAPKAPK2
predicts risk and prognosis of lung cancer. Am J Hum Genet 2012; 91: 384-390.
6. Yang L, Li Y, Cheng M et al: A functional polymorphism at microRNA-629-binding site
in the 3'-untranslated region of NBS1 gene confers an increased risk of lung cancer in
Southern and Eastern Chinese population. Carcinogenesis 2012; 33: 338-347.
7. Vattikuti S, Guo J, Chow CC: Heritability and genetic correlations explained by common
SNPs for metabolic syndrome traits. PLoS Genet 2012; 8: e1002637.
8. Richardson DB, Rzehak P, Klenk J, Weiland SK: Analyses of case-control data for
additional outcomes. Epidemiology 2007; 18: 441-445.
9. Wang J, Shete S: Estimation of odds ratios of genetic variants for the secondary
phenotypes associated with primary diseases. Genet Epidemiol 2011; 35: 190-200.
10. MacKinnon DP, Dwyer JH: Estimating mediated effects in prevention studies. Evaluation
Review 1993; 17: 144-158.
Figure Titles and Legends
Supplementary Figure S1. Chromosomal location of the two common CNVs (i.e., CNV-3956
and CNV-32018) covering the CHRNA7 gene. Brown indicates a complex genotype, namely
gain/loss of CNV-3956. Red indicates a loss genotype of CNV-32018. CNV-3956 completely
contains the CHRNA7 gene, whereas CNV-32018 contains the last exons and introns of the
CHRNA7 gene.
Supplementary Figure S2. Genotyping of CNV-3956. A, by the TaqMan assay. B, by the
Accucopy assay. As shown, four genotypes (2-copy, 3-copy, 4-copy and 5-copy) were found.
We combined the 4-copy and 5-copy genotypes as ≥4-copy because their frequencies were
relatively low.
Supplementary Figure S3. Stratification analysis of the associations between the CNV-3956
genotypes and risks of COPD and lung cancer by selected factors including age, sex, smoking
status, pack-years smoked, biomass fuel use, stages and histological types. A, effect of
CNV-3956 on COPD risk; B, effect of CNV-3956 on lung cancer risk. A multiplicative
interaction model was applied for the interaction analysis. The ≥4-copy genotype of CNV-3956
showed a numerically higher OR in individuals with pre-existing COPD (OR = 2.01) than in those
without (OR = 1.42). Moreover, the ≥4-copy genotype significantly interacted with smoking in
increasing the risks of both diseases (P = 0.006 for COPD; P = 0.003 for lung cancer).
Supplementary Figure S4. Stratification analysis of the association between CNV-3956 and
lung cancer survival by selected factors including age, sex, smoking status, pack-years smoked,
biomass fuel use, stages and histological types. A multiplicative interaction model was applied
for the interaction analysis. As shown, the adverse effect of the ≥4-copy genotype of CNV-3956
on cancer survival was not observed in some subgroups, such as never smokers. However, no
significant interaction was observed between the ≥4-copy genotype and the selected variables on
lung cancer survival.