Supplementary methods
Sample collection
A population-based sample was obtained through population registries made available by the
local city councils. To make the population ethnically homogeneous, we requested that at least
three out of the four grandparents originated from the same region as the study subject. All
responding subjects underwent clinical examination and otoscopy, and completed a detailed
questionnaire on medical history and exposure to environmental risk factors. A list of all
questions and answers used in this study is available on request.
Strict exclusion criteria were applied to exclude persons having or having had a condition that
possibly leads to hearing impairment. No phenotypic inclusion criteria were used for the sample
collection. Subjects with ear diseases, possible monogenic forms of hearing impairment or other
major pathologies with a possible influence on hearing were excluded. The main goal was to
study hearing impairment in healthy subjects and therefore persons with multiple hospitalizations
were excluded. The complete list of exclusion criteria was previously reported 1.
In subjects passing the medical exclusion criteria, audiometric thresholds were determined for air
conduction (0.25, 0.5, 1, 2, 3, 4, 6, 8 kHz) and bone conduction (0.5, 1, 2, 4 kHz) according to
current clinical standards (ISO 8253). We excluded subjects with asymmetrical hearing loss
(difference in air conduction threshold larger than 20 dB for at least 2 frequencies out of 0.5, 1
and 2 kHz). If only one ear showed conductive hearing loss (air-bone gap of 15 dB
or more at 0.5, 1 and 2 kHz), the other ear could be included in the absence of other
exclusion criteria.
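The audiometric criteria above can be illustrated with a minimal Python sketch. The function names, the dictionary layout (frequency in kHz mapped to threshold in dB) and the order in which the two criteria are checked are assumptions made for illustration; in particular, the air-bone gap is assumed to be required at each of the three frequencies.

```python
# Illustrative sketch only; names and data layout are hypothetical.
SPEECH_FREQS_KHZ = (0.5, 1, 2)

def is_asymmetric(left_ac, right_ac, min_diff_db=20, min_freqs=2):
    """True if left/right air-conduction thresholds differ by more than
    min_diff_db at min_freqs or more of the frequencies 0.5, 1 and 2 kHz."""
    n_large = sum(
        abs(left_ac[f] - right_ac[f]) > min_diff_db for f in SPEECH_FREQS_KHZ
    )
    return n_large >= min_freqs

def has_conductive_loss(ac, bc, min_gap_db=15):
    """True if the air-bone gap is min_gap_db or more at 0.5, 1 and 2 kHz.
    Assumption: the gap must be present at all three frequencies."""
    return all(ac[f] - bc[f] >= min_gap_db for f in SPEECH_FREQS_KHZ)

def included_ears(left_ac, left_bc, right_ac, right_bc):
    """Return the ears kept for analysis under the audiometric criteria.
    Assumption: conductive loss is checked before asymmetry."""
    left_cond = has_conductive_loss(left_ac, left_bc)
    right_cond = has_conductive_loss(right_ac, right_bc)
    if left_cond and right_cond:
        return []
    if left_cond:
        return ["right"]
    if right_cond:
        return ["left"]
    if is_asymmetric(left_ac, right_ac):
        return []
    return ["left", "right"]
```

In this sketch a subject with unilateral conductive loss contributes only the unaffected ear, and a subject with asymmetrical thresholds contributes none.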
Rare variants
The association between rare variants and the phenotype was tested using the Sequence Kernel
Association Test (SKAT) 2. Gene regions were delineated based upon a HUGO gene list
containing the genes and their corresponding genomic coordinates, whereby SNPs were assigned
to genes falling within a distance of 50 kb. If a SNP was annotated to more than one gene, it was assigned
to the nearest gene. After quality control, the SKAT analysis was run on a total of 11,626,570
SNPs, assigned to a total of 13,000 genes.
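The assignment rule can be sketched as follows. This is a minimal illustration and not the pipeline used in the study; the tuple layout, the function names and the gene names in the usage note are hypothetical, and ties between equally distant genes are resolved here by list order, which the text does not specify.

```python
# Illustrative sketch only; data layout and names are hypothetical.
def distance_to_gene(pos, start, end):
    """Distance in bp from a SNP position to a gene; 0 if inside the gene."""
    if start <= pos <= end:
        return 0
    return start - pos if pos < start else pos - end

def assign_snp_to_gene(chrom, pos, genes, window_bp=50_000):
    """Return the nearest gene within window_bp of the SNP, or None.

    genes is a list of (name, chrom, start, end) tuples.
    """
    best_name, best_dist = None, None
    for name, g_chrom, start, end in genes:
        if g_chrom != chrom:
            continue
        d = distance_to_gene(pos, start, end)
        if d <= window_bp and (best_dist is None or d < best_dist):
            best_name, best_dist = name, d
    return best_name
```

For example, with two hypothetical genes GENE_A (chr1: 100,000-200,000) and GENE_B (chr1: 230,000-300,000), a SNP at position 210,000 lies within 50 kb of both and is assigned to GENE_A, the nearer of the two.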
Within each gene region, SKAT calculates a p-value for association between the PC scores and
the joint effect of multiple variants. Adjustment for multiple comparisons was performed using
the False Discovery Rate (FDR) method, as implemented in the R package fdrtool 3.
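To illustrate the general idea of FDR adjustment across gene-level p-values, the sketch below implements the Benjamini-Hochberg step-up procedure. This is a stand-in chosen for its simplicity and is not the estimation procedure of the fdrtool package used in the study; the function name is hypothetical.

```python
# Illustrative sketch only: Benjamini-Hochberg adjustment, not fdrtool.
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted value falls below the chosen FDR level would then be reported as significant.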
Gene-gene interactions
The program SIXPAC 4 was used to search for long-range interactions between SNPs. This
program identifies interacting SNPs by looking for high linkage disequilibrium (LD) between
physically distant SNPs, with a different LD between cases and controls. Since this analysis does
not allow the analysis of quantitative traits, we dichotomized the phenotypes to obtain a case and
a control set. Subjects with a PC score below the 20th percentile were labeled as cases, and
subjects with a PC score above the 80th percentile as controls. Individuals with a PC score between these
percentiles were discarded. Using this approach, a group of 432 cases and an equal number of
controls was selected for each of the 3 PCs. As recommended by the authors, we ran SIXPAC with a
sensitivity of 20% to detect strong SNP-SNP LD, repeating the analysis 10 times with different
random number seeds.
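The dichotomization step can be sketched as a rank-based selection of the two tails of the PC score distribution. The function name and the dictionary layout are hypothetical, and taking a fixed number of subjects from each tail (rather than computing percentile values) is an assumption that guarantees equal group sizes.

```python
# Illustrative sketch only; names and data layout are hypothetical.
def dichotomize(scores, tail_fraction=0.20):
    """Split subjects into cases (lowest tail) and controls (highest tail).

    scores maps a subject identifier to its PC score. Subjects between the
    two tails are discarded. Ties are broken by sort order.
    """
    n_tail = int(len(scores) * tail_fraction)
    ranked = sorted(scores, key=scores.get)
    cases = ranked[:n_tail]
    controls = ranked[-n_tail:] if n_tail else []
    return cases, controls
```

The same function would be applied separately to each of the three PCs.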
Only SNPs that were typed in one of our two genotyping efforts were included. We omitted
SNPs for which only imputed genotype information was available. After quality control, the
number of included SNPs was 629,437. Since the test specifically looks for LD between distant
SNPs, pairs of SNPs that were less than 5 Mb apart were not considered.
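The distance filter on SNP pairs can be written as a simple predicate. This is a sketch under the assumption that SNPs on different chromosomes always count as distant; the function name and the (chromosome, position) tuple layout are hypothetical.

```python
# Illustrative sketch only; names and data layout are hypothetical.
def is_distant_pair(snp_a, snp_b, min_distance_bp=5_000_000):
    """True if two SNPs, given as (chromosome, position), are far enough
    apart to be tested for long-range LD."""
    chrom_a, pos_a = snp_a
    chrom_b, pos_b = snp_b
    if chrom_a != chrom_b:
        # Assumption: pairs on different chromosomes are always considered.
        return True
    return abs(pos_a - pos_b) >= min_distance_bp
```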
MAGENTA
Gene set enrichment analysis was performed using the software package MAGENTA (Meta-Analysis Gene-set Enrichment of variant Associations) 5. This program first expresses
the significance of a gene as a whole gene score, based upon the individual p-values of the SNPs
within the gene, adjusting for confounders (gene size, number of SNPs within the gene, amount
of LD and number of recombination hotspots within the gene). Subsequently, gene scores are
combined at the level of gene sets, whereby the p-values of a given gene set are generated by
counting the fraction of the genes within that gene set that have a gene score above a predefined
enrichment cutoff. Here, we have used the 95th and 75th percentiles of all gene scores as
enrichment cutoffs.
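The counting step at the gene-set level can be sketched as below. This is a simplified illustration and not the MAGENTA implementation: gene scores are assumed to be oriented so that larger values indicate stronger association, the percentile is computed by the nearest-rank method, and the p-value is obtained by comparing the observed count with randomly drawn gene sets of the same size. All names are hypothetical.

```python
import random

# Illustrative sketch only; not the MAGENTA implementation.
def percentile_cutoff(values, pct):
    """Nearest-rank percentile of values, for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]

def enrichment(gene_scores, gene_set, pct=95, n_perm=2000, seed=0):
    """Return (fraction of set genes above cutoff, resampling p-value).

    gene_scores maps gene name to a score where larger means more
    significant (assumption). gene_set is an iterable of gene names.
    """
    cutoff = percentile_cutoff(gene_scores.values(), pct)
    members = [g for g in gene_set if g in gene_scores]
    if not members:
        raise ValueError("gene set has no scored genes")
    observed = sum(gene_scores[g] > cutoff for g in members)
    rng = random.Random(seed)
    all_genes = list(gene_scores)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_genes, len(members))
        if sum(gene_scores[g] > cutoff for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed / len(members), (hits + 1) / (n_perm + 1)
```

Running the function once with pct=95 and once with pct=75 corresponds to the two enrichment cutoffs used here.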
Analysis was performed using the p-values from the GWAS on PC1, PC2 and PC3, with and
without adjustment for covariates. SNPs were mapped to genes using genome build 36 (hg18),
using gene sets from the following databases: Gene Ontology (April 2010), PANTHER (January
2010), Ingenuity (June 2008), KEGG (2010), Reactome and BioCarta. As recommended by the
authors, we have used the 95th and 75th percentiles of all observed gene scores as enrichment
cutoffs to generate the p-values for the individual pathways. Correction for multiple testing was
performed using the FDR method.
GCTA – Variance components analysis
The variance component analysis using Restricted Maximum Likelihood (REML) was carried
out using all SNPs that passed the quality control. Individuals with an estimated relatedness
above 0.05 were excluded. The Genetic Relationship Matrix was estimated correcting for
incomplete LD between genotyped and causative variants, with and without a correction factor
for a difference in MAF between causative and genotyped SNPs 6. The percentage of variance
explained by the SNPs is obtained as the ratio of the genetic variance to the total variance.
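Written out, the ratio described above takes the following form. The symbols are notation assumed here for illustration: the genetic variance captured by the SNPs, the residual variance, and their sum as the total phenotypic variance.

```latex
h^2_{\mathrm{SNP}} = \frac{\sigma_g^2}{\sigma_P^2}
                   = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
```

Here $\sigma_g^2$ and $\sigma_e^2$ denote the REML estimates of the genetic and residual variance components, and $\sigma_P^2$ is the total variance.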
References
1. Van Eyken E, Van Laer L, Fransen E et al: KCNQ4: A gene for age-related hearing impairment?
Hum Mutat 2006; 27: 1007-1016.
2. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X: Rare-variant association testing for sequencing
data with the Sequence Kernel Association Test. Am J Hum Genet 2011; 89: 82-93.
3. Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci U S A
2003; 100: 9440-9445.
4. Prabhu S, Pe'er I: Ultrafast genome-wide scan for SNP-SNP interactions in common complex
disease. Genome Res 2012; 22: 2230-2240.
5. Segre AV, Groop L, Mootha VK, Daly MJ, Altshuler D: Common inherited variation in
mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic
traits. PLoS Genet 2010; 6.
6. Yang J, Benyamin B, McEvoy BP et al: Common SNPs explain a large proportion of the heritability
for human height. Nat Genet 2010; 42: 565-569.