LIST OF SUPPLEMENTARY MATERIAL TABLES AND FIGURES

Table S1: Summary of sample-level QC for each genotyping platform. Cells indicate the number of samples removed at each QC step. *Includes known duplicate pairs intended for cross-platform concordance checks. †Includes unexpected duplicates and additional affected siblings. ‡Includes 43 control samples recruited for TS GWAS.

Table S2: Strongest Associated GWAS Variants (p<10⁻³) in EU, AJ, SA, Trio, Combined Case-Control and Combined Trio-Case-Control Samples. Single nucleotide polymorphisms (SNPs) listed by rs# include those with association P-values <10⁻³ for the EU, AJ and SA case-control subgroups individually and combined, trios, and trio-case-controls. The chromosome (Chr) and base pair location for each SNP are listed in columns to the right of the SNP column. OR indicates the odds ratio for the tested allele in the trio sample. Direction indicates whether the association between OCD and the A1 allele is positive (+) or negative (-) for each individual subgroup within the combined (EU, AJ, SA, trios) samples. The left gene and right gene columns list the closest genes in the SNP region: either the gene containing the SNP (no distance given), or the right and left flanking genes (+ distance in kilobases) or downstream genes (- distance in kilobases). For SNPs located within genes, other functional elements in the region are as noted. QTL (eQTL) columns list genes whose expression or methylation levels (m) are associated (P-value) with the specified SNP in that row, specifically as identified previously in EU-ancestry frontal (F), parietal (P) or cerebellar (C) tissue. mQTL and F eQTL data are unavailable for X chromosome SNPs.

Table S3: Strongest Associated GWAS Variants within previously identified linkage regions

Table S4: Strongest Associated GWAS Variants within previously identified candidate genes

Table S5: Enrichment of miRNA Target Sets in the Best Supported SNPs from the OCD GWAS.
Gene sets regulated by each miRNA were downloaded from TargetScan (www.targetscan.org) and filtered to remove genes with a <90% probability of being regulated by each miRNA set (micro-RNA Annotation). LD-pruned independent SNPs lying within each target gene set (Target Gene Number) at a given p-value threshold (SNP P_threshold) from the indicated “Sample” were then tested for enrichment, and the number of intervals with p-values less than the threshold (# Intervals +) is noted. Only results with an empirical p-value (Empirical_P) of p<0.1 are shown, and p-values corrected for multiple testing (Corrected_P) are listed.

Table S6: Enrichment of Gene Ontology (GO) Pathway Target Sets in the Best Supported SNPs from the OCD GWAS. Enrichment of the best supported SNPs in each GO pathway target set (Target_gene set ID and Gene Set Annotation) was tested using INRICH for the Case-Control, Trio or Trio-Case-Control analyses. Each GO target gene set contains the indicated “Target Gene Number”, and within this set the number of intervals containing SNPs (# Intervals +) that are below the p-value threshold (SNP P_threshold) is shown. Only results with an empirical p-value (Empirical_P) of p<0.05 or p-values after correction for multiple testing (Corrected_P) of p<1 are shown.

Table S7: Detailed Association Results of SNPs in the 14.6 kb Region Surrounding rs6131295 in the Trio Analysis. Location (CHR, BP (hg19)), minor allele (A1), minor allele frequency (freq (A1)), Odds Ratio (OR), p-value (P), type of SNP (0=imputed), and r² and D′ relative to rs6131295 are listed.

Figure S1: Quality Control Pipeline

Figure S2: Multi-dimensional scaling (MDS) plot of all OCD GWAS case-control samples

Figure S3: Multi-dimensional scaling (MDS) plot of OCD GWAS case-control samples of European ancestry

Figure S4: Multi-dimensional scaling (MDS) plot of European ancestry OCD GWAS samples identifying the South African subsample. Plots of additional MDS dimensions (here the 2nd and 5th dimensions) demonstrated a separation of the South African (SA) case-control sample (green) from the other European ancestry samples.

Figure S5: Schematic of differential SNP missingness tests for cross-platform comparisons. 9961 SNPs were removed based on differential missingness with respect to phenotype (i.e. between cases and controls). An additional 4960 SNPs were removed due to differential missingness with respect to flanking SNP genotypes.

Figure S6: MDS plot of OCD trio founders.

Figure S7: Quantile-quantile (QQ) Plots of Observed versus Expected p-values in: (a) EU, (b) AJ, and (c) SA Samples.
The 95% confidence interval of expected values is indicated in grey. Corresponding genomic control lambda values are indicated within each plot (lambda=1.009 for EU, 0.982 for AJ, and 0.969 for SA).

Figure S8: Quantile-quantile (QQ) Plots of Observed versus Expected p-values among SNPs from 22 Candidate Genes for: (a) Case-Control Samples and (b) Combined Trio-Case-Control Samples. The 95% confidence interval of expected values is indicated in grey. Corresponding genomic control lambda values are indicated within each plot (lambda=1.085 for case-controls and lambda=1.168 for trio-case-controls).

Figure S9: Regional results plot of top hits in meta-analyses. a) LocusZoom plot of rs26728 from the case-control meta-analysis; b) LocusZoom plot of rs4868342 from the case-control meta-analysis; c) LocusZoom plot of rs297941 from the trio-case-control meta-analysis.

Figure S10: Cluster and LocusZoom plots of rs6131295, the top SNP in the family-based TDT analysis. a) Normalized intensity plot of SNP genotype clusters from BeadStudio (Illumina, San Diego, CA, USA); b) Regional results plot of rs6131295, which is 90 kb 3′ to BTBD3.

Figure S11: LocusZoom Plot of Directly Genotyped and Imputed SNPs near rs6131295. Locations and observed -log10(p-values) for genotyped SNPs are shown as circles, imputed SNPs as diamonds. Red, orange, green and blue colors indicate the r² (derived from 1000 Genomes CEU data) between each plotted SNP and the top SNP in the region (rs6131295, in purple). Blue lines indicate the estimated recombination rate from HapMap release 22.

Figure S12: Interrelationships between strongest GWAS findings in Trio and Trio-Case-Control Meta-Analysis. rs6131295 is an eQTL for BTBD3 and ISM1 in cis and DHRS11 in trans (indicated by blue arrows).
Co-expression of DHRS11 and FAIM2 (green arrows) and of ISM1 and ADCY8 (orange arrows) in approximately 16 regions from each of 40 brains across the human lifespan (9 wk post-conception to 40 years) was found by examination of the BrainSpan project (BrainSpan.org, accessed 10/2011). “r” indicates the correlation coefficient between each pair of genes and “Rank” refers to the rank order of correlations of the 22,238 genes examined.

SUPPLEMENTARY MATERIALS

Abbreviations: AJ, Ashkenazi-Jewish European-derived samples; EU, European-ancestry, non-isolate samples collected from the US, Canada and Europe; SA, South African samples collected from Cape Town, South Africa.

SUBJECTS
Case and trio subjects were recruited and assessed as described in the main text. Cases and trios were recruited predominantly from OCD specialty clinics, and controls were recruited from Bonn, Germany and from Cape Town, South Africa. For study inclusion, all cases and trio probands were required to have a DSM-IV diagnosis of OCD. The controls from Bonn had no lifetime history of any axis I disorder, and the South African controls were diagnostically unscreened.
Additional unscreened controls, genotyped on two different Illumina SNP arrays, came from: 1) the Study of Addiction: Genes and Environment (SAGE) cohort (1,288 individuals, Illumina Hap1M)¹⁻³; 2) the HYPERGENES Consortium, Milan, Italy (501 individuals, Illumina Hap1M); 3) the Illumina ‘iControl’ Genotype Control Database (3,212 individuals, Illumina Hap550k_v1); and 4) a cohort of Dutch ancestry (653 individuals, Illumina Hap550k_v1) (Table S1).⁴

GENOTYPING
As described in the main text, 1817 OCD cases, 504 controls, and 663 OCD trios (2041 samples, including 1326 parents, 663 probands, and 52 affected siblings) were genotyped on the Illumina Human610-Quadv1_B SNP array (Illumina, San Diego, CA, USA) at the Broad Institute of Harvard and MIT Center for Genotyping and Analysis (CGA) (Cambridge, MA, USA) in two batches (Sept-Nov 2008 and Dec 2008-Feb 2009). The method of genotype calling for the OCD samples was the same as for the TS samples and is described in the accompanying TS GWAS paper.⁵ In total, 1586 OCD cases, 448 controls, and 1739 OCD trio samples were successfully genotyped at CGA with a call rate >97%. An additional 43 European-descent control samples recruited for the TS GWAS were included in the OCD GWAS as control samples, bringing the genotyped control sample total to 491. Previously genotyped control datasets were also included in the OCD GWAS: SAGE control samples (n=1288), iControls (n=3212), Dutch controls (n=653) and Italian controls (n=501). The first three of these datasets are described in the accompanying TS GWAS manuscript. The last control dataset, genotyped on the Illumina Hap1M, consists of 501 Italian controls from the HYPERGENES consortium, collected in Milan, Italy, and characterized as normotensive, non-obese, and non-dyslipidemic with no abnormal findings on physical examination, but with no formal assessment for neurologic or psychiatric conditions.
2781 SAGE samples were utilized in the initial platform-specific SNP QC steps to increase the power to detect low-quality SNPs and samples. The SAGE cases and non-European controls were then removed for further quality control steps and the final GWAS.

QUALITY CONTROL PROCEDURES

QC Overview
A schematic of the ordered QC pipeline is provided in Supplementary Figure S1. Initial QC steps were performed in parallel within each of the five datasets. SNP genotyping concordance was checked on duplicate TS samples that were genotyped together with the OCD samples on two different platforms (Hap610 and either Hap550 or Hap370) to confirm the robustness of Illumina genotyping across different platforms and sites, as well as to remove SNPs with discrepant calls across platforms. Platform-specific QC included removing SNPs and samples with low call rate (<98%), samples with ambiguous genomic sex or discordance between genomic and phenotypic sex, and strand-ambiguous SNPs or SNPs with allele frequency significantly different from HapMap CEU reference data. Batch effects were investigated in the samples genotyped at CGA, and no evidence for a batch effect was found. Three SNPs with p<10⁻⁵ in the batch effect regression analysis were flagged, and none of these appeared in the top 580 SNPs in the case-control meta-analysis or in the top 584 SNPs in the final case-control and trio meta-analysis. Any SNPs with a low concordance rate among different platforms, based on the TS samples, were removed from the OCD GWAS dataset. As noted in the main text, two SNP QC thresholds were generally used for each step: a more stringent threshold at which SNPs were removed from the analysis, and a second, more liberal threshold at which SNPs were flagged in an annotation file and re-examined later for potential QC-related bias.

Platform merging and initial cross-platform comparisons
At this stage in the QC process, all samples were merged into a single dataset using PLINK.
Following the merge, 23 SNPs were either mismatched or tri-allelic and were removed. SNP allele frequencies were compared among platforms, and any SNP with an absolute allele frequency difference >0.15 between two platforms was flagged. Lastly, any SNP not in common between the cleaned Hap1M, Hap610 and Hap550 platforms was removed, leaving 485,232 cleaned SNPs for subsequent analyses.

Removal of duplicates, related samples and individuals of non-European descent
For all 7667 case-control samples remaining in the common dataset and 1654 trio samples, pairwise estimation of genome-wide identity-by-descent (IBD) was conducted with an LD-pruned set of 51,516 SNPs using PLINK. The parent-proband relationship was confirmed (Z1>0.9) in 401 complete trios. Among the incomplete trios, 106 probands with European ancestry were included as cases in the case-control samples. One individual from each case-control sample pair with either a pi-hat>0.1 or Z1>0.2, representing unexpected duplicates or relatives, was removed from subsequent analyses. For unexpected duplicates or relatives between the case-control samples and the trio samples, the case-control samples were removed from subsequent analyses. All remaining case-control samples were subjected to a multi-dimensional scaling (MDS) analysis to identify individuals of non-European ancestry (Supplementary Figure S2), and the remaining trio samples were subjected to Mendelian error checking. The majority of samples clustered along a diagonal with samples of Dutch origin at the top left (blue) and Italian origin at the lower middle (red), consistent with the expected distribution of European ancestry samples along a Northern to Southern European cline. The samples clustered at the bottom right were later identified as Ashkenazi Jewish (AJ) samples (Supplementary Figure S2).
However, 46 cases and 141 controls fell far outside this general European ancestry cluster and thus were removed from analysis due to the presence of non-European genetic ancestry (Supplementary Table S1).

Separation of case-control samples into genetically homogeneous sub-populations of European-ancestry derived samples: EU, SA, and AJ
After removing all individuals with non-European genetic ancestry, a second European ancestry MDS analysis was performed to stratify the remaining samples into more homogeneous subpopulations and to re-assign individuals whose self-reported ancestry did not reflect observed genetic ancestry (Supplementary Figure S3). As expected, most case-control subjects clustered together in a homogeneous cloud along the expected Northern-Southern European cline (from the top middle to the bottom left). Within the EU cluster, the Dutch cases (light blue) and the Italian cases (pink) genotyped on Hap610 at Broad were well matched by the Dutch controls (blue) and the Italian HYPERGENES controls (yellow). The individuals within the EU cluster were separated out as a non-isolate European ancestry stratum (EU) for sub-population-specific QC and analysis.

AJ Sub-population
Two major clusters of individuals distinct from the main EU sample in the European ancestry MDS analysis were identified as being of AJ ancestry (Supplementary Figure S3). The middle red cluster represents half-AJ/half-EU ancestry. Due to the small number of samples, this “half-AJ cluster” was combined with the main AJ cluster and analyzed together as a single AJ stratum.

SA Subpopulation
Although the South African (SA) cases and controls (green) also fell within the general EU cluster, further MDS analyses identified additional dimensions that distinguished SA cases and controls from the EU samples, and thus they were analyzed separately as an SA-specific stratum (Supplementary Figure S4, green).
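
As an illustration of the kind of MDS analysis referred to above, the following is a minimal sketch of classical (metric) MDS applied to a pairwise identity-by-state (IBS) distance matrix. It is not the PLINK implementation used in the study. The 0/1/2 genotype coding, the function names `ibs_distance` and `classical_mds`, the use of power iteration, and the six toy samples are all assumptions made for illustration only.

```python
import math
import random


def ibs_distance(g1, g2):
    """Return 1 - IBS similarity for two samples with genotypes coded 0/1/2."""
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return 1.0 - shared / (2.0 * len(g1))


def classical_mds(dist, n_dims=2, n_iter=500, seed=0):
    """Classical MDS: double-centre squared distances, then extract the
    leading eigenvectors by power iteration with deflation."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Gram matrix B = -1/2 * J * D^2 * J, where J is the centring matrix.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        v_norm = math.sqrt(sum(x * x for x in v))
        v = [x / v_norm for x in v]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the signed eigenvalue for the unit vector v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eig = sum(x * y for x, y in zip(v, bv))
        scale = math.sqrt(max(eig, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so that the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eig * v[i] * v[j]
    return coords


# Hypothetical toy data: two groups of three samples typed at six SNPs.
genotypes = [
    [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0],
    [2, 2, 2, 2, 2, 1], [2, 2, 2, 2, 2, 2], [2, 2, 1, 2, 2, 2],
]
dist = [[ibs_distance(a, b) for b in genotypes] for a in genotypes]
coords = classical_mds(dist)
for sample_coords in coords:
    print([round(c, 3) for c in sample_coords])
```

In this toy example the first dimension separates the two groups by sign, which is the same principle by which ancestry clusters become visible in MDS plots of genome-wide data.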

Subpopulation-specific QC
After separating the final samples into three subpopulation-specific strata (EU, AJ, SA), an additional set of QC analyses was undertaken within each subpopulation to optimize case-control matching and to remove remaining poorly performing samples and SNPs (Supplementary Figure S1). First, samples were removed that demonstrated low-level relatedness (Z1>0.1) with a large number (≥20) of other samples in the subpopulation. Second, samples within each sub-population were subjected to a cluster analysis (--cluster in PLINK), and any sample whose pairwise identity-by-state distance from the closest samples was >5 standard deviations compared to the rest of the samples was removed. Mean heterozygosity was calculated, and any sample with |Fhet| > 0.05 was also removed from the final analysis. Following these sample QC steps, SNPs were tested for the presence of Hardy-Weinberg disequilibrium in controls from each subpopulation. Any SNPs with HWE p<10⁻¹⁰ were removed; those with HWE p<10⁻⁵ were flagged. SNPs were also flagged in all samples if they generated >1% Mendelian errors in the 400 OCD trios. Given the use of five different datasets across three different nested Illumina platforms, we performed an additional QC step to identify SNPs with differential missingness between cases and controls across 5 cross-platform comparisons (Supplementary Figure S5). For each of these comparisons, SNPs were removed using increasing levels of stringency with decreasing minor allele frequency thresholds. SNPs with MAF≥0.2 were excluded at p<10⁻⁵ (χ² test, 1 df). SNPs with MAF<0.2 but ≥0.1 were excluded at p<10⁻⁴. Lastly, SNPs with MAF<0.1 were excluded at p<10⁻³. In addition, a differential missingness test relative to adjacent genotypes (haplotype-based missingness test, --test-mishap in PLINK) was performed, with SNPs excluded at p<10⁻¹⁰.
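
The MAF-tiered exclusion rule for case-control differential missingness described above can be sketched as follows. This is an illustrative reading of the stated thresholds, not the pipeline actually run. The use of an uncorrected Pearson χ² statistic on a 2x2 table of missing versus called genotypes is an assumption, as are the function names and the example counts.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def missingness_chi2(miss_cases, n_cases, miss_controls, n_controls):
    """Pearson chi-square (no continuity correction; an assumption) for a
    2x2 table of missing vs. called genotypes in cases vs. controls."""
    table = [[miss_cases, n_cases - miss_cases],
             [miss_controls, n_controls - miss_controls]]
    rows = [n_cases, n_controls]
    cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    total = n_cases + n_controls
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def missingness_threshold(maf):
    """Exclusion p-value threshold by minor allele frequency tier."""
    if maf >= 0.2:
        return 1e-5
    if maf >= 0.1:
        return 1e-4
    return 1e-3


def exclude_snp(maf, miss_cases, n_cases, miss_controls, n_controls):
    """True if the SNP fails the tiered differential missingness test."""
    stat = missingness_chi2(miss_cases, n_cases, miss_controls, n_controls)
    return chi2_1df_pvalue(stat) < missingness_threshold(maf)


# Hypothetical counts: 32 of 1000 cases vs. 10 of 1000 controls missing.
for maf in (0.30, 0.15, 0.05):
    print(maf, exclude_snp(maf, 32, 1000, 10, 1000))
```

With these made-up counts the same missingness difference leads to exclusion only in the lowest MAF tier, which shows how the rule becomes stricter as allele frequency decreases.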
In addition, for EU cases of known Dutch ancestry genotyped on the Hap610 platform, all SNPs absent from the Dutch Hap550 control dataset were removed to reduce any differential missingness specific to these matched samples. For EU cases of known Italian ancestry genotyped on the Hap610 platform, all SNPs absent from the Italian 1M control dataset were removed for the same reason. Two further rounds of MDS analyses were conducted within each ancestry-specific subpopulation. The first set of subpopulation-specific MDS analyses was used to remove any remaining samples with poor case-control matching. A final MDS analysis was then performed to identify MDS dimensions that could explain any residual population stratification. MDS dimensions were retained for subsequent association analysis if: 1) they were associated with the OCD phenotype at p<0.01 or 2) for dimensions association with the OCD phenotype at values associated with inclusion of each MDS dimension. All dimensions demonstrating a notable drop in genomic control values relative to prior MDS dimensions were retained. These MDS dimensions were included as covariates in the logistic regression model used for tests of association.

Trio samples
As described in the methods section, the trio samples were recruited from sites in EU-descent and non-EU-descent countries, including the following: Germany, France, the Netherlands, Italy, Canada, the United States, the United Arab Emirates, South Africa, Mexico, Brazil, and Costa Rica (Central Valley of Costa Rica, CVCR). For trio sample inclusion, the OCD-affected proband and both biological parents were required. On the MDS plot of the founders in the trio samples (Supplementary Figure S6), 75% of the founders clustered at the bottom with the European ancestry samples (black). The self-reported Mexican samples (blue) overlapped with the CVCR samples (green). The Brazilian samples (red) showed genetic components distinct from those of the European samples.
Due to the heterogeneous nature of the trio samples and the small sample size within each population stratum, the quality control tests that require a homogeneous population were not applied. Instead, we removed any SNPs that failed the quality control tests in the European case-control samples. The Mendelian error test was applied to the 401 complete trios. One trio with >0.1% Mendelian errors was removed, and the remaining trios were subjected to a second Mendelian error test. In this second test, 2803 SNPs were found to have ≥1% (≥4) Mendelian errors and were therefore excluded. SNPs that were monomorphic in the trio samples were also removed.

Post-hoc confirmation of QC analyses
As a final step to confirm the quality of the QC process, we examined the square of the GWAS test statistic for any correlation with residual call rate, Hardy-Weinberg p-value and minor allele frequency of the surviving SNPs; none of these correlations was significant (data not shown).

Sex Chromosome SNPs
For analysis of sex chromosome SNPs, males and females were assessed separately for each subgroup, with adjustment by MDS factors as described above, and then combined via meta-analysis, using the number of cases or trios as a weighting factor.

X chromosome QC
QC steps for X chromosome SNPs followed the same pipeline as for autosomal SNPs (Figure S1) with a few modifications. First, the SNP call rate threshold of 98% was calculated based on female samples only. Second, for resolution of strand-ambiguous SNPs, allele frequencies were likewise estimated based on female samples only. Third, prior to merging samples from each platform, samples with a call rate <95% on the X chromosome were removed from analysis. After dataset merging, 1915 SNPs were removed for having heterozygous genotypes in males. Finally, in the subpopulation-specific QC, a more conservative cutoff for SNPs in Hardy-Weinberg disequilibrium was used (HWE p<0.001 in female controls).
Of note, no pre-defined pseudo-autosomal SNPs were genotyped on the 610Quad and thus none were available for analysis. Similarly, since only 129 Y chromosome SNPs passed QC with a call rate >98% in males, all Y chromosome SNPs were removed from the analysis. All X-chromosome SNPs with p<1 x 10⁻³ are provided in Table S2.

ANALYSES

Subpopulation-specific association analysis
Following QC, each of the three cleaned datasets (EU, AJ, SA) was analyzed as a separate subpopulation in PLINK using logistic regression under an additive model, with subpopulation-specific MDS dimensions incorporated as covariates in each analysis (EU: 4 MDS dimensions; AJ: 1 MDS dimension; SA: 2 MDS dimensions). The remaining 400 trios with 467,978 SNPs were analyzed in PLINK using the Transmission Disequilibrium Test (TDT). Quantile-quantile plots of each case-control subpopulation-specific analysis and of the TDT analysis of the trio samples revealed no evidence of residual population stratification or significant systematic technical artifacts (Supplementary Figures S7a-c).

Meta-analysis
Meta-analysis was conducted using METAL, which combined the p-values using the number of cases in each subpopulation-specific stratum for weighting.⁶ Two meta-analyses were conducted: a case-control meta-analysis of the three European-derived populations (EU, AJ, SA) and a final meta-analysis of the three case-control populations and one trio dataset (EU, SA, AJ, Trio). Using the sign test with 3616 LD-pruned SNPs with p<0.01, there was evidence for increased consistent directionality (1907/3616=0.52; p=5.25 x 10⁻⁴ for 1-sided binomial test) between the trios and the combined case-controls. On further limiting the analysis to the 414 LD-pruned SNPs with p<0.001, we found no evidence that the directionality of effect was more consistent (205/414=0.49; p=0.60 for 1-sided binomial test), although this loss of statistical significance can likely be attributed to the decreased power provided by the smaller number of SNPs.
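
The one-sided binomial sign test reported above can be reproduced from the stated counts. The following is a minimal sketch that assumes a null probability of 0.5 for a consistent direction of effect; the function name `sign_test_p` is illustrative and is not taken from the original analysis code.

```python
import math


def sign_test_p(n_consistent, n_total):
    """Exact one-sided binomial p-value P(X >= n_consistent) for
    X ~ Binomial(n_total, 0.5), computed with integer arithmetic."""
    tail = sum(math.comb(n_total, k)
               for k in range(n_consistent, n_total + 1))
    return tail / 2 ** n_total


# Counts taken from the text above.
print(sign_test_p(1907, 3616))  # SNPs with p<0.01
print(sign_test_p(205, 414))    # SNPs with p<0.001
```

Under these assumptions the two calls should give values close to the reported p=5.25 x 10⁻⁴ and p=0.60, respectively.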

eQTL and mQTL enrichment tests
Previously generated expression quantitative trait loci (eQTL) data from lymphoblastoid cell lines (LCL),⁷ frontal lobes,⁸ parietal lobes⁷ and the cerebellum⁷ were used to annotate the 580 SNPs with the strongest evidence of association in the trio-case-control meta-analysis (p<0.001).⁷ Annotation of these SNPs was also conducted with data regarding methylation levels (methylation quantitative trait loci, mQTLs)⁷ within the cerebellum. Details regarding eQTL and mQTL data collection are provided in the accompanying TS GWAS paper.⁵

Gene and coding SNP enrichment tests
The top SNPs from the trio-case-control analyses with p<0.001 and with p<0.01 were compared to 1,000 random sets of the same size, conditioning on allele frequency, to yield an empirical distribution. An enrichment p-value was then calculated as the proportion of randomized sets in which the eQTL (or mQTL) count matched or exceeded the actual observed count in the list of top SNP associations, to test whether the SNPs with the strongest observed associations were enriched for eQTLs or mQTLs.⁷ Enrichment of missense polymorphisms and genic SNPs was also assessed, using a similar approach to that applied for eQTL enrichment. To do this, each polymorphic SNP in HapMap was assigned a function, following the dbSNP functional classification scheme, as previously described. SNPs were considered “genic” if they were located within a coding region, within an intron, or within 2 kb of upstream or downstream flanking sequence. Coding SNPs were assigned a function depending on how each allele altered the translated amino acid sequence. If either allele was nonsynonymous, the SNP was assigned a “missense,” “nonsense,” or “frameshift” annotation. To test for enrichment of genic SNPs and specifically for missense polymorphisms, a similar approach to that applied for eQTL enrichment was used.
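
The empirical enrichment p-value described above can be sketched as follows. How the random sets were conditioned on allele frequency is not specified here, so the sketch assumes matching within fixed-width MAF bins and sampling with replacement. The bin width, function names, and synthetic SNP identifiers are all hypothetical.

```python
import random
from collections import defaultdict


def maf_bin(maf, width=0.05):
    """Assign a SNP to a MAF bin (bin width is an illustrative assumption)."""
    return int(maf / width)


def empirical_enrichment_p(top_snps, all_snps, maf, is_qtl,
                           n_sets=1000, seed=0):
    """Return (observed QTL count, empirical p), where p is the proportion
    of MAF-matched random sets whose QTL count is >= the observed count."""
    rng = random.Random(seed)
    by_bin = defaultdict(list)
    for snp in all_snps:
        by_bin[maf_bin(maf[snp])].append(snp)
    observed = sum(is_qtl[snp] for snp in top_snps)
    n_as_extreme = 0
    for _ in range(n_sets):
        # One random SNP per top SNP, drawn from the same MAF bin
        # (sampling with replacement, as a simplification).
        draw = [rng.choice(by_bin[maf_bin(maf[snp])]) for snp in top_snps]
        if sum(is_qtl[snp] for snp in draw) >= observed:
            n_as_extreme += 1
    return observed, n_as_extreme / n_sets


# Synthetic example: 1000 hypothetical SNPs, 10% of which are QTLs.
all_snps = [f"snp{i}" for i in range(1000)]
maf = {s: 0.05 + 0.045 * (i % 10) for i, s in enumerate(all_snps)}
is_qtl = {s: i < 100 for i, s in enumerate(all_snps)}

enriched_top = all_snps[:20]    # every top SNP is a QTL
null_top = all_snps[100:120]    # no top SNP is a QTL
print(empirical_enrichment_p(enriched_top, all_snps, maf, is_qtl))
print(empirical_enrichment_p(null_top, all_snps, maf, is_qtl))
```

The same function could in principle be applied to genic or missense annotations by replacing the `is_qtl` lookup with the corresponding indicator.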
GWAS SNPs within genomic regions with previous suggestive evidence for linkage in OCD were examined, and their strength of association is summarized in Supplementary Table S3. Potential enrichment of top hits (at thresholds of p<0.001 and p<0.01) from the combined trio-case-control sample among SNPs in 22 previously identified candidate genes was examined using INRICH.⁴⁰ The Q-Q plot of candidate gene SNPs for the combined case-control group shows little inflation (λ=1.085, Supplementary Figure S8), suggesting no evidence for over-representation within these genes. While the Q-Q plot of the combined trio-case-control sample indicates small inflation (λ=1.168, Supplementary Figure S8), the follow-up enrichment test demonstrated no over-representation of top hits (p<0.001 and p<0.01) within previously identified candidate genes (p=0.15 and p=0.10, respectively). For the 22 OCD candidate genes examined, the lowest SNP p-values are reported in Supplementary Table S4. The strongest finding was observed for ADARB2,²² with a p-value of 1.6 x 10⁻⁴, which did not survive correction for multiple testing of candidate gene SNPs (corrected p=0.53). Potential enrichment of micro-RNA (miRNA) binding sites among LD-independent associated genomic intervals was also examined using a TargetScan⁵⁶ probability of conserved targeting cutoff of >0.9 and Entrez Genes hg v.18 (http://www.ncbi.nlm.nih.gov/gene) (Supplementary Table S5). Moreover, signals of enriched association with pre-defined gene pathways were also queried via INRICH, providing empirical and corrected p-values for target gene sets from the GWAS results (Supplementary Table S6).

SUPPLEMENTARY RESULTS

Case-control meta-analysis of European ancestry derived samples (EU, AJ, and SA)
Separate analyses of the MDS-identified EU, SA and AJ subpopulations were conducted to reduce genetic heterogeneity.
The case-control European-ancestry meta-analysis produced 580 loci with association p-values <1 x 10⁻³. No SNP showed genome-wide significant evidence for association. The results for SNPs with p-values <1 x 10⁻³ are provided, along with the complete annotation information, including eQTL data from all three tissues (LCL, cerebellum, and frontal cortex) and cerebellar mQTL data (Supplementary Table S2). LocusZoom plots of loci discussed in the main text from the case-control meta-analysis, including rs26728 (within EFNA5), rs4868342 (within HMP19) and rs297941 (5′ to FAIM2), are shown in Supplementary Figure S9.⁹

Trio samples (Family-based TDT results)
The strongest evidence for association was found for rs6131295 on 20p12, which reached the genome-wide significance threshold (p = 3.84 x 10⁻⁸). The cluster plot of rs6131295 shows no evidence of artifact, suggesting that the association signal at rs6131295 is unlikely to be due to genotyping artifacts (Supplementary Figure S10a).

Trio-case-control meta-analysis of all OCD samples (EU, AJ, SA, Trios)
The global meta-analysis of all subpopulations, consisting of 1465 OCD cases, 5557 controls and 400 trios, produced 584 loci with association p-values <10⁻³ (complete annotated list provided in Supplementary Table S2). The regional results plot of rs297941 for the global meta-analysis is shown in Supplementary Figure S9. The brain-wide expression patterns of genes represented among the most strongly associated GWAS SNPs were examined, in addition to their correlation with expression of other GWAS-implicated genes. A schematic illustrating these inter-correlations is found in Supplementary Figure S12.

SUPPLEMENTARY REFERENCES
1. Bierut LJ, Saccone NL, Rice JP, Goate A, Foroud T, Edenberg H et al. Defining alcohol-related phenotypes in humans. The Collaborative Study on the Genetics of Alcoholism. Alcohol Res Health 2002; 26(3): 208-213.
2. Bierut LJ, Madden PA, Breslau N, Johnson EO, Hatsukami D, Pomerleau OF et al. Novel genes identified in a high-density genome wide association study for nicotine dependence. Hum Mol Genet 2007; 16(1): 24-35.
3. Bierut LJ, Strickland JR, Thompson JR, Afful SE, Cottler LB. Drug use and dependence in cocaine dependent subjects, community-based individuals, and their siblings. Drug Alcohol Depend 2008; 95(1-2): 14-22.
4. Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D et al. Common variants conferring risk of schizophrenia. Nature 2009; 460(7256): 744-747.
5. Scharf JM, Mathews CA. Copy number variation in Tourette syndrome: another case of neurodevelopmental generalist genes? Neurology 2010; 74(20): 1564-1565.
6. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 2010; 26(17): 2190-2191.
7. Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, Cox NJ. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 2010; 6(4): e1000888.
8. Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, Lai SL et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet 2010; 6(5): e1000952.
9. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 2010; 26(18): 2336-2337.