Schizophrenia-associated HapICE haplotype is associated with increased NRG1 Type III expression and high nucleotide diversity
Shannon Weickert C, Tiwari Y, Schofield PR, Mowry BJ and Fullerton JM
Supplementary material
Supplementary methods
Human postmortem cohort and tissue extraction
The DLPFC was dissected from the coronal slab rostral to the corpus callosum, along the middle frontal gyrus, as previously described1-3. Unaffected controls were screened by telephone interviews of family members and/or police records for a history of medical and/or psychiatric problems, including alcohol abuse and illicit drug use. Any positive history of a psychiatric problem or excessive alcohol or drug use led to the exclusion of that case from the normal control group. Cases with an unclear psychiatric diagnosis, evidence of cocaine or phencyclidine (PCP) abuse by history and/or toxicology, cerebrovascular disease, autolysis, subdural hematoma, neuritic pathology or other pathological features were excluded from the cohort.
Detection of nucleotide variation in NRG1 putative regulatory regions
Sequence traces were visualized using PhredPhrap software4,5, and SNPs were identified by manual inspection of each chromatogram. Base pair positions reported herein refer to the March 2006 human genome reference assembly (NCBI36/hg18). We report all SNPs with a frequency as low as 0.7% (detected on one out of 148 chromosomes), which were compared to NCBI dbSNP (build 130) to determine their novelty. The predicted functionality of each SNP allele was assessed with TFSEARCH6, using the vertebrate transcription factor binding matrix with a minimum score of 85 constituting a hit. Nucleotide diversity was assessed using methods previously described7,8. In brief, nucleotide diversity (θ) takes into account the number of SNPs (K) identified in a genomic length (L) in a sample of (n) alleles.
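The quantities named above (K SNPs, genomic length L, n alleles) are those of Watterson's estimator, θ = K / (L · Σ 1/i, for i = 1 to n-1). The formula itself is not spelled out here, so the following is only a minimal sketch under the assumption that the cited methods use this form. The function name and the example values are hypothetical and are not taken from the supplementary tables.

```python
def watterson_theta(k_snps: int, length_bp: int, n_alleles: int) -> float:
    """Per-base Watterson estimator: K / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i.

    Assumed form of the nucleotide diversity statistic; not confirmed by the source.
    """
    if n_alleles < 2:
        raise ValueError("need at least two sampled alleles")
    if length_bp <= 0:
        raise ValueError("sequence length must be positive")
    # Harmonic correction for the number of sampled chromosomes.
    a_n = sum(1.0 / i for i in range(1, n_alleles))
    return k_snps / (a_n * length_bp)


# Hypothetical example: 11 SNPs found in 6,000 bp screened on 4 chromosomes.
theta = watterson_theta(k_snps=11, length_bp=6000, n_alleles=4)
print(f"theta = {theta * 1e4:.1f} x 10^-4")
```

Under this assumption, a study that sequenced 74 diploid individuals would set n_alleles to 148.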
Haplotype reconstruction was performed using the --hap-phase function in PLINK9, and LD structure was assessed using HAPLOVIEW10.
Australian control cohort
All participants provided written informed consent prior to participation in this study, which was approved by the Wolston Park Hospital Institutional Ethics Committee, Wacol, Brisbane. Participants were drawn from a 12,745 square kilometre area of southeast Queensland via advertisements placed in local newspapers and community venues. Inclusion criteria were: (i) a negative screen on the Diagnostic Interview for Psychosis11; (ii) Caucasian ethnicity; (iii) adequate English proficiency to complete the interview; and (iv) DNA availability at the time of testing. The sample comprised 128 normal individuals: 85 men and 43 women (mean age=37.82±10.86 and 45.91±12.69 respectively).
Quantification of NRG1 isoform expression
Three reverse transcriptase reactions (a total of 9μg RNA from the DLPFC) were generated from each individual using random hexamers and the SuperScript First-Strand Synthesis System (Invitrogen, Carlsbad, CA, USA), and were pooled before transcript quantification. Transcripts were quantified on an ABI Prism 7900 sequence detection system with Sequence Detector Software version 2.0 (Applied Biosystems, Foster City, CA) by a relative standard curve method, using serial dilutions of pooled cDNA derived from RNA obtained from brain tissue of 6 subjects. In each experiment, the r² value of the standard curve was more than 0.99 and no-template control assays resulted in no detectable signal. Each 20μl PCR reaction contained 6μl of cDNA, 1μl of 20X primer/probe mixture and 10μl of RT-PCR Mastermix Plus (Eurogentec, Seraing, Belgium) containing Hot Goldstar DNA Polymerase, dNTPs with dUTP, uracil-N-glycosylase, passive reference, and optimized buffer components, and was run with standard PCR cycling conditions.
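The relative standard curve method mentioned above can be illustrated with a short sketch: the threshold cycle (Ct) of each dilution is regressed on the log10 of its relative quantity, and unknown samples are read off the fitted line. This is not the Sequence Detector Software's own algorithm; the function names, the dilution series and the Ct values below are all hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept.

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def relative_quantity(ct, slope, intercept):
    """Invert the fitted line to get the relative quantity for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold serial dilution of pooled cDNA and its measured Ct values.
dilutions = [1.0, 0.1, 0.01, 0.001]
cts = [20.0, 23.32, 26.64, 29.96]
slope, intercept, r2 = fit_standard_curve(dilutions, cts)

# Mirror the acceptance criterion stated in the text (r-squared above 0.99).
if r2 <= 0.99:
    raise RuntimeError("standard curve rejected")
print(relative_quantity(24.98, slope, intercept))
```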
All samples were measured in triplicate, and a replicate was excluded if the coefficient of variation of the three measures was greater than 40%. The replicate to be excluded was determined by Grubbs' outlier test. Individual samples were excluded as population outliers if the normalized quantity was more than two standard deviations above or below their diagnostic group mean.
Supplementary results
Type IV NRG1 mRNA expression
Type IV isoform expression in the DLPFC was at the limit of detection and the data were unreliable, so caution must be used in any interpretation. Type IV mRNA failed to amplify in 19 individuals, and most others required exclusion of at least one replicate due to measurement error. With data from 55 individuals, we found that females had higher Type IV mRNA by t-test (t=-3.654(1,53), p=0.0006). On performing a 2-way ANCOVA with diagnosis and gender for NRG1 Type IV mRNA (age, RIN and brain pH as covariates), we did not detect a main effect of diagnosis (F=0.459(1,49), p=0.50), but did find a main effect of gender (F=11.159(1,51), p=0.0016), suggesting that females have more Type IV mRNA expression than males, both among schizophrenia cases (p=0.017) and among controls (p=0.007). With respect to clinical factors, duration of illness was positively correlated with NRG1 Type IV mRNA (r=0.507, p=0.008), suggesting that an up-regulation of Type IV mRNA may occur with disease progression.
Supplementary Figures
Supplementary Figure 1: Position of novel variants with respect to putative functional regions in promoter and intronic regions. For each region interrogated, the position of each novel variant is indicated by a red arrow, with SNPs which were detected only in individuals with schizophrenia enclosed in a red box. The number of heterozygous individuals (hets) detected with each variant is shown, and any change in predicted transcription factor binding is indicated.
The predicted regulatory potential track (light blue) and 28-way sequence conservation track (dark blue) from the UCSC database are shown, along with the known SNPs described in dbSNP130.
Supplementary Figure 2: An increased novel variant load in schizophrenia cases (dark grey bars) compared to controls (light grey bars) (Fisher's exact test: χ²=7.815; p=0.05; df=3). One control individual with 3 novel variants is represented in the ≥3 category, compared to 7 patients with schizophrenia with 3, 5 and 6 novel variants (n=3, 3 and 1 respectively).
Supplementary Figure 3: Relationship between the five HapICE SNPs and rs7014762 in the promoter IV/II region of NRG1. A) SNP rs7014762 is in high LD (D'=1.0, r²=0.241-0.913) with four of the five HapICE SNPs. The alleles represented in the original schizophrenia-associated HapICE haplotype are indicated after the underscore at the end of each SNP name. The block represented was determined via the solid spine of LD method in HAPLOVIEW10. B) Seven haplotypes were identified using the four HapICE SNPs and rs7014762 as per the haplotype block shown in panel A. The most common haplotype, indicated with the red box, contains each of the risk alleles from the four HapICE SNPs, plus the major allele of rs7014762 (allele frequencies for rs7014762 were: A=0.2838; T=0.7162). Note that the designation of A or T as the minor or major allele of rs7014762 depends on the strand on which the genotyping assay was designed, so caution must be used when comparing risk alleles across studies.
Supplementary Figure 4: Loss of hemispheric asymmetry of EGFβ isoform mRNA expression in schizophrenia. Mean (±standard error) EGFβ expression is shown for controls (light grey bars) compared to schizophrenia patients (dark grey bars) for DLPFC from the left and right hemispheres. For controls, there were 14 samples from the left and 23 from the right hemisphere.
For schizophrenia cases, there were 19 samples from the left and 17 from the right hemisphere. Post-hoc LSD comparisons of group means between control left and control right hemispheres (mean: 63.47±5.36 vs 51.56±4.12 respectively) revealed a significant hemispheric asymmetry (p=0.019). This asymmetry was absent in schizophrenia cases (left: 52.69±3.06 vs right: 50.20±4.11; p=0.61). Expression in the left hemisphere was significantly lower in brains of schizophrenia patients than in controls (mean 52.69±3.06 vs 63.47±5.36 respectively; p=0.040).
Supplementary Figure 5: Protein binding assay for novel SNP 1395_1, comparing the binding efficiency of the novel variant with that of the wild-type allele. For each lane, a 32P-labelled complementary primer heteroduplex (novel A allele probe: agctTAAAAGTTAAGTTTCATTAT; wild-type T allele probe: agctTAAAAGTTATGTTTCATTAT) was incubated with nuclear extracts from cultured HEK293 cells, or from dorsolateral prefrontal cortex tissue pooled from individual human brains from the Stanley Medical Research Institute cohort. A probe with an NF-κB binding site (agctGGGTCTGTGAATTCCCGGGGGT) was used as a positive control with HEK293 nuclear protein. For each SNP variant, a hot reaction (H) containing only 32P-labelled probe was loaded adjacent to a cold competitor reaction (C) containing an approximately 10-fold excess of unlabelled "cold" probe. Arrows indicate specific protein binding. With both protein extracts, the novel allele demonstrated reduced protein binding efficiency compared to the wild-type allele.
Supplementary tables
Diagnostic group | Age at death (years) | PMI (hours) | pH | RIN | Gender | Hemisphere | Age of onset (years) | Illness duration (years) | Daily CPZ (grams)
SCZ (n=37) | 51.32 ± 14.13 | 28.45 ± 13.77 | 6.61 ± 0.30 | 7.27 ± 0.58 | 24M, 13F | 23R, 14L | 23.70 ± 13.82 | 27.62 ± 13.82 | 691.63 ± 502.20
CON (n=37) | 51.13 ± 14.62 | 24.79 ± 10.97 | 6.65 ± 0.30 | 7.30 ± 0.57 | 30M, 7F | 17R, 20L | N/A | N/A | N/A
Supplementary Table 1: Summary of demographic variables for post-mortem brain cohort. Average values for schizophrenia patients (SCZ) and controls (CON) are given ± standard deviations. In the SCZ group, seven individuals were diagnosed with schizoaffective disorder. Abbreviations: post-mortem interval (PMI), acidity of DLPFC brain tissue (pH) and RNA integrity number (RIN). The numbers of individuals in the two groups are given for gender (M=male, F=female) and hemisphere (R=right, L=left). Average daily chlorpromazine equivalent neuroleptic dose (CPZ) ± standard deviations are given. Age of onset, illness duration and daily CPZ values are not applicable (N/A) in the control group.
Gene probe name NRG1 Type I(Ig2) NRG1 Type II NRG1 NRG1 NRG1 Type IV Type III panNRG1 NRG1 NRG1 Type I EGFα NRG1 ACTB EGFβ - isoforms detected Ig2 and s1 domains (excluding GGF-2, HRG-β1d, HRG-β3b, HRG-γ3, SMDF) GGF-2, HRGβ-1d, HRG-β3b, HRG-γ3 HRG-β1b, HRG-β1c & HRGβ1d SMDF all isoforms (excluding SMDF, ndf43c) HRG-β3b and HRG-γ3 (excluding HRG-β1b, HRGβ1c) HRG-α, ndf43 and ndf43c SMDF, GGF, GGF2, all HRG-β isoforms beta-actin inventoried assay forward reverse probe (FAM-MGB) - GCCAATATCACC ATCGTGGAA CCTTCAGTTGAG GCTGGCATA CAAACGAGATC ATCACTG - GAATCAAACGCT ACATCTACATCCA CCTTCTCCGCAC ATTTTACAAGA CACTGGGACAA GCC GCTCCGGCAGCA GAACCTGCAGCC GATTCCT ACCACAGCCTTG CCT - - Hs00247620_m1 - - - Hs01108479_m1 Hs01103794_m1 - - - Hs00247624_m1 Hs99999903_m1 - - - GCAT Hs01103792_m1 - glyceraldehyde-3-phosphate GAPDH dehydrogenase Hs99999905_m1 UBC ubiquitin C Hs00824723_m1 TBP TATA box binding protein Hs00427620_m1
Supplementary Table 2: TaqMan probes for quantification and normalization of NRG1 isoform expression. Custom designed primer and probe combinations were used to specifically target particular NRG1 isoforms previously investigated12,13, while inventoried assays (Applied Biosystems, Foster City, CA) were used for all other isoforms. The isoforms detected by each probe, as described14,15, are indicated. The geometric mean of four endogenous control genes was used for transcript normalization.
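Normalization to the geometric mean of the four endogenous control genes can be sketched as follows. The gene symbols (ACTB, GAPDH, UBC, TBP) are those listed in the table; the function names and all quantities are hypothetical, and the sketch assumes each quantity has already been read from its own standard curve.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_expression(target_quantity, control_quantities):
    """Divide a target transcript quantity by the geometric mean of the controls."""
    return target_quantity / geometric_mean(control_quantities.values())


# Hypothetical relative quantities for one sample.
controls = {"ACTB": 1.0, "GAPDH": 2.0, "UBC": 4.0, "TBP": 8.0}
print(normalize_expression(5.0, controls))
```

A geometric rather than arithmetic mean keeps a single highly expressed control gene from dominating the normalization factor.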
Region name | Region location, bp (NCBI Build 36) | DNA length screened, bp | # dbSNPs annotated | Observed SNPs (ALL): dbSNPs | Observed SNPs (ALL): novel SNPs | Observed SNPs (SZ): dbSNPs | Observed SNPs (SZ): novel SNPs | Observed SNPs (CON): dbSNPs | Observed SNPs (CON): novel SNPs | Nucleotide diversity (θ×10^-4)
upstream HapICE | 31592860-31593868 | 1,008 | 5 | 2 | 3 | 3 | 3 (1) | 3 | 2 (0) | 17.5
promoter IV/II | 31613334-31616575 | 3,241 | 11 | 9 | 5 | 9 | 5 (2) | 7 | 3 (0) | 11.1
478B14-848 | 31708115-31708936 | 821 | 4 | 3 | 4 | 3 | 3 (3) | 3 | 1 (1) | 26.8
420M9-1395 | 31784622-31785559 | 937 | 7 | 7 | 3 | 3 | 3 (2) | 3 | 1 (0) | 30.1
promoter I | 32522326-32525340 | 3,014 | 33 | 15 | 1 | 13 | 1 (1) | 13 | 0 (0) | 13.2
promoter III | 32620730-32624137 | 3,407 | 11 | 6 | 10 | 6 | 6 (2) | 6 | 7 (4) | 11.7
total screened | | 12,428 | 71 | 42 | 26 | 37 | 21 (11) | 35 | 14 (5) | 10.0
Supplementary Table 3: Summary of the nucleotide diversity in upstream regulatory and intronic regions of NRG1. The average minor allele frequency (MAF) of novel SNP variations was 0.044±0.038. The total number of novel SNPs observed in each group is shown, with the number of SNPs unique to that group shown in parentheses. Nucleotide diversity (θ) was assessed using the methods previously described7,8.
mRNA | n | age: r (p) | pH: r (p) | RIN: r (p) | PMI: r (p)
pan-NRG1 | 74 | -0.23 (<0.05) | 0.43 (<0.001) | 0.29 (<0.05) | 0.41 (<0.001)
EGFβ | 73 | -0.21 (ns) | 0.55 (<0.001) | 0.37 (<0.01) | 0.06 (ns)
Type I | 73 | 0.08 (ns) | 0.21 (ns) | 0.37 (<0.01) | -0.11 (ns)
Type I(Ig2) | 72 | -0.12 (ns) | 0.01 (ns) | -0.04 (ns) | -0.26 (<0.05)
Type II | 70 | 0.10 (ns) | -0.33 (<0.01) | -0.25 (<0.05) | 0.05 (ns)
Type III | 74 | -0.10 (ns) | 0.32 (<0.01) | 0.39 (<0.001) | 0.03 (ns)
Type IV | 55 | 0.28 (<0.05) | -0.25 (ns) | -0.37 (<0.01) | -0.01 (ns)
Supplementary Table 4: Demographic factors and NRG1 mRNA expression. Significant correlations with brain pH and RIN values at the p<0.05 level were observed for all isoforms excluding Type I, Type I(Ig2) and Type IV, which were significant for RIN only, PMI only, or age and RIN respectively. Pan-NRG1 showed significant correlations at the p<0.05 level for all continuous demographic factors.
P values are reported as <0.001, <0.01, <0.05, or not significant (ns).
NRG1 gene region | BP position (build 36) | SNP | dbSNP name | minor allele | major allele | MAF (ALL) | MAF (SCZ) | MAF (CON) | TRANSFAC binding change
upstream prom IV/II | 31593076 | JF221XXX_4 | ss472054944 | C | T | 0.007 | 0.014 | 0.000 | removes TATA
upstream prom IV/II | 31593103 | JF221XXX_1 | rs71523425* | C | T | 0.061 | 0.068 | 0.054 | removes Hb, creates cap
upstream prom IV/II | 31593216 | JF221XXX_2 | rs74506441* | C | T | 0.068 | 0.095 | 0.041 | creates CDP-CR
upstream prom IV/II | 31613752 | JF4.3_1 | ss472054949 | C | T | 0.007 | 0.014 | 0.000 | removes CdxA, reduces Oct-1
upstream prom IV/II | 31614472 | JF4.2_4 | rs77626248* | C | A | 0.108 | 0.108 | 0.041 | creates c-ETS, GATA
upstream prom IV/II | 31614502 | YT4.2_1 | rs73584584* | C | T | 0.034 | 0.054 | 0.014 | reduces Sox-5, removes SRY
upstream prom IV/II | 31615552 | YT4.2_2 | ss472054956 | C | T | 0.007 | 0.014 | 0.000 | increases TATA
upstream prom IV/II | 31615555 | YT4.2_3 | rs76063839* | A | G | 0.034 | 0.027 | 0.041 | creates SRY, improves HNF-3
intron 1 | 31708375 | JF848_2 | ss472054950 | C | T | 0.029 | 0.042 | 0.000 | removes CdxA
intron 1 | 31708615 | JF848_3 | ss472054951 | A | G | 0.028 | 0.056 | 0.000 | removes SRY, creates CdxA
intron 1 | 31708702 | YT848_1 | rs117129618* | G | A | 0.030 | 0.054 | 0.000 | no change
intron 1 | 31708937 | NOR_848_4 | ss472054952 | C | T | 0.007 | 0.000 | 0.027 | increases AP-1
intron 1 | 31784714 | JF1395_2 | ss472054942 | A | G | 0.008 | 0.014 | 0.000 | increases Sox-5
intron 1 | 31784774 | JF1395_3 | ss472054943 | A | G | 0.028 | 0.054 | 0.000 | reduces SRY, removes c-Myb
intron 1 | 31785235 | YT1395_1 | rs117469567* | A | T | 0.027 | 0.054 | 0.000 | reduces SRY, removes CdxA
upstream prom I | 32524132 | JF1.2_1 | ss472054941 | T | G | 0.014 | 0.027 | 0.000 | creates SRY
upstream prom III | 32620979 | NOR3.3_1 | ss472054954 | A | G | 0.007 | 0.000 | 0.014 | introduces GATA-1, removes AML-1a
upstream prom III | 32621144 | JF3.3_5 | ss472054948 | del | AA | 0.014 | 0.014 | 0.014 | removes HFH-2
upstream prom III | 32621357 | JF3.3_6 | rs111526496* | A | G | 0.020 | 0.014 | 0.027 | removes AML-1a
upstream prom III | 32621907 | JF3.3_7 | rs113060920* | T | C | 0.016 | 0.029 | 0.000 | creates GATA, reduces deltaE
upstream prom III | 32622150 | NOR3.2_4 | rs117532293* | T | C | 0.007 | 0.000 | 0.014 | creates S8
upstream prom III | 32622768 | NOR3.2_2 | rs117347889* | G | A | 0.007 | 0.000 | 0.014 | no change
upstream prom III | 32622867 | NOR3.2_3 | ss472054953 | A | G | 0.007 | 0.000 | 0.014 | no change
upstream prom III | 32622943 | JF3.2_2 | ss472054947 | del | CA | 0.020 | 0.027 | 0.014 | removes SRY, HFH-2
upstream prom III | 32623136 | JF3.1_3 | ss472054946 | del | TGA | 0.014 | 0.014 | 0.000 | introduces C-/EBP
upstream prom III | 32624045 | JF3.1_2 | ss472054945 | C | T | 0.131 | 0.014 | 0.000 | no change
Supplementary Table 5: Summary of novel variants (dbSNP130) identified in NRG1 re-sequenced regions. The gene region and base pair location (NCBI build 36) of each SNP identified are given, along with the minor and major alleles of the variant. The NCBI dbSNP submission number (ss) for each variant is shown; SNPs subsequently identified through the 1000 Genomes project (dbSNP132; August 2011 release) are listed by rs number and indicated with an asterisk. The minor allele frequency (MAF) in all 74 individuals (ALL), the 37 cases with schizophrenia (SCZ) and the 37 controls (CON) is given. The predicted transcription factor binding changes for each SNP (TRANSFAC) are listed.
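As a rough guide to reading the MAF columns, the sketch below computes a minor allele frequency from genotype counts. It assumes diploid autosomal genotypes with no missing calls, which the source does not state; the function name is hypothetical.

```python
def minor_allele_frequency(n_het: int, n_hom_minor: int, n_individuals: int) -> float:
    """Minor allele count divided by the number of chromosomes (2 per individual)."""
    if n_individuals <= 0:
        raise ValueError("need at least one genotyped individual")
    if n_het < 0 or n_hom_minor < 0 or n_het + n_hom_minor > n_individuals:
        raise ValueError("genotype counts are inconsistent with the sample size")
    return (n_het + 2 * n_hom_minor) / (2 * n_individuals)


# One heterozygote among 74 individuals is one minor allele on 148 chromosomes.
print(round(minor_allele_frequency(1, 0, 74), 3))
# The same single heterozygote within a group of 37 individuals.
print(round(minor_allele_frequency(1, 0, 37), 3))
```

Under these assumptions, a variant seen once in the full sample corresponds to the lowest reported frequency of about 0.7%.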
SNP name SNP sequence TRANSFAC predicted change (minor allele) 420M9-1395 ATTTCCTTCTTTTTTAAGGCTCAAGAGTATTCGC[GT n]ATCACATTTTCTTTATTCATCTGTTGATG no change 478B14-848 AAGTTTTAAAAGTAGGATACAAAATTATGTCATA[CAn]TTTTACAAAAACCAAAATATATGTATG TTTACAGTGAAATACTCTTGTKTTGTGGTCGGGAAGTGGTGAGTT no change JF1.2_1 (ss472054941) JF1395_2 (ss472054942) JF1395_3 (ss472054943) rs71523425 creates SRY TTTAAGTCTGCAATACAGTATTGTTGACCATARGAACAATATTGTATAGTGGATTTCTAGCATTT increases Sox-5 TTCTTAACTGTAATTTTGTGCCCRTTGTTTAGTAACTCTAAATTTTCCTC reduces SRY, removes c-Myb ATAAGCTACTCAATTTAACTTTTYATTTTTGAATTCAAGCTTTTTT increases CdxA JF221XXX_1 (rs71523425*) ATAAGCTACTCAATTTAACTTTTYATTTTTGAATTCAAGCTTTTTT removes Hb, creates cap JF221XXX_2 (rs74506441*) JF221XXX_4 (ss472054944) JF3.1_2 (ss472054945) JF3.1_3 (ss472054946) JF3.2_2 (ss472054947) JF3.3_5 (ss472054948) JF3.3_6 (rs111526496*) JF3.3_7 (rs113060920*) JF4.2_4 CTCTGTATAACATTGGCCATTAATCYACATCAATATATGGTGAAGATATGTAA creates CDP-CR TCTTTTAAAATTCTATGGACYATATAAGCTACTCAATTTTAAGTTTTCA removes TATA AGACTGAAGCAGAGAAGAGCYGCAGAGGAAGAAAGTGAATGAGC no change CTGCAGTGTGGAGTCACCA[TGA/DEL]AAGGCTAACTCAAAAATGAAGTGGTA introduces C-/EBP AATGCTGACTGTTTTTTTCTTTTAAAA[CA/DEL]AACAGTCATTAAAACACTAGAAGAAATGCAC removes SRY, HFH-2 TGGGCCAATGAAATAAAAAAAA[AA/DEL]TTTAAGATAAATATGGACTGTATGGGATTAGTGAA removes HFH-2 GGGGTGTTCCTGGGCTTTACTGGRGTGGAGCTTAAAGGGTTAAAATGATATATCCTT removes AML-1a GAGCAATTATTCACCTTAYCTCAACCATTAAGAGCAAACATATTCAGCAG creates GATA, reduces deltaE TTCATGGGGCAGACGGATCTCAMAGGATGCCTAAGTTCAGCAGTGGATTGTTTGC creates c-ETS, GATA (rs77626248*) JF4.3_1 (ss472054949) JF848_2 (ss472054950) JF848_3 (ss472054951) NOR_3.2_4 (rs117532293*) NOR_848_4 (ss472054952) NOR3.2_2 (rs117347889*) NOR3.2_3 (ss472054953) NOR3.3_1 (ss472054954) rs10090954 ATTCATACCTTTTCTTAAGCATATGTTAATCAYATTAGAAATGCCATTCCCTTCTCGTGCAAAAG rs10096965 TTACTTTATTTCATTGTAGTAAGAAAACTGAACRTGCTACTACCTTCAATAAATTTTTAAGTCTG no change rs11785744 CACATGTCCAACTGAAGAGGAATTAGGGTTTAAYGATTTAAGAAGATATCATGAAACTATTAA introduces CdxA, 
Oct-1 rs11989919 CCCAATTTTGACCTAAACCAAACTATATACTCARTACACCAGCATTTCATC no change rs11998176 AATATCCCGGGAGAGGATGGATTCTTGTTTTAGWCATAGCTCTTTAAATTTGGCAGGACATGTG no change rs12707707 TCCTGACCACAGAGATGAATAATTTAAGGACAAYATCAAATTCTTGATAAATCTCATAAATGTT reduces Oct-1 rs13253310 GAAGTCCACATGTCCAACTGAAGAGGAATTAGGSTTTAATGATTTAAGAAGATATCATGAAACT increases Oct-1 rs13256117 TCTGTCTTCATGAAAGAGAYGGAGAGTTCCCATTTCTACTTA removes GATA-1 & -3 rs13256229 ATTTGTATCCACCCCCATCCCCAATCTACTGAAYCAGAAAATCTGGCGGCAGGGCCAGCAATCT removes AP-1 rs13263989 AATATTATTCTTATGTCAAGTGTGGAAAATACYAACCGAAGTCCACATGTCCAACTGAAGAGGA introduces CdxA rs13282705 TTTTTAAGCGATAAAGAATAAAGCTCTTTTCATYTTTTAACTGGAATTATTTTTTAGAAAATATT introduces CdxA rs13362886 AATTGCCATTTCATCATCTTTCCTTAAAGTCCCYTCAATATTTATCATTGCAATTTTTGTCCTGTC removes MZF1 rs17603786 TTAAGATAAAACGAGTTTAACAGATAATTTAGYCCATTCACATTTGTTGTAATTCATGATATTTG introduces S8 removes CdxA, reduces Oct-1 ATGTCCAACTGAAGAGGAATTAGGGTTTAACGATTYAAGAAGATATCATGAAACTATTAAA removes CdxA AAATTAATACTTTTGGTCATACAGGATGTCTCTRTTTTTTTGAAATACATTTTTCAA removes SRY, creates CdxA CAACTGGGTGCTTTTTGAAGAAATYATAAATTACCCTAGTTTAGCATAAACACC creates S8 CAAGACATCTAATATGAGTCATYTTGACCCAATATTTTCCTTGT increases AP-1 ATAGATTTAGAGAGAGTTTTACAGACTCCTRTTGACATAAGTGAACAAAATGGTTCCTTGGAA no change TACAGGTTCAGATGCATATTGTGTGCAGTGATRTGCAGCACAGTGCTTGGGGAAGCCTGTGGTC ATATTCTGGTTCTCATAGTCTCTCCTTGAAGTGRTATGTAATCAATAATATAATCAAATGCACCC GTTGCAGTTTTTAAAAAAGTAATCTTTGTTTAMATTATCTTAAGTTACTTGATTTAAAAAAGTTA no change introduces GATA-1, removes AML-1a introduces GATAm XFD, reduces CdxA rs17722883 CCCAGTTGAAATATACGGCACTGAATTCCCYAATTTTGACTAAACCAAACTATATACTCATACA introduces MZF1, reduces Ik-2 rs2466044 TAAAAAAAAAATTTAAGATAAATATGGACTGTRTGGGATTAGTGAAGATCAGAAATAATGTAT rs28401439 TGAGGTCACTGAATATATTTTTCACTATAAAAKAACATGAGAGAAAATATTTACCTTGAAATGC increases Ik-2 introduces Cdxa, reduces TATA, SRY rs28476555 TAAAGCTTCTATGACATACTTTCAAGAAACTGYTAGAGGCAACACGTAGAATCCCAGAGTAAA introduces v-Myb rs33978908 
AGGAAGAATATTTTTGCTTTAAAAAAAAA[A/DEL]CCCAATTGTATAATTTAGAAATGATGACA no change rs34150028 TTTTTTAAGGCTCAAGAGTATTCG[CG/DEL]TGTGTGTGTGTGTGTGTGTGTGTGTGTATCACATT no change rs34178679 AACCTAGCATCTTTAAGGTTCRCTTAGCCCTTCCTGTGCACCTG introduces Nkx-2 rs34445647 CTCTTTAGAATTCTCCACCAGAGGGARGACAAGGGAAGGAGTAGGTTTCACGCGCAG rs34595725 GAAGTAGGTGTCAAGTTAC[C/DEL]TAAGATGTCCAAGAGACAGCTGAT no change removes VBP, CRE-BP, C/EBP rs36213229 GGAGCGGGCAGCGAGAGCCTCGGGTCTCCKCCTGGGTTCCCGGGTCTCCGGGGCGCTGGCC no change rs36213230 GGGGGTTCCCGGCAGCCGCGCCGCCACCCCYCGCCCGGCCAGCGCGGGAGGAAAAGGGGCT no change rs36213231 GCGCCCGGGAGCGCCGAGCCCAGGCTCCTCCYGGTGGCGTGTCCGCGCCTCGGGGTGGGGGT no change rs3802158 GATCTCCAGTTCTGTGTCTTTATTCTACYCCTCCCTGCCTGCTTTTCCACTCGGCAG no change rs3802159 GTAGACAATTCTGGATCCTTCCGTGGTGCCCSTACCCTGGTCTTTAACTTTTGTCCTTTGCAGGG no change rs3802160 GAGTATTTCACTGTAAATTAAGAGTCTAAGTTARCCACAGCTGTGGTATAACTCTGAGCAATGC rs3802161 AAAACCAAGAGCAAGTCACTTTTAAAGTGAMGCAATAGATTTTGAATATGGATTGTTCCAACTC removes C/EBP introduces AP-1, HLF, CrEB, reduces C/EBP rs3808368 TAGGGCACAATCTCCACTACTTTGAGGTATGTTWCAGCTTTAAACGGCAGGAGATAAGAATATT removes SRY rs4129812 TTCAGAAACAATATCAGCAGGTGTTTATGCTRAACTAGGGTAATTTATGATTTCTTCAAAAAGC rs4281084 TAAATGAACCAACAGGTCACCAAATGTTGAAGTRGTTTGTCATATAGTGACAGATAACTGATAC introduces SRY introduces Nkx-2, AML-1a, removes SRY rs4400337 ATTTCTTCAAAAAGCACCCAGTTGAAATATAYGGCACTGAATTCCCCAATTTTGACCTAAACCA reduces CdxA rs4433107 TTACAACATTTCCATAGAAATGGATTTTGAGCTTYTTTTTTTTAATTGGGGAAAATCTTATTT no change rs4531002 GAAGGCAGAAAGGCAACTTCTGGGTCCTAGTCYCAAGGGTAGAACTAATGGAGAATTCTTTTAT no change rs4623366 AAAATTCTATGGACTATAYAAGCTACTCAATTTTAAGT increases TATA rs55898258 ACATTTATTGAGTACTTAATATTTGACCCAAAYTGGATCAAAACTGGGGAATATAGAAACTGTC removes RORa1p rs57147288 TGTAAGCAGAATGTACARTGTATAAGACATGCATATAT no change rs57205530 CCCTGCTTGTATCTCTGCTCTTTGGCATTGCAAYTTTTCAGGTCCTTATGTGAAGAGGTAGAGTC rs62500193 GCAAAAATATTCTTCCTCTTTTCTCCATCCMTTGTTCTGGTCAGTTCCA no change introduces GATA, removes SRY, Sox-5 rs62500194 
TTGTTCTGGTCAGTTCCARGGTTTTTTACAAATGCAAAAGAAATTCATTTGC no change rs7014762 AAGCGCTCCATCAGGGTATGAGTAACAGGGAWCTCCCCTTGCCAAGACACACAGGGAGTGTGA removes NF-kap, c-Rel rs7350144 TTTCCTACAAACATGCATGTTTTATCCAAARGAAATTCTGACCTCTAACCCCATTCACACTTTCC introduces c-Ets, removes SRY rs7812451 AGACAGCTGATGGGTTATGARTTAAATTTTGGGTTCTGCTTATCATT removes CdxA, Tst-1, Pbx-1 rs7817936 AAAGTTAAGTCTAGCCAAGGAAAAATGTAGTGSCACACGATTGCTTTTCTCTTACGC introduces AML-1a rs7817942 AGCCAAGGAAAAATGTAGTGGCACACGATTGCKTTTCTCTTACGCTGTCATTTAATGTGAGATC rs7823498 TTCCTGTAGGAATCCTGCTTTTAYGTTTTATCTTAAAGCCACCATAGTATCTGTAAT reduces GATA introduces CdxA, SRY, GATA-X rs7825588 CCAGCCTGCAGCTCTAGAGTGTGGGTAGAGAGCRGGGAGTGGGGGTTGGGAGAGGGGG increases p300, removes MZF1 rs73235619# TGTATAAGACATGCATATATCAATRTAAGGTAGTAATGTTTATTTTAAA rs35753505# GAGATATATGATATTTGGYAAAATAAAGATACATGGCTTCCA no change increases C/EBPb & a, reduces C/EBP rs62510682# GAAATGAAATATGTGTGCAAACAGTTCTTAKTACTGAGCTGTTTAAAGAAGGCCTACCTTTGCA removes CdxA, Oct-1 rs6994992# GCTAGAAGCACCATGCAGGGTTCAAGTGAAYGTATACTGGAGGCCAGACCTGCCCAACTATGC no change SNP8NRG433E 1006# YT1395_1 GGGCGGCGGCCGGCAACGAGGCGGCTCCCGCGRGGGCCTCGGTGTGCTACTCGTCCCCGCCCA ATTTTTTTGCTATTCTTTCATGATTAAAAGTTAWGTTTCATTATTATCAGGTTGTATATTTACATA reduces SRY, removes CdxA YT4.2_1 CAAAGGATGCCTAAGTTCAGCAGTGGATTGTTYGCAGAAATGGCCTAATTCTTCCCCTGCTTGT reduces Sox-5, removes SRY YT4.2_2 TCACTGAATATATTTTTCACTATAAAAGAACAYGAGAGAAAATATTTACCTTGAAATGCTAAAA increases TATA YT4.2_3 CTGAATATATTTTTCACTATAAAAGAACATGARAGAAAATATTTACCTTGAAATGCTAAAAATG creates SRY, improves HNF-3 YT848_1 CAATAGTGTTATTTGAAAAATATTTAAACAGARCATAACTCAGTTAATATATATGTACTAAATA no change no change Supplementary Table 6: Flanking sequences of 68 nucleotide variants identified in resequencing study. For each SNP observed, bioinformatic predictions of the effect of the minor allele on transcription factor binding is presented. 
SNPs represented in the HapICE risk haplotype were SNP8NRG221132 (=rs73235619), SNP8NRG221533 (=rs35753505), SNP8NRG241930 (=rs62510682), SNP8NRG243177 (=rs6994992) and SNP8NRG433E1006 (not annotated in dbSNP130), and are annotated with a hash (#). Novel SNPs with respect to the dbSNP130 release are given with their NCBI submission numbers (ss) in parentheses. Novel SNPs which have subsequently been discovered in the 1000 Genomes project (dbSNP132) are indicated with an asterisk. Novel SNPs which were exclusively observed in cases are in bold text.
Supplementary References
1. Fung SJ, Webster MJ, Sivagnanasundaram S, Duncan C, Elashoff M, Weickert CS. Expression of interneuron markers in the dorsolateral prefrontal cortex of the developing human and in schizophrenia. Am J Psychiatry 2010; 167(12): 1479-1488.
2. Weickert CS, Sheedy D, Rothmond DA, Dedova I, Fung S, Garrick T et al. Selection of reference gene expression in a schizophrenia brain cohort. Aust N Z J Psychiatry 2010; 44(1): 59-70.
3. Fung SJ, Sivagnanasundaram S, Weickert CS. Lack of change in markers of presynaptic terminal abundance alongside subtle reductions in markers of presynaptic terminal plasticity in prefrontal cortex of schizophrenia patients. Biol Psychiatry 2011; 69(1): 71-79.
4. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8(3): 186-194.
5. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998; 8(3): 175-185.
6. Heinemeyer T, Wingender E, Reuter I, Hermjakob H, Kel AE, Kel OV et al. Databases on transcriptional regulation: TRANSFAC, TRRD and COMPEL. Nucleic Acids Res 1998; 26(1): 362-367.
7. Halushka MK, Fan JB, Bentley K, Hsie L, Shen N, Weder A et al. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat Genet 1999; 22(3): 239-247.
8. Licinio J, Dong C, Wong ML. Novel sequence variations in the brain-derived neurotrophic factor gene and association with major depression and antidepressant treatment response. Arch Gen Psychiatry 2009; 66(5): 488-497.
9. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81(3): 559-575.
10. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21(2): 263-265.
11. Castle DJ, Jablensky A, McGrath JJ, Carr V, Morgan V, Waterreus A et al. The diagnostic interview for psychoses (DIP): development, reliability and applications. Psychol Med 2006; 36(1): 69-80.
12. Hashimoto R, Straub RE, Weickert CS, Hyde TM, Kleinman JE, Weinberger DR. Expression analysis of neuregulin-1 in the dorsolateral prefrontal cortex in schizophrenia. Mol Psychiatry 2004; 9(3): 299-307.
13. Law AJ, Lipska BK, Weickert CS, Hyde TM, Straub RE, Hashimoto R et al. Neuregulin 1 transcripts are differentially expressed in schizophrenia and regulated by 5' SNPs associated with the disease. Proc Natl Acad Sci U S A 2006; 103(17): 6747-6752.
14. Falls DL. Neuregulins: functions, forms, and signaling strategies. Exp Cell Res 2003; 284(1): 14-30.
15. Steinthorsdottir V, Stefansson H, Ghosh S, Birgisdottir B, Bjornsdottir S, Fasquel AC et al. Multiple novel transcription initiation sites for NRG1. Gene 2004; 342(1): 97-105.