Supplementary Full Methods

De novo mutations in sporadic cases of Childhood Onset Schizophrenia

This file includes:
1. Clinical diagnosis
2. Exome sequencing
3. Likelihood analysis
4. Copy Number Variations (CNV)
5. Supplementary Figures (1 to 4)
6. Supplementary References
7. Supplementary Table 1
8. Supplementary Table 2

Clinical diagnosis:
Patients meeting DSM-III-R/DSM-IV criteria for schizophrenia with onset of psychosis before age 13 were recruited nationally. To limit false positives arising from the inclusion of language disorders, we included only patients with clear positive symptoms (delusions or hallucinations) in this study. Medical or neurological disorders and an IQ under 70 were exclusion criteria. Patients and their available first-degree relatives were interviewed for lifetime and current psychiatric disorders using structured psychiatric interviews and the Autism Symptom Questionnaire [1,2]. Diagnosis was confirmed with inpatient medication-free observation. Of the 361 patients screened, 20 whose parents are not affected were selected for the current study. This study was approved by the Institutional Review Board of the National Institute of Mental Health. All participants provided written assent/consent, with written informed consent from a parent or legal guardian for minors.

Exome sequencing:
The genomic DNA of 60 samples (20 COS trios) was prepared using a classical mammalian DNA purification method. Each trio consists of a COS proband and his or her two parents, in whom no psychiatric disorder has been observed. Proband DNA was extracted from blood, and parental DNA was prepared from lymphoblastoid cell lines. The probands did not come from a single ethnic group but from several (Hispanic, African American, Indian, and Arabic). We prepared samples for exome sequencing in three different batches.
The exome of every individual was captured using the SureSelect XT Human All Exon V4 kit (Agilent Technologies Inc.). The first batch of 42 samples was captured and sequenced on an Illumina HiSeq 2000 platform at the McGill Genome Centre, Montreal, Canada. The last batch of 18 samples was captured using SureSelect XT2 and sequenced on the Illumina HiSeq 2000 platform at the Pharmacogenomics Centre of the Montreal Heart Institute (Montreal, Canada). Processing and storing the data required high-performance computing, which was accessed through RQCHP-Calcul Quebec, a local supercomputing facility. The sequenced reads from the Illumina HiSeq 2000 were mapped to the reference genome using the Burrows-Wheeler Aligner [3]. The aligned reads were converted to binary (BAM) format for further analysis using SAMtools [4]. Variant calling was performed using the Genome Analysis Toolkit [5]. This process identified single nucleotide variants and small insertions or deletions at different levels of stringency according to their quality scores. The identified variants were annotated with ANNOVAR to report the genes and chromosomal positions affected.

Likelihood analysis:
A likelihood analysis was performed by simulation, using the PolyPhen-2 scores of all variants from the Exome Variant Server (EVS) and the Residual Variation Intolerance Score (RVIS) percentiles of all genes. We wrote a small program to randomly select 20 variants from EVS and repeated the selection 1000 times. The mean PolyPhen-2 score of each selection of 20 variants was calculated. Next, we plotted the distribution of all mean scores from the simulation (Supplementary Figure 2a) and obtained a normal distribution with bins ranging from 0.1 to 0.9 (PolyPhen-2 scores). We then inferred the probability of each PolyPhen-2 score bin from its frequency.
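The resampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the function names, the fixed random seed, sampling without replacement within each draw, and the ten equal-width bins are all assumptions, and the uniformly distributed placeholder scores merely stand in for the EVS PolyPhen-2 scores, which are not reproduced here.

```python
import random
from collections import Counter
from statistics import mean


def simulate_mean_scores(scores, sample_size=20, n_iter=1000, seed=0):
    """Draw `sample_size` scores at random, `n_iter` times, and return
    the mean score of each draw (hypothetical helper)."""
    rng = random.Random(seed)
    # rng.sample draws without replacement within a single selection.
    return [mean(rng.sample(scores, sample_size)) for _ in range(n_iter)]


def bin_probabilities(means, n_bins=10):
    """Empirical probability of each score bin on [0, 1], keyed by the
    lower edge of the bin (frequency of the bin / number of draws)."""
    # A mean of exactly 1.0 is placed in the last bin.
    counts = Counter(min(int(m * n_bins), n_bins - 1) for m in means)
    total = len(means)
    return {b / n_bins: counts[b] / total for b in sorted(counts)}


if __name__ == "__main__":
    # Placeholder data only: uniform random values, NOT real EVS scores.
    placeholder_rng = random.Random(42)
    scores = [placeholder_rng.random() for _ in range(5000)]

    means = simulate_mean_scores(scores)
    for lower_edge, prob in bin_probabilities(means).items():
        print(f"[{lower_edge:.1f}, {lower_edge + 0.1:.1f}): {prob:.3f}")
```

With real data, `scores` would be replaced by the list of PolyPhen-2 scores for all EVS variants, and the observed mean score of the de novo mutations could then be compared against the simulated bin frequencies.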
For RVIS, we randomly selected 20 genes from the list of all genes with percentiles reported in the RVIS study [6], applying the simulation program mentioned earlier. We followed the same method, using the simulated mean RVIS percentiles, to infer their probability of occurrence (Supplementary Figure 2b). We calculated the P-value using a permutation test.

Copy Number Variations (CNV):
The group of our collaborator, Dr Judith Rapoport, has previously examined CNVs in 126 probands affected by COS, including the probands of this study [7]. Ahn et al. reported disease-related CNVs that differed significantly between COS cases and controls. The CNVs identified in the 17 probands of the current study are not disease related, except in proband COS885 (Supplementary Table 1). Proband COS885 carries two disease-related CNVs, one of which (10q22.3) is significantly enriched in cases compared with controls [7]. The other, 1q21.3, was not significantly associated with COS, although we observed a de novo missense variant (GPR153:NM_207370.1:c.217C>T) in the same proband. The de novo mutation rate for the 16 probands, excluding COS885 with its disease-related CNVs, is 1.83 x 10^-8 per base pair.

Supplementary Figure 1. Capture efficiency of target regions from exome capture kits: a) SureSelect XT, b) SureSelect XT2. Target efficiency is calculated as the percentage of reads mapped onto the target exon regions out of the number of reads uniquely mapped to the whole-genome reference.

Supplementary Figure 2. The mean PolyPhen-2 score was calculated for a random set of 15 mutations selected from EVS. This random selection was simulated about 1000 times and is shown along with the mean score of the de novo missense mutations. Similarly, the mean RVIS percentile was calculated for a random selection of 20 from the list of genes provided in a published study [6]. The simulation plots are shown in Supplementary Figure 2.
Random simulation of mutations from EVS and their distribution of prediction scores: a) PolyPhen-2, b) RVIS.

Supplementary Figure 3. Violin plot showing the distribution of GERP scores, used to assess conservation, for de novo and private inherited variants in our study. There is no significant difference between the groups.

Supplementary Figure 4. COS trio carrying the RYR2 variant, with Sanger sequencing results showing two variants in the same codon.

Supplementary References
1. Kumra S, Frazier JA, Jacobsen LK et al: Childhood-onset schizophrenia. A double-blind clozapine-haloperidol comparison. Archives of General Psychiatry 1996; 53: 1090-1097.
2. Shaw P, Sporn A, Gogtay N et al: Childhood-onset schizophrenia: A double-blind, randomized clozapine-olanzapine comparison. Archives of General Psychiatry 2006; 63: 721-730.
3. Li H, Durbin R: Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010; 26: 589-595.
4. Li H, Handsaker B, Wysoker A et al: The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009; 25: 2078-2079.
5. McKenna A, Hanna M, Banks E et al: The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 2010; 20: 1297-1303.
6. Petrovski S, Wang Q, Heinzen EL, Allen AS, Goldstein DB: Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genetics 2013; 9: e1003709.
7. Ahn K, Gogtay N, Andersen TM et al: High rate of disease-related copy number variations in childhood onset schizophrenia. Molecular Psychiatry 2014; 19: 568-572.
8. Gilissen C, Hehir-Kwa JY, Thung DT et al: Genome sequencing identifies major causes of severe intellectual disability. Nature 2014; 511: 344-347.
9. Fromer M, Pocklington AJ, Kavanagh DH et al: De novo mutations in schizophrenia implicate synaptic networks. Nature 2014; 506: 179-184.
10. Gulsuner S, Walsh T, Watts AC et al: Spatial and temporal mapping of de novo mutations in schizophrenia to a fetal prefrontal cortical network. Cell 2013; 154: 518-529.
11. Neale BM, Kou Y, Liu L et al: Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 2012; 485: 242-245.
12. Sanders SJ, Murtha MT, Gupta AR et al: De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 2012; 485: 237-241.

Supplementary Table 1. Clinical information of the COS probands and their comorbidities as diagnosed by the NIMH

Proband ID   Age of onset   Gender   Race               Rare CNV
COS451       8              Male     white              no
COS483       11             Male     white              yes
COS630       11             Female   Hispanic           no
COS691       9.5            Female   Hispanic           yes
COS755       11             Male     white              yes
COS885*      12             Male     others             yes
COS1012      8              Male     white              yes
COS1141      10             Male     white              no
COS1251      10             Male     white              yes
COS1553      12             Female   white              no
COS1677      10             Female   white              yes
COS1785      10             Female   African American   no
COS1801      8              Male     others             no
COS1814      7              Male     white              no
COS1855      12             Female   others             no
COS1870      9              Male     white              yes
COS2720      13             Male     others             no

* Proband with disease-related CNV (Ahn et al. [7]).

Remaining columns (entries in source order):
Chr CNV: 2q31.2-31.3 deletion; 16q23.3; 8p22, 10p11.23; 1q21.3; 10q22.3; Yq11.221; duplication; duplication; duplication; deletion; duplication; 18q22.1 duplication; 5p12.3 duplication; 6q22.31 deletion
Disease related CNV: yes (COS885)
Comorbidity1 / Comorbidity2 / Comorbidity3: Generalized Anxiety Disorder; Generalized Anxiety Disorder; Asperger's Disorder; Attention-Deficit/Hyperactivity Disorder; Pervasive Developmental Disorder NOS; Asperger's Disorder; Asperger's Disorder; Obsessive-Compulsive Disorder; Mathematics Disorder; Separation Anxiety Disorder; Expressive Language Disorder; Asperger's Disorder

Supplementary Table 2. De novo mutation rates per exome in recent studies

Studies                Subjects                  No. of cases   No. of mutations   Rate per exome   % Target covered   Depth of coverage
Current study          COS                       17             20                 1.17             92                 10X
Gilissen et al. [8]    Intellectual disability   79             84                 1.68             94                 40X
Fromer et al. [9]      Schizophrenia             617            637                1.03             93                 10X
Gulsuner et al. [10]   Schizophrenia             105            103                0.98             93                 10X
Neale et al. [11]      Autism                    175            167                0.95             90                 10X
Sanders et al. [12]    Autism                    238            167                0.70             83                 8X