Supplementary Methods (doc 60K)

Supplementary Methods for the analyses of the replication sample

1. Replication sample

In this study, we used an independent sample to verify the findings from our primary sample. The replication sample was selected from the neuropathology collection of the Stanley Medical Research Institute. The sample contained 14 schizophrenia (SCZ) subjects, 14 bipolar disorder (BPD) subjects, and 15 healthy controls. Total RNA was isolated from the hippocampus, and the RNA samples were prepared and quality-controlled by Stanley staff. RNA sequencing was conducted using paired-end chemistry, and standardized protocols were applied to base-calling and removal of contaminated reads. Once the bases were called, we used the same procedures and parameters as those used for our primary sample to map the reads to the human genome and genes, and the expression levels of the genes were calculated as RPKM.

We recognized that the primary and replication samples came from different brain regions, and that this difference could complicate the interpretation of the replication. The main reasons for this choice were the following.

A) Both the cingulate cortex and the hippocampus had been implicated in both schizophrenia and bipolar disorder by imaging, gene expression, and proteomic studies [1,2]. While this was not a direct replication, the choice was reasonable.

B) The main focus of this study was genome-wide expression differences between schizophrenia and bipolar disorder, and the findings were biological pathways and their interaction networks, not individual genes. These higher-level interactions across pathways are more likely to be preserved across brain regions; therefore, samples from a different brain region could still verify the results if the findings are true.

C) We had practical difficulty finding a large transcriptome sequencing study that used the same brain region, and it would have taken too long before we could do the same work with an independent sample.
When we were informed of the hippocampus dataset, we decided to use it.

2. Gene differential expression

For the replication data set, we used RPKM as the gene expression index, as we did for the primary sample. Prior to the differential expression analyses, we excluded genes with low expression values. Specifically, we removed genes with RPKM = 0 in more than 20% of individuals and genes with a median RPKM < 0.5. A total of 12,731 genes were retained for the subsequent differential expression analysis. We then fitted a linear model to the expression data of the SCZ, BPD, and control samples to identify differentially expressed genes (DEGs). Age, sex, cumulative antipsychotic use (square-root transformed), brain pH, and postmortem interval were included in the regression analyses as covariates.

The differential expression analyses did not produce a clear set of DEGs: no gene was statistically significant after multiple-testing correction (the minimum q-values were 0.2291 and 0.7267 for SCZ and BPD, respectively). Since we had identified 105 DEGs for SCZ and 153 DEGs for BPD in the primary data set, we chose the top 105 and 153 genes (ordered by increasing p-value) as DEG candidates for SCZ and BPD, respectively, for pathway and network analyses. Of these top-ranked DEG candidates, 98 SCZ genes and 144 BPD genes had a valid Entrez Gene ID (according to the R package “org.Hs.eg.db” v 2.9.0). These top-ranked DEG candidates had no overlap with the DEGs found in the primary sample for either SCZ or BPD. We used the same rationale to select a set of 213 DCEG candidates (212 with valid Entrez Gene IDs) based on the absolute product of the paired t-scores (from the SCZ and BPD analyses, respectively). Only one gene (ETS2) overlapped with the DCEGs found in the primary sample.

3. Pathway enrichment analysis

We performed the pathway enrichment analyses using the same procedures as for the primary sample.
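As a rough illustration of what an enrichment procedure of this kind computes, the Python sketch below scores each pathway with a one-sided hypergeometric test and then adjusts the p-values across pathways with the Benjamini-Hochberg procedure. It uses only the standard library and is not the tool used in the study; the function names, the choice of background gene set, and the toy gene identifiers are assumptions made for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(candidates, pathways, background, min_size=5, max_size=250):
    """Test each pathway for over-representation among the candidate genes.

    Pathways outside [min_size, max_size] (after restriction to the
    background) are skipped. Returns (name, overlap, p, adjusted_p) tuples.
    """
    background = set(background)
    candidates = set(candidates) & background
    results = []
    for name, genes in pathways.items():
        genes = set(genes) & background
        if not (min_size <= len(genes) <= max_size):
            continue
        k = len(genes & candidates)
        p = hypergeom_upper_tail(k, len(genes), len(candidates), len(background))
        results.append((name, k, p))
    adjusted = benjamini_hochberg([p for _, _, p in results])
    return [(name, k, p, q) for (name, k, p), q in zip(results, adjusted)]


if __name__ == "__main__":
    # Toy data only: hypothetical gene identifiers, not real pathways.
    background = [f"g{i}" for i in range(20)]
    pathways = {
        "A": [f"g{i}" for i in range(0, 5)],
        "B": [f"g{i}" for i in range(5, 11)],
        "tiny": ["g11", "g12"],  # dropped by the size filter
    }
    for row in enrich(["g0", "g1", "g2", "g3"], pathways, background):
        print(row)
```

The background here is simply every gene passed in; in practice the choice of background (for example, all genes that survived expression filtering) changes the p-values and should be stated explicitly.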
Specifically, we used the hypergeometric test implemented in the tool WebGestalt (version 2, http://bioinfo.vanderbilt.edu/webgestalt/) [3] to identify enriched pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. To avoid considering pathways with too many or too few genes, we only included pathways containing between 5 and 250 genes [4]. The p-values from the hypergeometric tests were further adjusted by the Benjamini-Hochberg method [5].

For the top-ranked DEG candidates in SCZ, the regulation of actin cytoskeleton pathway was identified, which directly confirmed the same pathway found among the DEGs from the primary sample. Similarly, the pathway “Metabolism of xenobiotics by cytochrome P450” was a direct confirmation for the BPD differentially expressed genes. In the analyses of the top-ranked DCEG candidates, the small cell lung cancer pathway was the only pathway confirmed.

4. Pathway crosstalk/interaction analysis

To test whether the same pathway interactions exist among the top-ranked DCEG candidates, we conducted an analysis using the 18 pathways enriched in the primary-sample DCEGs and the DCEG candidates selected from the replication dataset. Specifically, we applied the Characteristic Sub-Pathway Network (CSPN) algorithm [6] to identify significantly interacting pathway pairs. CSPN was designed to prioritize pathway pairs having a large number of pathway-bridging protein-protein interactions (PPIs) that would be unlikely to exist in randomly permuted PPI networks. We used the human PPI data from the Protein Interaction Network Analysis (PINA) platform (September 14, 2012) [7] as the reference network in this pathway crosstalk analysis. Our working PPI network included a total of 11,318 nodes (protein-coding genes) and 67,936 interactions. When running CSPN, the “OR” mode was selected, meaning that we considered all PPIs formed by the DCEGs as well as their one-step extension.
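To make the idea of pathway-bridging PPIs concrete, the following Python sketch counts interactions that link two pathways and compares that count with counts from label-shuffled copies of the network. It is not the CSPN script and does not reproduce its scoring or its randomization scheme: the node-label shuffle, the function names, the `or_mode_edges` reading of the “OR” mode (keep any PPI with at least one DCEG endpoint), and the toy network are all assumptions made for illustration.

```python
import random


def or_mode_edges(edges, dcegs):
    """Keep PPIs where at least one endpoint is a DCEG (assumed 'OR' reading)."""
    dcegs = set(dcegs)
    return [(u, v) for u, v in edges if u in dcegs or v in dcegs]


def bridging_count(edges, path_a, path_b):
    """Count PPIs with one endpoint in pathway A and the other in pathway B."""
    a, b = set(path_a), set(path_b)
    return sum(1 for u, v in edges if (u in a and v in b) or (u in b and v in a))


def crosstalk_pvalue(edges, path_a, path_b, n_perm=1000, seed=0):
    """Empirical p-value for the observed number of bridging PPIs.

    Null model (an assumption, not CSPN's own): shuffle gene labels over the
    network nodes, which keeps the topology but breaks pathway membership.
    """
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    observed = bridging_count(edges, path_a, path_b)
    at_least = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted = [(relabel[u], relabel[v]) for u, v in edges]
        if bridging_count(permuted, path_a, path_b) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy network: two fully cross-linked pathways plus an unrelated chain.
    path_a = ["a1", "a2", "a3"]
    path_b = ["b1", "b2", "b3"]
    edges = [(a, b) for a in path_a for b in path_b]
    edges += [(f"x{i}", f"x{i + 1}") for i in range(29)]
    observed, p = crosstalk_pvalue(edges, path_a, path_b, n_perm=200)
    print(observed, p)
```

A label shuffle is the simplest null model to write down; a degree-preserving edge rewiring would be a stricter alternative, and either choice affects which pathway pairs pass a given p-value cutoff.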
In the final step of this analysis, we selected as significant the pathway interaction pairs with permutation p-values less than 0.05. We identified five significant pathway interactions, all of which were also found in the primary sample. In particular, the interaction between axon guidance and Fc gamma R-mediated phagocytosis and the interaction between axon guidance and regulation of actin cytoskeleton were verified in the replication sample.

Acknowledgement

Thanks to Drs. Junfeng Xia and Xiaojing Wang for helpful discussion and technical support, and to Dr. Shao Li for providing the CSPN script.

References

1. Focking M, Dicker P, English JA, Schubert KO, Dunn MJ, Cotter DR. Common proteomic changes in the hippocampus in schizophrenia and bipolar disorder and particular evidence for involvement of cornu ammonis regions 2 and 3. Arch Gen Psychiatry 2011; 68: 477-488.
2. Sheng G, Demers M, Subburaju S, Benes FM. Differences in the circuitry-based association of copy numbers and gene expression between the hippocampi of patients with schizophrenia and the hippocampi of patients with bipolar disorder. Arch Gen Psychiatry 2012; 69: 550-561.
3. Wang J, Duncan D, Shi Z, Zhang B. WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Res 2013; 41: W77-W83.
4. Jia P, Liu Y, Zhao Z. Integrative pathway analysis of genome-wide association studies and gene expression data in prostate cancer. BMC Syst Biol 2012; 6 Suppl 3: S13.
5. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Statist Soc B 1995; 57: 289-300.
6. Huang Y, Li S. Detection of characteristic sub pathway network for angiogenesis based on the comprehensive pathway network. BMC Bioinformatics 2010; 11 Suppl 1: S32.
7. Wu J, Vallenius T, Ovaska K, Westermarck J, Makela TP, Hautaniemi S. Integrated network analysis platform for protein-protein interactions. Nat Methods 2009; 6: 75-77.
