Reconstruction of Gene Regulatory Networks from RNA-Seq Data
Jianlin Jack Cheng
Computer Science Department, University of Missouri, Columbia
ACM-BCB, 2014

Big Data Challenge in Genomic Era
[Figure labels: Biological System; Biological Experiments (DNA/RNA Sequencing, Mass Spectrometry); Omics Data (Genomics, Transcriptomics, Proteomics, Metabolomics, …); Analysis; Knowledge]

Expression Profiles of Genes under Multiple Conditions / Time Points
[Table: genes (Gene 1, Gene 2, Gene 3, Gene 4, …) by conditions (Con 1 to Con 8, …); example expression values: 10, 40, 35, 20, 100, 5, 60, …, 30]

Gene Regulatory Networks (GRN)
• A transcription factor (TF) regulates a gene
[Figure: GRN of yeast in rich medium; gene regulatory module with TF1, TF2, TF3]
Bar-Joseph et al., 2003

Bayesian Probabilistic Modeling
• Assign genes into co-regulated modules
• Construct regulatory relations of each module
$$ \mathrm{GRN}^{*} = \arg\max_{\mathrm{GRN}} P(\mathrm{GRN} \mid D) = \arg\max_{\mathrm{GRN}} P(D \mid \mathrm{GRN})\, P(\mathrm{GRN}) $$
where P(GRN | D) is the posterior, P(D | GRN) is the likelihood, and P(GRN) is the prior.

Gene Regulatory Network Modeling
[Figure label: Join]
Zhu et al., 2013

Gene Regulatory Logic of a Gene Module as a Decision Tree
[Figure: transcription factors and binary regulatory tree for one gene module (gene 1, gene 2, gene 3, …, gene n in rows); biological conditions (treatments) in columns; expression scale from low to high]

Regulatory Tree Construction
• Pick a TF
• Divide conditions into two subsets based on the TF's expression state
• Calculate the probability of each gene g_i (i = 1, …, n) in the module under a Gaussian mixture, with one Gaussian (μ_k, σ_k) per condition subset S_k:
$$ p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\!\left(-\frac{(x_{ij}-\mu_k)^2}{2\sigma_k^2}\right) $$
Here x_ij is the expression of gene g_i in condition j, and s is the number of condition subsets (two at this level, with parameters μ1, σ1 and μ2, σ2).
Zhu et al., 2013

Regulatory Tree Construction
• Repeat at the next level, scoring with the same Gaussian mixture p(g_i)
Zhu et al., 2013

Regulatory Tree Construction Algorithm
• Pick a TF
• Divide conditions based on TF states
• Calculate likelihood (Gaussian mixture p(g_i) over genes g_1, …, g_n)
• Select TF maximizing likelihood
• Repeat
Zhu et al., 2013

Gene Re-Assignment
[Figure: regulatory tree of a module, with the leaf parameters for gene g_i: μ1, σ1, μ2, σ2, …; example values 0.3, 0.2, 1.5, …]
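The regulatory tree construction steps above (pick a TF, split the conditions on its state, score the split with the Gaussian likelihood, keep the best TF) can be sketched in Python for a single tree level. This is a minimal illustration under stated assumptions, not the implementation from Zhu et al., 2013: the function names, the threshold of 0 for calling a TF "high", the pooling of all module values in a subset to estimate μ_k and σ_k, and the toy data are all hypothetical.

```python
import math

def leaf_parameters(module_expr, cols, min_sigma=1e-3):
    """Maximum-likelihood mean and std of all module values in the given conditions.
    Assumption: mu_k and sigma_k are shared by every gene in the module."""
    values = [row[j] for row in module_expr for j in cols]
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, max(math.sqrt(var), min_sigma)  # floor sigma to avoid division by zero

def gene_log_prob(row, leaves):
    """log p(g_i): sum over subsets S_k and conditions j in S_k of log N(x_ij; mu_k, sigma_k)."""
    total = 0.0
    for cols, mu, sigma in leaves:
        for j in cols:
            total += (-math.log(math.sqrt(2 * math.pi) * sigma)
                      - (row[j] - mu) ** 2 / (2 * sigma ** 2))
    return total

def split_log_likelihood(module_expr, tf_expr, threshold=0.0):
    """Log-likelihood of the module when conditions are split by the TF's state.
    Assumption: a TF is 'high' in a condition when its value exceeds `threshold`."""
    high = [j for j, t in enumerate(tf_expr) if t > threshold]
    low = [j for j, t in enumerate(tf_expr) if t <= threshold]
    leaves = [(cols, *leaf_parameters(module_expr, cols)) for cols in (high, low) if cols]
    return sum(gene_log_prob(row, leaves) for row in module_expr)

def best_tf(module_expr, tf_profiles, threshold=0.0):
    """Name of the candidate TF whose split maximizes the module log-likelihood."""
    return max(tf_profiles,
               key=lambda name: split_log_likelihood(module_expr, tf_profiles[name], threshold))

# Toy data (hypothetical): two genes measured in four conditions.
module = [[2.0, 2.2, -1.9, -2.1],
          [1.8, 2.1, -2.0, -2.2]]
tf_profiles = {
    "TF_A": [1.0, 0.8, -0.9, -1.1],   # state tracks the module's expression pattern
    "TF_B": [1.0, -1.0, 0.9, -0.8],   # state unrelated to the module's pattern
}
chosen = best_tf(module, tf_profiles)
```

On this toy data the split by TF_A groups conditions with similar expression, so it scores higher than the split by TF_B. The same `gene_log_prob` function could plausibly serve the re-assignment step, by scoring a gene against each module's tree and moving it to the best-scoring module, though the slides do not spell out that procedure.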
$$ p(g_i) = \prod_{k=1}^{s} \prod_{j \in S_k} \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\!\left(-\frac{(x_{ij}-\mu_k)^2}{2\sigma_k^2}\right) $$

RNA-Seq Data of Soybean Nodulation
• An important source of protein and oil
• Nitrogen fixation enabled by soybean-rhizobia symbiotic interactions
[Figure: nodule]

Gene Regulatory Modules of Differentially Expressed Genes
One out of 10 modules.
A TF functioning in nodulation according to the literature.
NSP, whose homologous protein is a nodulation signaling protein in rice.
Zhu et al., 2013

Application to Other Species
• Arabidopsis
• Drosophila
• Mouse
• Human
• …
[Figure label: Helix-loop-helix transcription factor 2]
Soybean proteins affect TWIST2, a novel protein related to kidney disease?

Acknowledgements
Students
• Deb Bhattacharya
• Renzhi Cao
• Jie Hou
• Jilong Li
• Matt Spencer
• Trieu Tuan
• Mingzhu Zhu
Collaborators
Jim Birchler, Bill Folk, Kevin Fritsche, Michael Greenlief, Zezong Gu, Mark Hannink, Trupti Joshi, Dennis Lubahn, Valeri Mossine, Alan Parrish, Frank Schmidt, Gary Stacey, Grace Sun, John Walker, Dong Xu

Binding Site Analysis
• MEME + TomTom to identify two binding sites: BetaBetaAlpha zinc finger and leucine zipper
• The GRAS TF family contains proteins that bind to these motifs

Function Enrichment Validation
Functions predicted by MULTICOM-PDCN.
P-values calculated by the hypergeometric distribution.
Some functions are related to the formation of the nodule organ.
Zhu et al., 2013

Protein Interaction and Literature Validation
I: TF-TF interactions by STRING; L: literature function support
Zhu et al., 2013

Computational Model Evaluation

GRN of Human Prostate Cancer Under Botanical Treatments
Lu et al., submitted; Li et al., submitted.
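The hypergeometric p-value mentioned on the Function Enrichment Validation slide can be illustrated with a short Python sketch. The slides do not state the exact form of the test, so this assumes the common one-sided over-representation test: the probability of seeing at least k genes with a given function in a module of n genes, when K of the N background genes carry that function. The function name and the numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_pvalue(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: number of background genes
    K: background genes annotated with the function
    n: genes in the module
    k: module genes annotated with the function
    """
    upper = min(n, K)  # X cannot exceed the module size or the annotated count
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical example: 20 background genes, 5 annotated,
# a module of 4 genes of which 3 are annotated.
p_value = hypergeom_enrichment_pvalue(N=20, K=5, n=4, k=3)
```

A small p-value suggests the function is over-represented in the module relative to the background. In practice a multiple-testing correction would usually follow when many functions are tested, which the slides do not discuss.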