Examples of functional modeling.
Iowa State Workshop, 11 June 2009

All tools and materials from this workshop are available online at the AgBase database Educational Resources link.
For continuing support and assistance please contact: agbase@cse.msstate.edu
This workshop is supported by USDA CSREES grant number MISV-329140.

"Today’s challenge is to realise greater knowledge and understanding from the data-rich opportunities provided by modern high-throughput genomic technology."
Professor Andrew Cossins, Consortium for Post-Genome Science, Chairman.

Systems Biology Workflow
Nanduri & McCarthy, CAB Reviews, 2008

Key points
Modeling is subordinate to the biological questions/hypotheses.
Together the Gene Ontology and canonical genetic networks/pathways provide the central and complementary foundation for modeling functional genomics data.
Annotation follows information, and information changes daily: STEP 1 in analyzing functional genomics data is re-annotating your dataset.
Examples of how we do functional modeling of genomics datasets.

Who uses GO?
http://www.ebi.ac.uk/GOA/users.html

[Table: protein identifications, with columns #1 to #33 and Reference. Legible Reference entries, truncated as on the slide:]
ALBU_CHICK Serum albumin precursor (Alpha-livetin) (Allergen Gal d 5)
APA1_CHICK Apolipoprotein A-I precursor (Apo-AI)
FIBA_CHICK Fibrinogen alpha/alpha-E chain precursor [Contains: Fibrinopep...
Mol_id: 1; Molecule: Ovotransferrin; Chain: Null; Synonym: Conalbumin; Hete...
PB2 protein [Influenza A virus (A/chicken/Taiwan/7-5/99(H6N1))]
Chain C, Crystal Structure Of Native Chicken Fibrinogen
I50711 complement C3 precursor - chicken
TTHY_CHICK Transthyretin precursor (Prealbumin) (TBPA)
TIM2_CHICK Metalloproteinase inhibitor 2 precursor (TIMP-2) (Tissue inhibito...
MYH9_CHICK Myosin heavy chain, nonmuscle (Cellular myosin heavy chain...
S19188 myosin-V - chicken
FIBB_CHICK Fibrinogen beta chain precursor [Contains: Fibrinopeptide B]
Chain A, Crystal Structure Of Wild Type Turkey Delta 1 Crystallin (Eye Le...
type I polyketide synthase AVES 2 [Streptomyces avermitilis MA-4680]
Hyperion protein, 419 kD isoform [Gallus gallus]
vitronectin [Gallus gallus]
CA36_CHICK Collagen alpha 3(VI) chain precursor
paired-type homeobox Atx [Gallus gallus]
I51298 transforming protein sno-N - chicken
TP2A_CHICK DNA topoisomerase II, alpha isozyme
ITA6_CHICK Integrin alpha-6 precursor (VLA-6)
glucose regulated thiol oxidoreductase protein precursor [Gallus gallus]
spectrin alpha chain [Gallus gallus]
ATP-binding cassette transporter 1 [Gallus gallus]
cone-type transducin alpha subunit [Gallus gallus]
condensin complex subunit [Gallus gallus]
BA2B_CHICK Bromodomain adjacent to zinc finger domain 2B (Extracellular...
ryanodine receptor type 3 [Gallus gallus]
type I polyketide synthase AVES 4 [Streptomyces avermitilis MA-4680]
structural muscle protein titin [Gallus gallus]
breast cancer susceptibility protein [Gallus gallus]
FAS_CHICK Fatty acid synthase [Includes: EC 2.3.1.38; EC 2.3.1.39; EC 2...

What is the Gene Ontology?
“a controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing”
the de facto standard for functional annotation
assign functions to gene products at different levels, depending on how much is known about a gene product
is used for a diverse range of species
structured to be queried at different levels, e.g.:
    find all the chicken gene products in the genome that are involved in signal transduction
    zoom in on all the receptor tyrosine kinases
human readable GO function has a digital tag to allow computational analysis of large datasets
COMPUTATIONALLY AMENABLE ENCYCLOPEDIA OF GENE FUNCTIONS AND THEIR RELATIONSHIPS

Use GO for...
1. Determining which classes of gene products are over-represented or under-represented.
2. Grouping gene products.
3. Relating a protein’s location to its function.
4. Focusing on particular biological pathways and functions (hypothesis-testing).

Membrane proteins grouped by GO BP
[Figure: two charts, B-cells and Stroma. Categories: cell cycle/cell proliferation, cell adhesion, cell growth, apoptosis, immune response, ion/proton transport, cell migration, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, signal transduction, protein modification.]
LOCATION DETERMINES FUNCTION

GO is the “encyclopedia” of gene functions captured, coded and put into a directed acyclic graph (DAG) structure. In other words, by collecting all of the known data about gene product biological processes, molecular functions and cell locations, GO has become the master “cheat-sheet” for our total knowledge of the genetic basis of phenotype. Because every GO annotation term has a unique digital code, we can use computers to mine the GO DAGs for granular functional information.
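Querying GO "at different levels" comes down to walking the DAG: a gene product annotated to a specific term also belongs to every ancestor of that term. The following is a minimal Python sketch of that idea, not how any particular GO tool implements it. The term names, edges, and gene product names (`product_A` and so on) are all made up for illustration and are not taken from the real ontology or from any annotation file.

```python
# Toy GO-like graph: each term maps to its parent terms (is_a edges).
# Terms and edges are hypothetical; a term may have several parents,
# which is what makes the structure a DAG rather than a tree.
PARENTS = {
    "biological process": [],
    "signal transduction": ["biological process"],
    "cell surface receptor signaling": ["signal transduction"],
    "receptor tyrosine kinase signaling": ["cell surface receptor signaling"],
    "immune response": ["biological process"],
    "immune receptor signaling": ["immune response",
                                  "cell surface receptor signaling"],
}

# Hypothetical annotations: gene product -> most specific terms assigned.
ANNOTATIONS = {
    "product_A": {"receptor tyrosine kinase signaling"},
    "product_B": {"immune receptor signaling"},
    "product_C": {"immune response"},
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def products_under(term, annotations, parents):
    """Gene products annotated to `term` or to any of its descendants."""
    hits = set()
    for product, terms in annotations.items():
        if any(t == term or term in ancestors(t, parents) for t in terms):
            hits.add(product)
    return hits


# Broad query first, then "zoom in" on a more specific term.
BROAD = products_under("signal transduction", ANNOTATIONS, PARENTS)
NARROW = products_under("receptor tyrosine kinase signaling",
                        ANNOTATIONS, PARENTS)
print(sorted(BROAD))
print(sorted(NARROW))
```

The broad query picks up both products annotated below "signal transduction", including the one reached through a second parent; the narrow query keeps only the product annotated to the specific term.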
Instead of ploughing through thousands of papers at the library, making notes, and then deciding what the differential gene expression from your microarray experiment means as a net effect, the aim is for GO to capture all of the biological information so that it can be retrieved, compiled with your quantitative gene product expression data, and reported as a net effect.

Many people use “GO Slims”, which capture only high-level terms; these are more often than not poorly informative and not suitable for hypothesis-testing.
[Figure label: “GO Slim”]
In contrast, we need to use the deep, granular, information-rich data suitable for hypothesis-testing.

Shyamesh Kumar BVSc

The critical time point in MD lymphomagenesis
[Figure: mean total lesion score (0 to 18) against days post infection (0 to 100) by genotype, Susceptible (L72) and Resistant (L61). Non-MHC associated resistance and susceptibility. Image panels labelled a-CD30 mab and a-CD8 mab. Burgess et al., Vet Pathol 38:2, 2001.]
2008, 57: 1253-1262.

Hypothesis
At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL).

Infection of chickens (L61 & L72), kill and post-mortem at 21 dpi and sample tissues.
Whole tissue: RNA extraction, duplex QPCR.
Cryosections: laser capture microdissection (LCM), RNA extraction, duplex QPCR.

[Figure: Whole tissue mRNA expression, L6 (R) versus L7 (S). y-axis: 40 – mean Ct value; x-axis: mRNA. Asterisks mark individual bars.]
[Figure: Microscopic lesion mRNA expression, L6 (R) versus L7 (S). y-axis: 40 – mean Ct value; x-axis: mRNA (IL-4, IL-12, IL-18, TGFβ, GPR-83, SMAD-7, CTLA-4). Asterisks mark individual bars.]

CYTOKINES AND T HELPER CELL DIFFERENTIATION
[Figure: naive CD4+ T cell and APC differentiating into Th-1, Th-2 and T-reg, shown for L6 Whole, L7 Whole and L7 Micro. Labels: IL-4, IL-10, IL-12, IL-18, IFN-γ, TGFβ, Smad 7; CTL, macrophage, NK cell.]
Th-1, Th-2, T-reg? Inflammatory?

Gene Ontology based hypothesis testing
QPCR data: relative mRNA expression data.
Gene Ontology annotation: Biological Process.
Modeling & hypothesis testing.

Step I. GO-based phenotype scoring.

Gene product   Th1   Th2   Treg   Inflammation
IL-2             1    ND      1    -1
IL-4            -1     1      1    ND
IL-6            ND     1     -1     1
IL-8            ND    ND      1     1
IL-10           -1     1      1     0
IL-12            1    -1     ND    ND
IL-13           -1     1     ND    ND
IL-18            1     1      1     1
IFN-g            1    -1      1     1
TGF-b           -1     0      1    -1
CTLA-4          -1    -1      1    -1
GPR-83          -1    -1      1    -1
SMAD-7           1     1     -1     1

ND = No data

Step II. Multiply by quantitative data for each gene product.

Step III. Inclusion of quantitative data in the phenotype scoring table and calculation of net effect.

Gene product     Th1     Th2    Treg   Inflammation
IL-2            1.58    0.00    1.58   -1.58
IL-4            0.00    0.00    0.00    0.00
IL-6            0.00   -1.20    1.20   -1.20
IL-8            0.00    0.00    1.18    1.18
IL-10           0.00    0.00    0.00    0.00
IL-12           0.00    0.00    0.00    0.00
IL-13           1.51   -1.51    0.00    0.00
IL-18           0.91    0.91    0.91    0.91
IFN-g           0.00    0.00    0.00    0.00
TGF-b          -1.71    0.00    1.71   -1.71
CTLA-4         -1.89   -1.89    1.89   -1.89
GPR-83         -1.69   -1.69    1.69   -1.69
SMAD-7          0.00    0.00    0.00    0.00
Net Effect     -1.29   -5.38   10.15   -5.98

[Figure: Net Effect by phenotype (Th-1, Th-2, T-reg, Inflammation) in whole tissue, L6 (R) versus L7 (S).]
[Figure: Net Effect by phenotype (Th-1, Th-2, T-reg, Inflammation) in microscopic lesions, L6 (R) versus L7 (S). Scale bar: 5mm.]
[Figure: phenotype summary for L6 (resistant) and L7 (susceptible), whole tissue and lymphoma. Labels: Pro T-reg, Pro/Anti Th-1, Pro/Anti Th-2, Pro/Anti CTL.]

Translation to clinical research: Pig
Bindu Nanduri

Global mRNA and protein expression was measured from quadruplicate samples of control, X- and Y-treated tissue.
Differentially-expressed mRNAs and proteins were identified from Affymetrix microarray data and DDF shotgun proteomics using Monte-Carlo resampling*.

* Nanduri, B., P. Shah, M. Ramkumar, E. A. Allen, E. Swaitlo, S. C. Burgess*, and M. L. Lawrence*. 2008. Quantitative analysis of Streptococcus pneumoniae TIGR4 response to in vitro iron restriction by 2-D LC ESI MS/MS. Proteomics 8, 2104-14.
Using network and pathway analysis as well as Gene Ontology-based hypothesis testing, differences in specific physiological processes between X- and Y-treated tissues were quantified and reported as net effects.

Proportional distribution of mRNA functions differentially-expressed by X- and Y-treated tissues
[Figure: two charts, Treatment Y and Treatment X. Categories: immunity (primarily innate), inflammation, wound healing, lipid metabolism, response to thermal injury, angiogenesis.]
Totals of differentially-expressed mRNAs shown for the two panels: 4302 and 1960.

Net functional distribution of differentially-expressed mRNAs: X- vs. Y-Treatment
[Figure: bar chart of relative bias toward Treatment X or Treatment Y (x-axis: Relative bias, running from 35 on one side to 5 on the other). Categories: sensory response to pain, angiogenesis, response to thermal injury, lipid metabolism, wound healing, classical inflammation (heat, redness, swelling, pain, loss of function), immunity (primarily innate).]

Proportional distribution of protein functions differentially-expressed by X- and Y-treated tissues
[Figure: two charts, Treatment Y and Treatment X. Categories: immunity (primarily innate), inflammation, wound healing, lipid metabolism, response to thermal injury, angiogenesis, hemorrhage.]
Totals of differentially-expressed proteins shown for the two panels: 509 and 433.

Net functional distribution of differentially-expressed proteins: X- vs. Y-Treatment
[Figure: bar chart of relative bias toward Treatment X or Treatment Y (x-axis: Relative bias, running from 8 on one side to 6 on the other). Categories: hemorrhage, sensory response to pain, angiogenesis, response to thermal injury, lipid metabolism, wound healing, classical inflammation (heat, redness, swelling, pain, loss of function), immunity (primarily innate).]
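The net effects reported in both examples follow the three-step GO-based phenotype scoring from the Marek's disease example: score each gene product against each phenotype, multiply by its quantitative value, and sum each column. A minimal Python sketch of that arithmetic follows. The scores are the Step I values from that example. The per-gene expression values are an assumption: they are back-calculated from the Step III table (for example, IL-18 scores 1 everywhere and contributes 0.91, so its value is taken as 0.91), not read from the original QPCR data. The function and variable names are illustrative.

```python
ND = None  # "no data": the cell contributes nothing to the net effect

PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: GO-based phenotype scores (+1 pro, -1 anti, 0 neutral, ND unknown).
SCORES = {
    "IL-2":   (1, ND, 1, -1),
    "IL-4":   (-1, 1, 1, ND),
    "IL-6":   (ND, 1, -1, 1),
    "IL-8":   (ND, ND, 1, 1),
    "IL-10":  (-1, 1, 1, 0),
    "IL-12":  (1, -1, ND, ND),
    "IL-13":  (-1, 1, ND, ND),
    "IL-18":  (1, 1, 1, 1),
    "IFN-g":  (1, -1, 1, 1),
    "TGF-b":  (-1, 0, 1, -1),
    "CTLA-4": (-1, -1, 1, -1),
    "GPR-83": (-1, -1, 1, -1),
    "SMAD-7": (1, 1, -1, 1),
}

# Assumed quantitative values, back-calculated from the Step III table.
EXPRESSION = {
    "IL-2": 1.58, "IL-4": 0.0, "IL-6": -1.20, "IL-8": 1.18,
    "IL-10": 0.0, "IL-12": 0.0, "IL-13": -1.51, "IL-18": 0.91,
    "IFN-g": 0.0, "TGF-b": 1.71, "CTLA-4": 1.89, "GPR-83": 1.69,
    "SMAD-7": 0.0,
}


def net_effect(scores, expression):
    """Steps II and III: weight each score by expression, sum per phenotype."""
    totals = dict.fromkeys(PHENOTYPES, 0.0)
    for gene, row in scores.items():
        value = expression[gene]
        for phenotype, score in zip(PHENOTYPES, row):
            if score is not ND:  # skip cells with no data
                totals[phenotype] += score * value
    return {phenotype: round(total, 2) for phenotype, total in totals.items()}


NET = net_effect(SCORES, EXPRESSION)
for phenotype in PHENOTYPES:
    print(f"{phenotype:13s}{NET[phenotype]:7.2f}")
```

Under these assumptions the column sums agree with the Net Effect row of the Step III table to within rounding of the two-decimal inputs. Treating ND as a skipped cell rather than a zero score gives the same totals here, but keeps "unknown" distinct from "known to be neutral" if the scoring table is later extended.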