Gene regulatory code

Alexander Kel
BIOBASE GmbH
Wolfenbüttel, Germany; Beverly, USA; Bangalore, India

George Gamow, Vadim Ratner
+ Frame-shift mutations
+ Connectivity of the codon series
+ gherllojunomd-bype Alexander fasltoiw

Where? organ, tissue, cell
When? stage of development
How? cell cycle phase, extracellular signals
With whom? trans, cis, …

Insulin pathway

TRANSPATH®

TRANSPATH Professional: MOLECULE table

TRANSPATH: TNF-alpha, 1 step downstream
TRANSPATH: TNF-alpha, 2 steps downstream
TRANSPATH: TNF-alpha, 3 steps downstream
TRANSPATH: TNF-alpha, 4 steps downstream

Example 2: Growth hormone-deficient mice (Sma1)
[Picture of a WT mouse with heterozygous and homozygous Sma1 mice.]
Heterozygous Sma1 mice show a 33% reduction in body weight, whereas homozygous mice exhibit a 56-58% reduction in body weight.

Composite module found in promoters of differentially expressed genes in the liver of growth hormone-deficient mice (Sma1):

0.1040 * V$CETS1P54_02(0.949) -50- V$TCF4_Q5(0.908)
0.0751 * V$TCF1P_Q6(0.726) -50- V$STAT6_01(0.861)
0.0728 * V$SF1_Q6(0.684) -50- V$SMAD3_Q6(0.833)
0.0419 * V$ELK1_02(0.862) -50- V$GRE_C(0.842)
0.0983 * V$TCF11MAFG_01(0.821)
0.0471 * V$FOXO4_01(0.961)
0.0301 * V$IPF1_Q4(0.852)
0.0410 * V$AR_01(0.851)
0.0766 * V$GR_Q6(0.971)
0.0482 * V$STAT1_02(0.995)
0.0508 * V$CEBPB_01(0.98)
0.0281 * V$STAT5A_02(0.826)

[Histogram: number of observations (No of obs) for non-changed genes and for differentially expressed genes (Sma1); x-axis from -0.1 to 0.5.]

Results of the ArrayAnalyzer™ search upstream of the TFs
Growth hormone (GH) and receptor tyrosine kinases (RTK) are identified as potential key molecules involved in the differential expression of genes in the liver of growth hormone-deficient mice (Sma1).
Data Source

Background
• Mice were infected with leukemia viruses, either the neurovirulent FrCasE or the non-neurovirulent Fr75E.
• The aim was to find specific changes resulting from infection of microglia cells.
• The following example compares gene expression in FrCasE-infected versus Fr75E-infected microglia cells.

A: View of the loaded data set
The current data set is highlighted in black in the project tree.

C: Match output: single putative TF binding sites
YES set: in this example, genes upregulated 2-fold and more.
NO set: in this example, genes downregulated 2-fold and more.
Match outputs appear on the project tree, with different profiles applied.
Output columns:
• Matrices
• Ratio of frequencies YES/NO
• P-value for the calculated ratio of match frequencies for the given matrices in the YES and NO sets

Promoter model based on nerve-specific TFs
• Increase of the fitness function with the number of iterations
• Composition of the promoter model
• Sequences of the YES and NO sets are well separated by the selected promoter model
• Visualization of the promoter models for particular genes

E: Create a subset of TFs involved in the models
The subset of TFs involved in the selected promoter models appears on the project tree, under the corresponding models.

F: Searching key nodes upstream of the selected TFs
A subset can be created from the selected key nodes or from all molecules under the selected key nodes.
Key node analysis can be done at a fixed number of steps upstream of the selected TFs: for example, one step upstream, two steps upstream, and so on. It suggests molecules (kinases, adaptors, receptors, ligands) that could provide coordinated regulation of the selected TFs.
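The fixed-step upstream search described above can be illustrated with a small graph traversal. The sketch below is only an illustration under stated assumptions: the network, the molecule names (`ligand`, `receptor`, `adaptor`, `kinase1`, `kinase2`, `TF_A`, `TF_B`, `TF_C`) and the ranking rule (number of selected TFs reachable within the step limit) are hypothetical and are not taken from TRANSPATH or from the tool's actual key node score.

```python
from collections import deque


def upstream_within(parents, start, max_steps):
    """Breadth-first search against edge direction, limited to max_steps.

    Returns {molecule: number_of_steps} for every molecule that can reach
    `start` in at most `max_steps` steps.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the step limit
        for parent in parents.get(node, ()):
            if parent not in distance:
                distance[parent] = distance[node] + 1
                queue.append(parent)
    del distance[start]
    return distance


def key_nodes(edges, tfs, max_steps):
    """Rank upstream molecules by how many of the selected TFs they reach.

    Assumption: this count is a stand-in for a real key node score.
    """
    parents = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)
    reached = {}
    for tf in tfs:
        for molecule in upstream_within(parents, tf, max_steps):
            reached.setdefault(molecule, set()).add(tf)
    return sorted(reached.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical toy network: each pair is (upstream molecule, downstream molecule).
EDGES = [
    ("ligand", "receptor"),
    ("receptor", "adaptor"),
    ("adaptor", "kinase1"),
    ("adaptor", "kinase2"),
    ("kinase1", "TF_A"),
    ("kinase2", "TF_B"),
    ("kinase2", "TF_C"),
]
TFS = ["TF_A", "TF_B", "TF_C"]

for steps in (1, 2):
    print(steps, key_nodes(EDGES, TFS, steps))
```

In this toy network, one step upstream favours `kinase2` (it reaches two of the three TFs), while two steps upstream bring in `adaptor`, which reaches all three. This mirrors the idea that going further upstream can reveal a single molecule coordinating the whole TF set.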
Score of the suggested key nodes

F: Visualization of the suggested key nodes
Suggested key node: adaptor protein Hgs
The suggested key node Hgs is a known biomarker for neurofibromatosis.

F: Visualization of the suggested key nodes
Suggested key node: adaptor protein TRAF2
Visualization maps can be saved on the project tree.
The suggested key node TRAF2 is important for the induction of apoptosis.

Example: human disease - Pseudoxanthoma Elasticum
TNF receptor associated factor 6; disease: osteopetrosis
Elastic fibers calcification
Mutations in the ABCC6 transporter
[Figure: mutations in the ABCC6 transporter, including the deletions ABCC6del15 and ABCC6del23-29; panel labels EC, IC and ABC.]

ELA2: human elastase 2 gene

Promoter evolution

AP-1 consensus:               TGAgTCA
Human collagenase (-2013)     TGAGTCA   *******
Mouse IL-2 (-143)             TGTGTAA   ** ** *
Mouse TNF-alpha (-82)         TTTCTCC   *   **

[Figure: human TNF promoter, positions -107 and -74, with NFAT, AP-1, NF-kB, VDR and C/EBP sites used in mast cells, T-cells and dendritic cells.]

Size of zip file = complexity
[Plot: size of zip file (0 to 1400) versus time.]

„Molecular surrealism of promoters“
Fuzzy puzzle hypothesis of the multipurpose structure of eukaryotic promoters, coding multiple regulatory messages in the same DNA sequence.
[Figure: A, B, C and D, E, F: two sets of TFs; 1, 2: two sites in DNA; BC: basal complex.]
Several regulatory messages could be written in the same sequence. Reading of the messages depends on the cellular context.

gherllojunomd-bype Alexander fasltoiw
1) gherllojunomd-bype Alexander fasltoiw
2) gherllojunomd-bype Alexander fasltoiw
3) gherllojunomd-bype Alexander fasltoiw

SHMALGAUSEN Ivan Ivanovich
Born on 23.04.1884. Died on 07.10.1963.
Evolutionary morphology.
Academician of the Division of Mathematical and Natural Sciences since 01.06.1935.

Evolution of mechanisms of evolution

Cybernetics
Cybernetics studies organization, communication and control in complex systems by focusing on circular (feedback) mechanisms.
Control or regulation is most fundamentally formulated as a reduction of variety: perturbations with high variety affect the system's internal state, which should be kept as close as possible to the goal state, and therefore exhibit a low variety.

Cybernetics: LAW OF REQUISITE VARIETY
For appropriate regulation, the variety in the regulator must be equal to or greater than the variety in the system being regulated. Or: the greater the variety within a system, the greater its ability to reduce variety in its environment through regulation. Only variety (in the regulator) can destroy variety (in the system being regulated).
The law was formulated by Ross Ashby (1956).

The Growth of Structural and Functional Complexity during Evolution

Fundamental evolutionary limitations

Error catastrophe (Eigen M., 1971; Ratner V. and Samin V., 1982)
Sequence length: $L < 1 / (\text{replication errors})$

Haldane's Dilemma (Haldane J., 1957; Crow J. and Kimura M., 1970)
Fitness of population: $w = w_{\max} \exp(4 N s \ln p) < w_{\max}$
Losses due to genetic load.
A population cannot evolve quickly in many genes simultaneously because the losses are not redressed by fertility.
„... there has not been enough time for evolution to have occurred - not even for human evolution...“
Solution: $s \to 0$, neutrality (Kimura M.)

Stepwise breaking of the evolutionary limitations in the course of progressive evolution to multicellular eukaryotic organisms
Prokaryotes; unicellular (single-celled) eukaryotes; multicellular eukaryotes.
Genome length limitations.
Limitations on duplications.
Limitations on multicellular organization and differentiation.
Flexibility of gene expression in different tissues, cells, stages of development, under induction and so on.
Instability of genomes to repeats.
Error catastrophe.
Chromatin
• Decrease of binding specificity
• Fuzzy puzzle
• Induced fitting
• Protein-protein interactions
Diploidy
Multiplicity of regulatory messages encrypted in regulatory sequences

Three mechanisms of biopolymer evolution
• Gradual evolution by fixation of multiple substitutions (protein functional centres)
• Edited biopolymer by fixation of a small number of substitutions (protein folding)
• Evolution at once by fixation of single substitutions (regulatory regions of eukaryotic genes)

Even some messages which were not written
gherllojunomd-bype Alexander fasltoiw
gherllojunomd-zype Alexander fasltoiw (b replaced by z)

Examples of anti-footprint (human/chimp) (minimized FP)

HMG14_chimp     GCAGCAGCGAAGGTAGGCCTCGAAACGCGCATTGGGATGCAGCGGGGCCTTAGGCTACAC  10854
HMG14_human     GCAGCAGCGAAGGTAAGCCTCGAAACGCGCATTGGGATGCAGCGGGGCCTTAGGCTACAC  9978
                *************** ********************************************

site, chimp:    ===========>V$NFKB_C(1.00)
HMG14_chimp     TGCTTCTTAATGCGGGACTTTCCATTGTGATTAGCTATTTGAGCTTTCTTTATACTTTAA  10914
HMG14_human     TGCTTCTTAATGCGGGGCTT-CCATTTTGATTAGCTATTGGAGCTTTATTTATACTTTAA  10037
                **************** *** ***** ************ ******* ************

HMG14_chimp     TAATTACGGTAAATAATTTTTCTAGTGGTCGAGGCAAAAATGTAATGGATATATTCATCC  10974
HMG14_human     TAATTACGGTAAATAATTTTTCTAGTGGTCGAGGCAAAAATGTAATGGATATATTCATCC  10097
                ************************************************************

C21orf68_human  CCAAGATATAGTTTAAATCCATTGTTTCTTTGTTGACTTTCTGGCTTGATGCCCTGTCTA  7124
site, chimp:    <===============V$ELK1_01(0.87)
C21orf68_chimp  CCAAGATATAGTTTAAATCCATTGTTTCTTTGTTGACTTCCTGGCTTGATGCCCTGTCTA  7125
                *************************************** ********************

site, human:    <===========V$SRY_02(0.83)
C21orf68_human  GTGCTGTCACTGGAGTATTGATGTCCCCACTATTATTGTGTTGCTTTATATCTCATTTCC  7184
site, chimp:    =======>V$CREB_01(1.00)
C21orf68_chimp  GTGCTGTCACTGGAGTATTGACGTCACCACTATTATTGTGTTGCTTTATATCTCATTTCC  7185
                ********************* *** **********************************

C21orf68_human  TAGGTCTATTAGTAATTGTTTTATAAATTTGGGAGCTCCAGTGTTAGGTGCATATATGTT  7244
C21orf68_chimp  TAGGTCTATTAGTAATTGTTTTATAAATTTGGGAGCTCCAGTGTTAGGTGCATGTATGTT  7245
                ***************************************************** ******

site, C allele: ---------->V$CP2_01(2.767,0.504)
site, C allele: <-----------V$EGR1_01(3.782,1.465)
AKR1B1_-106_C   GACCCTTGGGGAAGGCCGCCGCGGCACCCCCAGCGCAACCAATCAGAAGGCTCCTTCGCG
site, T allele: <---------V$CEBP_Q3(2.903,0.921)
AKR1B1_-106_T   GACCCTTGGGGAAGGCCGCCGCGGCACCCCTAGCGCAACCAATCAGAAGGCTCCTTCGCG
                ****************************** *****************************
Disease: Diabetes mellitus, without diabetic complications

CYP17A1_-34_T   CCTAGAGTTGCCACAGCTCTTCTACTCCACTGCTGTCTATCTTGCCTGCCGGCACCCAGC
site, C allele: <-----------V$EGR1_01(3.279,0.962)
CYP17A1_-34_C   CCTAGAGTTGCCACAGCTCTTCTACTCCACCGCTGTCTATCTTGCCTGCCGGCACCCAGC
                ****************************** *****************************
Disease: Polycystic ovary syndrome

site, A allele: <=============V$COUP_01(6.373,2.182)
site, A allele: ------------>V$DR1_Q3(4.842,1.447)
TCF1_-58_A      TGAGGCCTGCACTTTGCAGGGCTGAAGTCCAAAGTTCAGTCCCTTCGCTAAGCACACGGA
TCF1_-58_C      TGAGGCCTGCACTTTGCAGGGCTGAAGTCCCAAGTTCAGTCCCTTCGCTAAGCACACGGA
                ****************************** *****************************
Disease: Diabetes mellitus

Promoter is a white square

www.biobase-international.com
BIOBASE explains biology

Sampling of BIOBASE Customers