Robustness and Tunability in Biological Networks
by
Shankar Mukherji
S.B., Physics, Massachusetts Institute of Technology, 2004
S.B., Mathematics, Massachusetts Institute of Technology, 2004
Submitted to the Harvard-MIT Division of Health Sciences and Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY
May 2010
© Massachusetts Institute of Technology 2010.
Signature of author: Division of Health Sciences and Technology, May 12, 2010
Certified by: Alexander van Oudenaarden, Ph.D., Professor of Physics and Biology, Thesis Supervisor
Accepted by: Ram Sasisekharan, Ph.D., Director, Harvard-MIT Division of Health Sciences and Technology, Edward Hood Taplin Professor of Health Sciences and Technology and Biological Engineering

Robustness and Tunability in Biological Networks
by Shankar Mukherji
Submitted to the Harvard-MIT Division of Health Sciences and Technology on May 12, 2010 in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biomedical Engineering

Abstract
Cells face a core tension between studiously preventing change in certain properties under extrinsic perturbations and allowing other properties to be tuned. One way cells have resolved this tension is to utilize systems that are both robust and tunable. Systems can achieve this through network design, which can contain submodules that are themselves either robust or tunable, or through network components that are robust over only a defined set of parameter ranges. This work examines these two categories with two specific examples described below. To explore how a simple network can be both robust and tunable, we make use of the osmosensing pathway in the budding yeast Saccharomyces cerevisiae.
The pathway consists of two modules: a phosphorelay module that senses the osmotic shock signal and feeds into a mitogen-activated protein kinase (MAPK) module. Using a combination of systematic complementation experiments and computational sensitivity analysis, we show that the phosphorelay module is robust to changes in the kinetic parameters characterizing signal propagation through the module while signaling through the MAPK module can be tuned by changing the rate constants. Furthermore, we show that pathway robustness to rate constant changes has consequences for the evolvability of the osmosensing cascade. Populations of yeast cells challenged to alter the input/output relationship of the cascade saw their MAPK proteins preferentially targeted by natural selection over their phosphorelay counterparts. To explore how a simple regulatory element can be both robust and tunable, we turn our attention to gene regulation by microRNA (miRNA). MiRNAs are short regulatory RNA molecules that repress gene expression in a sequence-dependent manner. By observing the strength of miRNA-mediated repression in individual cells, we show that the strength of repression depends strongly on the relative abundance of the miRNA and its target. Below a threshold level of target message, miRNA robustly silences the conversion of mRNA input into protein output, but above this threshold miRNA-mediated repression generates an ultrasensitive response to mRNA input, allowing the strength of repression to be tuned over a wide range of values.
Thesis Supervisor: Alexander van Oudenaarden
Title: Professor of Physics and Biology

Table of Contents
Preface .... 15
1 Introduction: Using synthetic circuits to uncover biological design .... 17
1.1 Summary .... 17
1.2 Understanding network design with synthetic parts .... 18
1.3 Toward a quantitative understanding of gene expression .... 20
1.3.1 Transcriptional regulation .... 20
1.3.2 Promoter library studies .... 21
1.3.3 Post-transcriptional and post-translational regulation .... 24
1.3.4 Integrating transcriptional and post-transcriptional control .... 25
1.4 Rewiring genetic and signaling pathways .... 27
1.4.1 The challenges of rewiring pathways .... 27
1.4.2 Manipulating the sensors .... 31
1.4.3 Manipulating sensor/transducer interactions .... 32
1.4.4 Manipulating the intermediate transducers .... 34
1.4.5 Connecting pathway rewiring to evolvability .... 34
1.5 Synthetic feedback networks .... 35
1.5.1 Oscillatory behaviors .... 36
1.5.2 Using synthetic circuits as modeling benchmarks .... 40
1.6 The ultimate goal: spatiotemporal control .... 41
1.6.1 Uncovering intra- and intercellular processes .... 41
1.6.2 Modeling ecological interactions .... 43
1.7 Perspectives .... 45
2 Robust yet tunable regulatory networks: the case of the yeast osmosensing pathway .... 51
2.1 Summary .... 51
2.2 Introduction .... 52
2.3 HOG signaling displays varied sensitivity to ortholog substitutions .... 54
2.3.1 Systematic complementation study .... 55
2.3.2 Focusing on network architecture: Pbs2 versus Ypd1 .... 56
2.4 Computational analysis of HOG pathway .... 59
2.4.1 Sensitivity analysis of a model of the HOG pathway .... 60
2.4.2 Relating mutational robustness to local biochemistry .... 63
2.5 Experimental evolution design .... 66
2.6 Rapid adaptive evolution of yeast cells underexpressing YPD1 .... 69
2.6.1 Restoration of basal wild-type Hog1 activity .... 70
2.6.2 Transcriptional regulation of YPD1 is not upregulated in the evolved strains .... 73
2.6.3 PBS2 and SSK2 are preferentially mutated in independent evolution experiments .... 74
2.6.4 Mutations in PBS2 and SSK2 are mainly responsible for the down-regulation of the hyperactive signaling and improved fitness .... 78
2.7 Discussion .... 79
2.8 Methods .... 81
3 Robust yet tunable regulatory elements: the case of microRNA .... 87
3.1 Summary .... 87
3.2 microRNA background .... 88
3.3 Two-color assay to measure regulation via microRNA .... 88
3.3.1 Control experiments establishing eYFP as a transcriptional reporter .... 89
3.4 microRNA mediated repression generates gene expression thresholds .... 90
3.5 Generating thresholds without feedback .... 92
3.5.1 Mathematical framework .... 93
3.5.2 Tuning the dissociation constant X .... 96
3.5.3 Tuning the threshold constant 0 .... 97
3.6 Experimentally tuning the ultrasensitive response .... 97
3.6.1 Increasing N in the mCherry 3'-UTR .... 98
3.6.2 Calculating ratio transfer functions to measure fold repression .... 100
3.6.3 Changing [miR-20]total by transfecting mimic siRNA .... 102
3.6.4 eYFP mRNA abundance at the threshold .... 105
3.7 Observing ultrasensitivity in physiological contexts .... 106
3.7.1 Fusing natural 3'-UTRs to mCherry .... 107
3.7.2 Luciferase assays in mouse embryonic stem cells .... 108
3.8 Discussion .... 110
3.9 Methods .... 111
4 Conclusions and Perspectives .... 117
5 References .... 123

Table of Figures
Figure 1.3.1 Controlling the flow of information from DNA to protein using synthetic elements .... 23
Figure 1.3.2 An integrated transcriptional/post-transcriptional circuit to control gene expression in mammalian systems .... 26
Figure 1.4.1 Rewiring signaling pathways .... 29
Figure 1.5.1 Building a robust, tunable oscillator in a living cell .... 38
Figure 1.6.1 Building a synthetic predator-prey system from quorum sensing components .... 44
Figure 1.6.2 Simpson's paradox observed in an engineered set of cell-cell interactions .... 46
Figure 2.2.1 Schematic depiction of the Saccharomyces cerevisiae HOG pathway .... 53
Figure 2.3.1 Flow chamber system to control osmotic conditions in medium .... 55
Figure 2.3.2 Hog1 nuclear enrichment dynamics in response to a 0.4M NaCl osmotic shock .... 56
Figure 2.3.3 Maximum Hog1 nuclear enrichment of mutant strains with orthologous YPD1 and PBS2 genes of varying degrees of "functional scores" under a 0.4M NaCl hyperosmotic shock .... 57
Figure 2.3.4 Maximum Hog1 nuclear enrichment of mutant strains with kinetically characterized YPD1 alleles under a 0.4M NaCl hyperosmotic shock .... 58
Figure 2.4.1 Surface representations of peak Hog1 phosphorylation as a function of the rates associated with a given protein .... 60
Figure 2.4.2 Local logarithmic gradients calculated for peak Hog1 phosphorylation surfaces .... 61
Figure 2.4.3 Local logarithmic gradients calculated for initial Hog1 phosphorylation rate surfaces .... 61
Figure 2.4.4 Calculating sensitivity metrics from surface plots using modified standard deviation metric .... 63
Figure 2.5.1 Artificially hyperactivating HOG pathway by underexpressing YPD1 .... 66
Figure 2.5.2 Growth rate of ancestor strain as a function of doxycycline .... 67
Figure 2.5.3 Hog1 nuclear enrichment in ancestral strain with and without doxycycline .... 68
Figure 2.5.4 Glycerol production in ancestor strain with and without doxycycline .... 68
Figure 2.5.5 Hog1 nuclear enrichment in the ancestor strain as a function of doxycycline .... 69
Figure 2.6.1 Time course of evolutionary dynamics .... 70
Figure 2.6.2 Restoration of growth rate levels to ancestral state .... 71
Figure 2.6.3 Restoration of glycerol levels to ancestral state .... 71
Figure 2.6.4 Restoration of pathway dynamics to near ancestral behavior .... 72
Figure 2.6.5 Restoration of volume recovery dynamics for majority of the evolved strains .... 73
Figure 2.6.6 CFP data suggests that the evolved strains did not alter the properties of rtTA in order to effect their recovery from pathway hyperactivation .... 74
Figure 2.6.7 Mutations in evolved strains are predominantly in HOG pathway genes .... 76
Figure 2.6.8 Distribution of genetic changes in evolved strains across 9 experiments .... 77
Figure 2.6.9 Characterizing the spectrum of mutations according to target identity .... 77
Figure 3.3.1 A synthetic two-color reporter construct for measuring miRNA mediated gene regulation in single cells .... 89
Figure 3.3.2 Control experiments used to confirm idea that eYFP can act as a faithful reporter of mCherry transcriptional activity in individual cells .... 90
Figure 3.4.1 Arranging single cells according to eYFP expression level reveals gene expression thresholding by miRNA .... 91
Figure 3.4.2 Transfer function relating eYFP to mCherry levels .... 92
Figure 3.5.1 Biochemistry of the miRNA-mediated gene regulatory system .... 95
Figure 3.5.2 miR-20 expression in Tet-On HeLa cells .... 95
Figure 3.5.3 Tuning the sharpness of the ultrasensitive switch by changing the rate at which miRNA bind their target mRNA, k .... 96
Figure 3.5.4 Tuning both the placement and sharpness of the ultrasensitive switch by titrating different total amounts of miRNA into the system .... 97
Figure 3.6.1 Experimentally sharpening the ultrasensitive transition by engineering different numbers of miR-20 binding sites into the 3'-UTR of mCherry .... 99
Figure 3.6.2 Dye-swap control experiment .... 100
Figure 3.6.3 Calculating the fold repression due to miRNA as a function of target expression level .... 101
Figure 3.6.4 Bulk level measurements of miR-20 mediated repression .... 102
Figure 3.6.5 Tuning the placement and the sharpness of the threshold by titrating the amount of miR-20 available to the gene regulatory system .... 103
Figure 3.6.6 miR-20 sponge experiments shift ultrasensitive regime to lower eYFP levels as expected from the mathematical model .... 104
Figure 3.6.7 Results from simultaneous fitting of model to experimental data .... 105
Figure 3.6.8 Estimating the mRNA abundance at the threshold .... 106
Figure 3.7.1 Detecting ultrasensitive transitions with natural UTR's .... 107
Figure 3.7.2 Dual luciferase assay system used to measure miRNA mediated repression in populations of mouse embryonic stem cells .... 108
Figure 3.7.3 Fold repression increases as a function of miRNA abundance in mouse embryonic stem cells .... 109
Figure 3.9.1 Binning procedure used to convert joint mCherry-eYFP single cell distributions into transfer functions .... 112

Table of Tables
Table 2.6.1 Summary of sequencing depth and coverage for both the ancestral strain and 5 evolved strains sent for Illumina sequencing .... 75
Table 2.6.2 Single nucleotide polymorphisms detected from whole genome sequencing of evolved strains .... 75
Table 2.6.3 Cataloguing the mutations in molecular detail .... 78

Preface
Having spent my entire adult life at MIT, it is impossible to thank everyone who has helped shape my education as a scientist. From midnight discussions in the East Campus lounges to spirited group meetings in Building 68, MIT has been an extraordinary home for me. I would first like to thank my close collaborators in the work presented here: Margaret Ebert for her tireless efforts in working out together how microRNA mediated regulation works at the level of a single cell, Mei Lyn Ong for conducting the heroic experimental evolution aspect of the yeast osmosensing pathway study, and Qiong Yang for helping me with the complementation studies and teaching me about life in the lab in general. I would be remiss if I did not also extend my deepest thanks to the members of the van Oudenaarden lab. I cannot think of a more fun, intellectually stimulating place to spend one's graduate years. I have been very lucky to have a number of close mentors during my time at MIT. Leonid Mirny oversaw my bachelor's thesis project and encouraged me to stay at MIT for graduate school, during which he has been a constant source of good advice. The members of my thesis committee, Roy Kishony and Phil Sharp, have been incredibly helpful sounding boards for my sometimes crazy ideas. And of course there is no one I can thank more than my advisor, Alexander van Oudenaarden.
Alexander's creativity, grace and good humor are all things I have had the good fortune to admire and enjoy over the past years in the lab and hope to admire and enjoy even after I have flown the coop. Lastly I would like to thank my parents, Sumantra and Lopamudra, my sisters, Aditi and Amrita, and my wife, Emily, and all my family spread all over the world for putting up with me and the demands of research life (my usual excuse for absent mindedness), but also for forcing me to think deeply about how to talk about my ideas, which in turn forced me to refine them even more. I hope you all enjoy reading this.

Chapter 1
Introduction: Using synthetic circuits to uncover biological design

1.1 Summary¹
¹ See Mukherji, S and van Oudenaarden, A. Synthetic biology: understanding biological design from synthetic circuits. Nature Reviews Genetics 10: 859-871 (2009)
The life of a cell is one with which many can sympathize: it must make decisions while constantly buffeted by forces both from the external environment and from within. A constant tension for the cell in the context of this struggle is whether to ignore these forces and remain robust against the perturbations or whether to tune itself in response to these changes. An alternative to both of these possibilities, which is explored in this thesis with two very specific examples, is that the cell can use either network components or network designs that exhibit both robustness and tunability. For example, a gene expression regulatory component can prevent changes in mRNA concentration from affecting protein levels up until a critical value, after which it allows the protein level to be tuned by the mRNA level; or a network can be broken into subparts, some of which can be used to tune the input/output relationship of the network while others cannot.
In order to test these ideas quantitatively, it is often useful to use highly engineered regulatory pathways and interactions in order to cleanly isolate the phenomena under study from other endogenous effects. Therefore, I will begin by presenting a review of the literature highlighting the use of synthetic components and networks to understand natural biological effects.

1.2 Understanding network design with synthetic parts
An important aim of synthetic biology is to uncover the design principles of natural biological systems through the rational design of gene and protein circuits. Here we highlight how the process of engineering biological systems - from synthetic promoters to the control of cell-cell interactions - has contributed to our understanding of how endogenous systems are put together and function. Synthetic biological devices allow us to intuitively grasp the ranges of behavior generated by simple biological circuits, such as linear cascades and interlocking feedback loops, as well as to exert control over natural processes such as gene expression and population dynamics. One of the most astounding findings of the human genome project was that our genome contains about as many genes as that of Drosophila melanogaster. This finding raised the question: how do you get one organism to look like a fly and another like a human with the same number of genes? One possibility is that the rich repertoire of non-protein coding sequence found in the genomes of complex organisms adds many new parts with which to generate complexity [Mattick 2004]. A decade of research has put forward the rather different idea that instead of looking at the length of the parts list as the determinant of organismal complexity, we should look at how those parts fit together [Davidson 2006, Prud'homme 2007]. From this perspective, complexity arises from novel combinations of pre-existing proteins and the ability to evolve new phenotypes rests on the modularity of biological parts.
While natural examples have been found to illustrate this latter possibility [Prud'homme 2007], strong evidence to support this post-genomic view of biology has come from the synthesis of new biological systems. Rational synthesis of biological systems can hint at the natural history of how a particular system came to acquire its properties [Bridgham 2006, Rapp 2007]. More often, however, we use synthetic circuits to explore, in a hands-on fashion, the set of design principles that determine the structure and operation of biological systems. The core aim of synthetic biology is to develop and apply engineering tools to control cellular behavior, using precisely characterized parts, such as cis-regulatory elements, to achieve desired functions. An important direction, for example, has been to engineer cells with an eye towards practical applications, such as bioremediation [Gilbert 2003], biosensors [Rajendran 2008], biofuels [Steen 2008, Waks 2009], and even the beginnings of clinical applications [Khosla 2003, Ro 2006, Anderson 2006]. In this Review, however, we focus on how synthetic circuits help us to understand how natural biological systems are genetically assembled and how they operate in organisms from microbes to mammalian cells. In this light, synthetic circuits have been critical as simplified test-beds in which to refine our ideas of how similarly structured natural networks function and have served as tools to control natural networks. We highlight the contribution of synthetic biology to assembling an increasingly quantitative description of gene expression and signal transduction, to uncovering the diversity of behaviors that can arise from positive and negative feedback systems, and to rationally controlling spatial organization and cell-cell interactions.
We pay particular attention to recent progress in using synthetic systems to uncover novel aspects of cell biology, such as how cells decide to undergo apoptosis and the molecular basis for communication between the endoplasmic reticulum and mitochondrion. We aim to show that synthetic biological approaches have given us a great deal of intuition on how the simple building blocks that underlie complex natural systems work as well as basic tools to quantitatively characterize natural phenomena, both of which are crucial for the field to progress into the analysis and complete control of natural circuits.

1.3 Towards a quantitative understanding of gene expression
The first step in assembling a biological circuit is to gather the component parts. In the cell, circuits are implemented via gene expression, and so a great deal of effort in synthetic biology has gone into investigating the rules surrounding gene expression, particularly the processes of transcription and translation. The precise measurements afforded by artificially constructed systems allow us to transform qualitative notions of transcriptional repression and activation and post-transcriptional regulation into quantifiable effects such as how promoter architecture defines the rate of transcription and the specific degradation rate specified by a given sequence motif.

1.3.1 Transcriptional regulation
Among the earliest contributions of synthetic biology to understanding natural biological processes are detailed, quantitative measurements of transcriptional regulation, building on a foundation laid 50 years ago in the groundbreaking work of researchers such as Jacob and Monod [Jacob 1961].
Synthetic constructs have been used to map out the transfer function that relates the input concentration of transcription factor [Rosenfeld 2005, Pedraza 2005] and inducers [Setty 2003] to the output concentration of a reporter gene [Rosenfeld 2005, Ozbudak 2002, Elowitz 2002], single mRNA molecules [Golding 2005, Raj 2006] or single proteins [Cai 2006]. Many of these same constructs were also used to measure the mean output of the transcriptional process and also higher-order moments (such as the variance) in organisms ranging from Escherichia coli and Bacillus subtilis to mammalian cells. Single-molecule studies in these model organisms directly established that mRNA and proteins are produced in bursts of activity [Raj 2008]. A key question in the study of transcriptional regulation is how promoter architecture affects transcriptional activity. For example, below we describe several studies that have informed how the number and genomic positions of transcription factor (TF) binding sites affect transcriptional activity. Given the combinatorial control of gene expression, it is also critical to study how multiple TFs interact with DNA and with each other to tune mRNA production. Endogenous promoters use all of these parameters to specify either a desired transcription rate or a Boolean function, such as an AND gate that allows transcription to occur only when all TF binding sites in the promoter are occupied.

1.3.2 Promoter library studies
The experimental breakthrough that allowed quantitative measurements of the transcriptional power of different promoter architectures was the use of combinatorial promoter libraries [Hammer 2006], which are shown in schematic form in Figure 1.3.1.
Libraries of promoters driving reporter proteins, such as luciferase or fluorescent proteins, allow for an unbiased measurement of transcriptional activity over the space of possible promoters - such an unbiased method can then be used to try and ascertain rules that describe the responsiveness of a promoter to TFs. Earlier work used randomly mutated promoters to draw inferences about the functional subparts of the promoter, such as the TATA box; by contrast, the construction of combinatorial promoter libraries involves identifying specific operator sites that bind TFs and randomly ligating them together in a way that shuffles their relative positions and copy numbers. The studies highlighted below have combined such promoter libraries and modeling to show that the strength of a promoter is determined largely by the position of TF binding sites with respect to key promoter elements such as the TATA box and with respect to each other. The simplest case is to understand how the positioning of a single operator affects the expression of a promoter. In prokaryotes, operators are classified as being in the core, proximal, or distal regions of the promoter (Figure 1.3.1). Working in E. coli, Cox et al. [Cox 2007] and Kinkhabwala and Guet [Kinkhabwala 2008] independently observed that repressors can effectively repress expression from all three promoter subregions, with Cox et al. showing that the strength of repression is greatest when the repressor site is in the core region of the promoter, less strong when in the proximal region, and weakest when in the distal region. Activators, on the other hand, work only in the distal site; Cox et al. showed that they have no effect in the core and proximal sites. Both studies go on to develop simple models of promoter activity by taking into account binding reactions of TF to DNA that are in thermodynamic equilibrium.
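To make the idea of an equilibrium binding model concrete, the following is a minimal sketch of a generic thermodynamic promoter model; it is not the specific model of Cox et al. or of Kinkhabwala and Guet. It assumes that promoter activity is proportional to the equilibrium probability that RNA polymerase is bound, that each TF site contributes a statistical weight q = [TF]/K_d when occupied, and that each bound TF multiplies the weight of polymerase-bound states by a hypothetical interaction factor omega (omega = 0 for a repressor that excludes polymerase, omega > 1 for an activator that recruits it). The names `p_bound`, `fold_change`, `rnap_weight`, and all numerical values are illustrative assumptions.

```python
from itertools import product


def p_bound(sites, rnap_weight=0.01):
    """Equilibrium probability that RNA polymerase occupies the promoter.

    sites: list of (q, omega) tuples, one per TF binding site, where
        q     = [TF] / K_d, the statistical weight of the occupied site
        omega = factor applied to polymerase-bound states when this
                site is occupied (0 excludes polymerase, >1 recruits it)
    rnap_weight: assumed weight of polymerase binding to a bare promoter.
    """
    z_off = 0.0  # summed weight of states without polymerase
    z_on = 0.0   # summed weight of states with polymerase bound
    # Enumerate every occupancy pattern of the TF sites (0 = empty, 1 = bound).
    for occupancy in product((0, 1), repeat=len(sites)):
        weight = 1.0
        interaction = 1.0
        for bound, (q, omega) in zip(occupancy, sites):
            if bound:
                weight *= q
                interaction *= omega
        z_off += weight
        z_on += rnap_weight * weight * interaction
    return z_on / (z_on + z_off)


def fold_change(sites, rnap_weight=0.01):
    """Promoter activity relative to the same promoter with no TF sites."""
    return p_bound(sites, rnap_weight) / p_bound([], rnap_weight)


if __name__ == "__main__":
    repressor = [(9.0, 0.0)]   # one site, [TF]/K_d = 9, excludes polymerase
    activator = [(9.0, 20.0)]  # one site, [TF]/K_d = 9, recruits polymerase
    print("repressor fold change:", fold_change(repressor))
    print("activator fold change:", fold_change(activator))
```

In a sketch of this kind, the position dependence reported in the library studies would enter only through the choice of omega for each site, and TF-TF interactions could be added as a further factor on states in which two sites are occupied together.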
It was expected that the situation would be far more subtle in eukaryotes, where chromatin structure can strongly influence expression levels [Segal 2009]. However, even in Saccharomyces cerevisiae 49% of the variation in expression in the promoter library could be explained by a simple thermodynamic model incorporating just TF-DNA and TF-TF interactions [Gertz 2009], interactions that were also suggested in theoretical work [Buchler 2003]. More surprisingly, Gertz et al. provided evidence that weak binding sites, which are important for prokaryotic transcription, can also be important in eukaryotes. Focusing on the TF multicopy inhibitor of GAL1 (Mig1), Gertz et al. showed that repression from one weak and one strong Mig1 binding site can be as effective as two strong Mig1 binding sites. This is particularly crucial given that 24% of all yeast promoters contain putative weak Mig1 binding sites. The promoter library studies open the way to consider some general questions in transcriptional control. The theoretical frameworks in the E. coli and yeast studies, for example, differ slightly: the former studies make no use of TF-TF interactions and frame the issue mostly in the language of Boolean logic, whereas the latter makes heavy use of TF-TF interactions, particularly in the analysis of weak binding sites. Future single-molecule studies of transcriptional control can help to resolve the relative importance of TF-DNA and TF-TF interactions in generating transcriptional activity. Furthermore, the fact that simple equilibrium binding explains much, but not all, of the effect of promoter architecture on expression level suggests that the next goal should be to track down the source of the remaining variation. Genomic location can be an important contributor to expression and expression fluctuations [Becskei 2005], perhaps by affecting local chromatin context.
Knowing how to parcel out the variation due to these different effects will be particularly helpful when these studies are extended to mammalian systems, where there is considerably less control over where synthetic transgene constructs are integrated into the genome.

Figure 1.3.1 Controlling the flow of information from DNA to protein using synthetic elements. The diagram shows the transcriptional and post-transcriptional processes in gene expression that can be manipulated by synthetic biology tools, and some example applications. TF, transcription factor; RBS, ribosome binding site. Promoter library diagram from Cox RS 3rd, Surette MG, Elowitz MB. Programming gene expression with combinatorial promoters. Mol. Syst. Biol. 3: 145 (2007). RBS accessibility diagram from Isaacs FJ, Dwyer DJ, Ding C, Pervouchine DD, Cantor CR, Collins JJ. Engineered riboregulators enable post-transcriptional control of gene expression. Nat Biotech 22: 841-847 (2004). Aptamer diagram from Grate D, Wilson C. Inducible regulation of the S. cerevisiae cell cycle mediated by an RNA aptamer-ligand complex. Bioorg Med Chem 9: 2565-2570 (2001).

1.3.3 Post-transcriptional and post-translational regulation
Although much of the early work in synthetic biology focused on transcriptional regulation, significant progress has also been made in incorporating post-transcriptional effects into synthetic circuits, affecting both RNA and protein.
At the RNA level, for example, mutagenesis screens based on synthetic constructs have been used to determine the sequences that are recognized by RNA editing enzymes to change adenine into inosine [Pokharel 2006]. Furthermore, as regulatory RNAs have been increasingly appreciated as important drivers of gene expression, synthetic circuits have included elements from the RNA interference pathway [Beisel 2008], aptamers [Win 2008, Werstruck 1998, Grate 2001], and riboswitches [Suess 2004, Desai 2004] to control the flow of genetic information [Davidson 2007]. Synthetic circuits involving enzymatic RNAs have mostly been developed as platforms to tune gene expression, but many of these platforms can easily be extended to understand natural biological phenomena. In the study of Grate and Wilson, for example, an aptamer is used to control the expression of cyclin B2 (Clb2), a key regulator of the cell cycle, in a tetramethylrosamine (TMR)-dependent manner [Grate 2001]. The authors slowed the cell cycle by adding TMR; this method can be useful in measuring how the level of Clb2 protein affects the speed at which the cell cycle progresses while keeping all transcriptional feedback constant. Synthetic studies have also directly tweaked how mRNA is translated into protein and how long proteins persist before being degraded. Several experiments in prokaryotic systems, especially those studying the stochastic nature of gene expression, have altered the translation rate by mutating ribosomal binding sites (RBS) [Ozbudak 2002, Isaacs 2004]. Apart from demonstrating another possible layer of quantitative regulation of gene expression, studies involving RBS variants provided early evidence that E. coli cells could tune the stochasticity in the expression level of a given gene independently of its mean. Lastly, Grilly et al. have developed a circuit that controls the degradation of a target protein using the well-known ClpXP protease machinery from E. coli [Grilly 2007].
Typically, models of gene expression treat protein degradation as an exponential decay process, with the decay being due to growth of cell volume over time. Regulated proteolysis, however, can depend on the formation of enzyme-substrate complexes as intermediates on the way to degradation. In finding that the degradation follows Michaelis-Menten kinetics, Grilly et al. completed one of the few quantitative comparisons of specific protease activity to models of enzyme kinetics. Taken together, these results point to some interesting similarities between transcription and translation - both are inherently noisy processes that can be quantitatively modulated by specific sequence elements, such as RBS and protease recognition sites. Future studies can use the ideas and methods from the study of transcription, such as combinatorial library approaches, to more systematically explore the process of translation.

1.3.4 Integrating transcriptional and post-transcriptional control.

The two approaches of engineering specific promoter architectures or using a natural inducible promoter to tune transcriptional activity, and using specific sequence sites to tune translational yield, can be combined to achieve precise yet flexible control over gene expression [Ozbudak 2002, Beisel 2008]. A nice example of using these two ingredients to study natural processes in mammalian cells can be found in recent work, outlined in Figure 1.3.2, in which Tet- and Lac-controlled regulation was adapted and combined with RNAi for use in HeLa cells [Deans 2007].

[Figure 1.3.2: image not reproduced. Panel a: circuit schematic with transcriptional and post-transcriptional layers. Panel b: induced apoptotic cells, induced dead cells and nontransfected controls plotted against IPTG concentration.]

Figure 1.3.2 An integrated transcription/translation circuit to control gene expression in mammalian systems. A) Deans et al.
created a genetic switch whose state is read out by a GFP reporter or a gene of interest; here, the gene of interest that we focus on is Bax, a pro-apoptotic gene. Bax is under the transcriptional control of the Lac repressor (LacI) and under the translational control of a short hairpin (sh)RNA, which itself is under transcriptional control of the TetR repressor. In the "OFF" state, LacI inhibits transcription of Bax. Additionally, LacI inhibits transcription of the TetR repressor; this allows the transcription of the shRNA, which goes on to inhibit translation of Bax by cleaving its mRNA. The result of this dual-layered repression is the creation of a truly off "OFF" state; whereas in the initial characterization each mode of repression alone was able to reduce reporter levels by about 80%, leaving a basal expression of 20%, the combination resulted in greater than 99% repression. The circuit can then be tunably activated by adding varying amounts of IPTG, which blocks the effects of LacI. B) The fraction of cells that undergo apoptosis is determined by Bax expression levels. Data obtained by tuning Bax with IPTG, as described above, offer some tantalizing clues into the fundamental molecular biology underlying the apoptosis pathway. In particular, the data are consistent with the idea that the decision to undergo apoptosis (assessed by retention of propidium iodide (PI) dye relative to PI retention due to the transfection protocol alone) is determined by reaching a threshold level of Bax. Although the Bax threshold data are not conclusive, the result demonstrates the power of a technique that allows one to rationally tune the level of any gene of interest and examine the consequences. Panel B is reproduced with permission from Figure 6b of Deans TL, Cantor CR, Collins JJ. A tunable genetic switch based on RNAi and repressor proteins for regulating gene expression in mammalian cells.
Cell 130, 363-372 (2007).

As synthetic biology begins to recapitulate more realistic systems, which contain many moving parts, demand will increase for circuits that control every step of the process that turns DNA sequence into protein. Such layered circuits can help illuminate why certain regulatory schemes are employed to control gene expression over others in a given context. For example, gene expression in natural systems can be attenuated by epigenetic silencing, transcriptional repressors or post-transcriptional regulators such as microRNAs (either alone or in concert with other molecules); this raises the questions of why a system uses one scheme rather than another and to what extent different layers of regulation generate collective effects that no one layer can accomplish. One area that will be increasingly under study, and that may help unravel the issues surrounding layered circuits, is the dynamics of the different steps that contribute to expression; the studies highlighted above focus almost exclusively on steady-state behavior. While intuition tells us that transcription factors act slowly compared to post-transcriptional players such as regulatory RNAs, as the latter presumably do not have to be transported back to the nucleus and then locate a specific genomic locus, there is currently a lack of data that would enable us to turn these intuitive notions into quantitative facts.

1.4 Rewiring genetic and signaling pathways.

The act of engineering cellular pathways has allowed insight into two key properties: precise measurement and control of the input-output relationship of a pathway, and the functional architecture of the pathway constituents themselves.
In the case of signaling and metabolic pathways, the latter has meant insights into the functional significance of specific protein sequences and structures, such as being able to pinpoint exactly which protein domains and which amino acid residues are responsible for mediating specific interactions along the pathway.

1.4.1 The challenges of rewiring pathways.

Initially, pathway engineering was primarily explored in the context of metabolism [Martin 2009]. Metabolic engineering typically involved the use of genetic screens and directed evolution to maximize targeted metabolic fluxes. Synthetic efforts in boosting metabolic fluxes have begun to pay off, as is exemplified in a recent study in which a synthetic protein scaffold was used to draw metabolic enzymes spatially closer to each other [Dueber 2009] - however, it should be noted that this study does not involve any pathway rewiring. By contrast, rational rewiring of pathways involves specific manipulations to the components of the system to achieve a desired outcome. The most crucial aspect of protein and gene structure that synthetic biologists use to rewire pathways is the inherent modularity of many proteins [Janin 1985] (signalling proteins, for example, typically have dedicated domains for recognizing binding partners that act independently of other functional

[Figure 1.4.1: image not reproduced. Recoverable panel labels: (A) example of a light-driven Omp pathway, response plotted against position in the light gradient; (B) changing pathway output with swapped specificity residues, EnvZ/RstB chimeric sequence; (C) changing pathway output with an engineered adapter protein, forskolin input, filopodia (microspikes) output, rewired versus endogenous interactions.]

Figure 1.4.1 Rewiring signaling pathways.
As the central cartoon shows, membrane proteins (light blue) can be engineered to have sensors (green), and can be made to interact with adapters, which in turn can be made to interact with other adapters (dark blue). More formally, the input/output relationship can be controlled in two ways: by changing the stimulus that a receptor is triggered by (shown schematically in panel A), or by changing the transducing molecules that the receptor uses to pass the information from the environment to the cellular interior (panels B and C). Chimeric photoreceptors illustrate the first type of change. Although chimeric receptors have been used previously [Kwon 2003], photoreceptors allow for much higher sensitivity measurements and avoid crosstalk effects. In the case of E. coli, the rewiring is accomplished by transcriptionally fusing the cytosolic signal transduction domain of the pathway sensor, the histidine kinase domain of EnvZ, to cyanobacterial phytochrome 1 (Cph1), thereby resulting in a system in which EnvZ's response regulator, OmpR, can be triggered by light [Levskaya 2005]. Pathway activity is read out by placing the lacZ gene, whose product creates a black compound, under control of the OmpR-dependent ompC promoter. The response to a light gradient input serves as a very precise measurement of the transfer function of the pathway (panel A, lower subpanel). The transfer curve seems to indicate that the pathway operates in a threshold-linear manner, though whether that is due to the phytochrome sensor itself rather than the pathway needs to be explored. Such thresholding could serve to protect the cell from overreacting to small signals. Shimizu-Sato et al.
operated on similar principles in yeast, but instead fused a Gal4 DNA-binding domain (GBD) to the red-light absorbing phytochrome form Pr and a Gal4 activation domain (GAD) to the phytochrome's binding partner phytochrome interacting factor 3 (PIF3), thus bypassing the galactose signaling cascade [Shimizu-Sato 2002]. Any gene of interest can thus be controlled by placing it under the control of the gal1 promoter and simply exposing the cells to red light instead of galactose. Once activated, the signal from the sensor must be specifically transduced to affect specific downstream processes. By studying covariance among residues from interacting proteins, one can use statistical scores such as mutual information to predict which residues determine the specificity of the interaction. As shown in panel B, when specificity-determining residues from the protein RstB (shown in bold) were substituted into EnvZ, resulting in the chimeric protein Chim1, phosphotransfer occurred between EnvZ and RstA rather than EnvZ's normal partner OmpR [Skerker 2008]. Finally, a great deal of signal processing takes place in between the triggering of a sensor by the environment and the output of the pathway, especially in eukaryotes. One major intermediate in eukaryotes is the class of proteins known as guanine nucleotide exchange factors (GEFs), which control morphological pathways. Yeh et al. swapped wildtype GEFs controlling formation of filopodia and lamellipodia for synthetic GEFs that can be induced by the small molecule forskolin and that generate novel morphological outputs [Yeh 2007]. Specifically, GEFs contain an autoinhibitory domain that Yeh et al. substitute with a PKA-responsive inhibitory domain, PDZ. Placing an endogenous pathway under tunable control allows us to characterize crucial aspects of cell biology in quantitative detail. Interestingly, Yeh et al.
find that the morphological output is only manifest probabilistically - it is the fraction of cells that display either filopodia (shown here) or lamellipodia (not shown) that increases with increasing forskolin. (A) lower subpanel is taken from Levskaya A, Chevalier AA, Tabor JJ, Simpson ZB, Lavery LA, Levy M, Davidson EA, Scouras A, Ellington AD, Marcotte EM, Voigt CA. Engineering bacteria to see light. Nature 438: 441-442 (2005). (B) is taken from Skerker JM, Perchuk BS, Siryaporn A, Lubin EA, Ashenberg O, Goulian M, Laub MT. Rewiring the specificity of two component signal transduction systems. Cell 133: 1043-1054 (2008). (C) is reproduced from Yeh BJ, Rutigliano RJ, Deb A, Bar-Sagi D, Lim WA. Rewiring cellular morphology pathways with synthetic guanine nucleotide exchange factors. Nature 447: 596-600 (2007).

domains); most rewiring studies therefore focus on signal transduction and genetic cascades, which are highlighted in Figure 1.4.1. There are fewer examples of achieving metabolic control through specifically designed changes in protein sequence [Rothlisberger 2008, Kaplan 2004]. Changes in the structure of an allosteric site in a metabolic enzyme are more prone to alter the active site of the enzyme than is the case with signaling proteins [Bhattacharyya 2006]. This property allows for regulation of metabolic fluxes by effects such as allostery, but the relative lack of modularity also makes it difficult to forward engineer new behaviors by altering one domain but holding all others constant. Even within signaling systems, however, researchers are presented with severe challenges. Among the major limitations in understanding the signal propagation characteristics of many pathways is confusion over what cue triggers the cascade and whether the cue affects other processes taking place in the cell. Take for example the case of osmotic shock.
While many organisms have dedicated signaling systems to relay information about an osmotic shock to the cell, the presence of abundant osmolyte will affect numerous processes besides signaling, such as global transcription-factor binding [Proft 2004]. The examples described below illustrate how techniques that both specifically and sensitively activate a selected cascade allow one to focus on pathway behavior independently of such off-target effects.

1.4.2 Manipulating the sensors.

One of the most direct ways of rewiring the input-output relationship of a pathway is by directly changing the cue that the pathway sensor responds to. If the cue is chosen such that its level can be directly modulated, then one can measure pathway transfer functions much as was described above for promoters. Armbruster et al., for example, generated a G protein-coupled receptor (GPCR) that responded to a pharmacologically inert compound that could then be titrated in to measure pathway response [Armbruster 2007], while Anderson et al. engineered sensors that can detect changes in tumour-related microenvironments [Anderson 2006]. Alternatively, one can manipulate the ligands that drive pathway activity, as was done by Cironi et al. when they linked together epidermal growth factor (EGF) and mutated forms of interferon α-2a (IFNα-2a) such that the only cells that could correctly respond to the IFNα-2a signal were those that coexpressed the EGF receptor [Cironi 2008]. A particularly striking example of how sensor rewiring can shed light on the operation of a cascade in vivo in a sensitive and specific manner can be found in the use of chimeric photoreceptors, shown in Figure 1.4.1 A.
Two studies used light itself as the cue to drive a signaling system [Levskaya 2005, Shimizu-Sato 2002]; this approach is unlike traditional implementations of light-driven systems [Cruz 2000, Cambridge 2006, Dugave 2003], such as those that use light to activate a small molecule that then activates a desired biological process [Young 2007]. Levskaya et al. engineered the Escherichia coli EnvZ/OmpR two-component system to respond to light, while Shimizu-Sato et al. focused on the Saccharomyces cerevisiae galactose utilization pathway, by fusing a phytochrome and its binding partner to selected pathway proteins. Armed with this engineered cascade, Levskaya et al. proceeded to map out the input-output transfer function with very high precision by exposing a lawn of rewired bacteria to a light gradient. The transfer function measured in Levskaya et al. suggests that a threshold level of the environmental cue is needed before triggering pathway activity. While careful titration of an osmolyte would have allowed precision measurement of the transfer function, such as through the use of microfluidic devices [Taylor 2009], matching the sensitivity of a simple light gradient will be difficult to accomplish. Furthermore, matching the specificity of using light to drive pathway activity is probably impossible. Given the ease with which we can deliver precisely controlled light signals to cells compared to delivering chemical signals, the Levskaya et al. and Shimizu-Sato et al. studies can be easily extended to perform tasks such as measuring Bode plots, as was recently done for the yeast osmoresponse system [Mettetal 2008, Hersen 2008]. 1.4.3 Manipulating sensor/transducer interactions. Whereas swapping the sensor in a signaling pathway is a way to engineer the input side of the input-output relationship, changing the identity of the molecules that carry the signal from sensor to downstream effectors can affect the output side. 
In fact, given the high degree of sequence homology between many sensor/transducer pairs, there is great interest in developing a detailed description of sensor/transducer interactions to understand the multiple ways in which pathways prevent crosstalk [Ubersax 2007] - for example, by using scaffold proteins [Harris 2001], mutual inhibition [McClean 2007], and kinetic insulation [Behar 2007]. This is the basic strategy that was followed by Skerker et al. to rewire the EnvZ/OmpR system [Skerker 2008]. This study made heavy use of the large amount of sequence data available for two-component systems to computationally detect individual amino acid residues that covary between cognate pairs. Specifically, they calculated the mutual information between all possible pairs of residues from sensors and response regulators and found the pairs that maximized mutual information. These pairs were hypothesized to be the specificity-determining residues. Remarkably, they then substituted a given sensor's specificity-determining residues for a different sensor's specificity-determining residues, keeping all other residues intact, and thereby activated the latter sensor's pathway with the former sensor's trigger. Furthermore, they performed the same rewiring feat by substituting specificity-determining residues in the response regulator; their results are highlighted in Figure 1.4.1 B. For now, the relative paucity of sequence data precludes the use of this technique for other systems, such as eukaryotic homologues of two-component systems. Nevertheless, this study provides a framework in which one can go beyond crude domain-level protein engineering all the way to molecular details. A particularly enticing possibility, which is explored in Skerker et al., is to unite the bioinformatically guided rewiring approach with data on crystal structure, especially structures of protein-protein complexes.
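The mutual-information step described in this section can be sketched on a toy alignment. The following is a minimal sketch under stated assumptions: the four paired "kinase" and "regulator" sequences are invented, the function names are hypothetical, and the score is the plain plug-in estimate of mutual information with no correction for small samples or phylogenetic bias; it is not the pipeline used by Skerker et al.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Plug-in mutual information (in bits) between two alignment columns."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


def rank_pairs(kinases, regulators):
    """Score every (kinase column, regulator column) pair, highest first.

    kinases[i] and regulators[i] are assumed to be a cognate pair.
    """
    scores = []
    for i in range(len(kinases[0])):
        col_i = [seq[i] for seq in kinases]
        for j in range(len(regulators[0])):
            col_j = [seq[j] for seq in regulators]
            scores.append((mutual_information(col_i, col_j), i, j))
    return sorted(scores, key=lambda item: -item[0])


if __name__ == "__main__":
    # Invented toy data: kinase column 0 covaries with regulator column 0.
    toy_kinases = ["AKL", "AML", "TKL", "TML"]
    toy_regulators = ["DE", "DE", "SE", "SE"]
    for score, i, j in rank_pairs(toy_kinases, toy_regulators):
        print(f"kinase col {i} / regulator col {j}: {score:.2f} bits")
```

In this toy example only kinase column 0 and regulator column 0 covary, so that pair scores 1 bit and every other pair scores 0; on real alignments the top-scoring pairs would be the candidates for specificity-determining residues.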
Using a crystal structure of a complex made up of proteins similar to EnvZ and OmpR, Skerker et al. show that the specificity-determining residues for both sensor kinase and response regulator probably occur at the interface of the two proteins, suggesting that the coevolving residues interact physically rather than allosterically. Combining structural and rewired pathway data can indicate how to explore further the numerous systems in which docking site interactions have been identified [Tatebayashi 2003, Remenyi 2006]. Synthetic pathways and crystallography together can be key in unraveling fundamental biophysical interactions underlying signal transduction.

1.4.4 Manipulating the intermediate transducers.

Altering the way in which a sensor interacts with its environmental cues and its immediate downstream signaling partner represents the most obvious way to manipulate signal transduction. The next most obvious idea is to follow the signal and tackle the intermediate transducers in the pathway. Howard et al., for example, took the proapoptotic Fadd death domain and fused it to Grb2 and ShcA, members of the receptor tyrosine kinase (RTK) pathway; as a result, RTK-triggered signals could be used to drive apoptosis [Howard 2003]. At the adapter level, one key target for pathway engineering is the family of guanine nucleotide exchange factors (GEFs) that regulate the actin cytoskeleton through the Rho family of GTPases [Yeh 2007]. Yeh et al. exploited the presence of an autoinhibitory domain in GEFs that can be swapped for an inhibitory domain that itself is under the control of a small molecule. Yeh et al., shown in Figure 1.4.1C, swap wildtype GEFs controlling formation of filopodia and lamellipodia for synthetic GEFs that can be induced by the small molecule forskolin. In this study, Yeh et al.
daisy-chained two GEFs in series and show that the combined, and thus longer, GEF system is both more sensitive to inducer and displays a sharper separation between ON and OFF states. These results are exactly what one would expect from previous synthetic studies examining the sensitivity and sharpness of transcriptional cascades as the cascade length is varied [Hooshangi 2005]. As seen above in the case of apoptosis in the RNAi switch, placing an endogenous pathway - morphological in this case - under tunable control allows us to characterize crucial aspects of cell biology in quantitative detail.

1.4.5 Connecting pathway rewiring to evolvability

Another interesting and complementary theme that emerges from rewiring studies is how differently rewired circuits can yield the same output. The library of combinatorially synthesized gene networks constructed by Guet et al. contains instances of systems that have different connectivity properties but the same Boolean truth table and those that have the same connectivities but different Boolean truth tables [Guet 2002]. Along these same lines, Isalan et al. show that randomly rewiring the transcriptional network of E. coli results in growth defects in only 5% of the rewirings, a level of tolerance difficult for manmade systems to replicate [Isalan 2008]. The idea of rewiring a circuit but maintaining its logic seems also to have been employed in the evolution of the mating type switch in yeast, where Candida albicans a genes activate the a mating type while Saccharomyces cerevisiae alpha genes repress the a mating type [Tsong 2006]. Theoretical studies on the evolvability of biochemical networks suggest that networks that are wired differently but produce the same output constitute a 'neutral space' that allows flexibility in the design of networks to perform some function and thus eases the way for phenotypic changes to take place [Wagner 2007, Gerhart 2007].
Continuing in the theme of using rewired pathways to highlight system flexibility, Antunes et al. transplant a bacterial two-component system into the eukaryotic plant Arabidopsis thaliana. The prokaryotic transcriptional activator manages to cross into the nucleus to drive gene expression, fueling speculation that pathway evolution can even be driven by horizontal gene transfer between organisms from different kingdoms of life [Antunes 2009].

1.5 Synthetic feedback networks.

Synthesis has uncovered several rules governing how DNA is turned into proteins and then how proteins interact to generate diverse phenotypes without the need for a combinatorial explosion in the number of genes. In the examples considered above, however, the flow of information was largely an ordered sequence of events: diverse outcomes in these systems resulted from combinatorial rearrangements of modular parts. The complexity of naturally occurring cellular networks, however, is often dominated by feedback and feedforward loops. By incorporating these features, synthetic circuits have also taught us about the dynamics and systems-level function of more complex molecular interactions. Initial work in this area primarily focused on the identification [Alon 2007] and experimental characterization of simple motifs that occur frequently in genetic and signaling networks. In this first generation of synthetic biology, studies mimicked natural systems and confirmed theoretical expectations that positive feedback systems can be bistable [Maeda 2006, Becskei 2001, Ozbudak 2004, Isaacs 2003], and that negative feedback systems are noise resistant [Becskei 2000] and can speed up circuit dynamics [Rosenfeld 2002]. More recently, engineered feedback loops have been extended to signaling and metabolic systems by generating novel protein-protein and genetic interactions to explore how signaling pathways set their sensitivity to input and how they tune their kinetics [Bashor 2008, Fung 2005].
One concrete way, highlighted in section 1.5.2, in which synthetic circuits are helping us approach more complicated interaction networks is by serving as benchmarks against which theoretical and computational tools can be tested [Cantone 2009, Ellis 2009].

1.5.1 Oscillatory behaviors.

To make the lessons concrete, we focus on how biological parts can be arranged to create a biologically relevant dynamical system: an oscillator. Cells display a range of oscillatory behaviors. Some oscillators have tunable periods, such as the dependence of the cell cycle period on the nutrient levels available, whereas others are more robust to changes in parameters, such as the circadian oscillator. Other examples include oscillatory signaling from nuclear factor kappa B (NF-κB), which governs its control over gene expression [Nelson 2004], and p53-murine double minute 2 (Mdm2), which oscillates to drive the DNA damage response [Geva-Zatorsky 2006]. How can one construct a robust yet tunable oscillator in a living cell? The construction of in vivo oscillators provides a particularly beautiful example of how the interplay of analysis of naturally occurring systems, modeling, and construction of synthetic systems can yield deep insights into biological phenomena. The story here begins with the observation that the simplest oscillator design, a delayed negative feedback, cannot sustain oscillations beyond a small number of periods when operating in a cell. Instead, as highlighted in Figure 1.5.1, naturally occurring oscillators hinted at the crucial role of interlocking positive feedback in maintaining a robust oscillator, which was employed in the genetic oscillators recently synthesized by Stricker et al. [Stricker 2008] and Tigges et al. [Tigges 2009]. As the studies in Figure 1.5.1 show, oscillators, in addition to being fun to watch, are among the simplest in vivo systems that can be used to understand interactions between different types of feedback loop.
While systems biologists are increasingly comfortable with our understanding of simple motifs, we cannot say the same about interactions of those motifs. It is worth considering, for example, that even for interlocking positive and negative feedback loops multiple behaviors are possible as one varies the parameters of the system and includes stochastic effects. For example, in the yeast galactose utilization pathway, the negative feedback effectively counteracts the positive feedback and limits the parameter space over which the system is bistable [Acar 2005]. Beyond two or three loops, however, we are usually at a loss to describe the system - especially a natural one that may contain even more interactions than is being accounted for. Synthetic circuits are helping us systematically understand how motifs interact to generate ever-richer behavior.

[Figure 1.5.1: image not reproduced. Recoverable panel labels: (A) initial synthetic design, time course in minutes; (B) natural cell cycle circuit (Cyclin B, Cdc25, Cdc20, CDK1-Cyclin B, Myt; mitotic phosphorylations), analysed to improve on the initial synthetic design; (C) findings from the dual-feedback natural network implemented in synthetic systems, microbial and mammalian, time courses in minutes.]

Figure 1.5.1 Building a robust, tunable oscillator in a living cell. The simplest way to achieve oscillation is through use of a delayed negative feedback loop [Conrad 2006]. Imagine that you construct a system with two genes, A and B, and that protein A activates the transcription of B whereas protein B inhibits the transcription of A. Turning on gene A leads to build up of the protein A, but also of protein B. After some time, enough protein B builds up to cause protein A levels to decrease - this then results in a decrease in protein B levels, which allows protein A levels to rise, and so on.
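The two-gene loop described above can be explored with a toy deterministic model. This is a minimal sketch under stated assumptions: the rate equations, the Hill-type repression term, the parameter values (`alpha`, `n`, `beta`) and the forward-Euler integrator are illustrative choices that do not come from any of the cited studies, the only delay is the lag introduced by protein B accumulating, and stochastic gene expression is not modelled.

```python
def derivatives(a, b, alpha=2.0, n=2, beta=1.0):
    """Rates of change for protein A (repressed by B) and protein B (activated by A).

    dA/dt = alpha / (1 + B**n) - A   # B inhibits production of A
    dB/dt = beta * A - B             # A drives production of B
    Degradation rates are set to 1, so time is in units of protein lifetime.
    """
    da = alpha / (1.0 + b ** n) - a
    db = beta * a - b
    return da, db


def simulate(t_end=20.0, dt=0.01, a0=0.0, b0=0.0):
    """Forward-Euler integration; returns a list of (time, A, B) tuples."""
    steps = int(round(t_end / dt))
    a, b = a0, b0
    trajectory = [(0.0, a, b)]
    for i in range(1, steps + 1):
        da, db = derivatives(a, b)
        a += dt * da
        b += dt * db
        trajectory.append((i * dt, a, b))
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    peak_a = max(a for _, a, _ in traj)
    _, final_a, final_b = traj[-1]
    print("peak A:", round(peak_a, 3))
    print("final A, B:", round(final_a, 3), round(final_b, 3))
```

With these assumed parameters the steady state is A = B = 1. Starting from zero, A overshoots that level before B catches up and pulls it back, and the ringing then dies away, so this simplest deterministic version does not sustain oscillations on its own.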
However, when one builds a simple negative feedback circuit as described above, the oscillations are in general not robust. In the repressilator of Elowitz and Leibler [Elowitz 2000], which consists of a cycle of 3 transcriptional repressors and a fluorescent protein readout (panel A), the oscillators fall out of phase and damp out following a small number of cycles. Swinburne et al. engineered an autoinhibitory circuit in which the delay timescale in the negative feedback was set by the length of an intron engineered into the construct; they also find that even for a given intron length the oscillation period varies widely from cell to cell [Swinburne 2008]. The source of the damping in both cases can be found in the stochastic nature of gene expression: random amounts of protein produced at random times result in uncoordinated behavior that causes the components making up the oscillator to fall out of phase. The synthetic genetic oscillator was missing a key ingredient. A strong hint as to the identity of that key ingredient was provided by the analysis of naturally occurring oscillators. In particular, the cell cycle oscillator contained interlocking positive feedback loops in addition to the core negative feedback loop that was generally assumed to generate the oscillations (panel B). Experiments in the cell cycle of frog embryos along with computational simulations suggested that the positive feedback loops could stabilize two states that the system would cycle between via the negative feedback loop [Pomerening 2003, Pomerening 2005, Tsai 2008], creating a relaxation oscillator. Could something as simple as positive feedback be responsible for robustness in genetic oscillators in organisms as diverse as bacteria to mammals? And can positive feedback enable cells to independently tune the amplitude and frequency of the oscillations? 
Two recent studies, in agreement with earlier work [Atkinson 2003], indicate that coupling positive and negative feedback is indeed sufficient to ensure stable oscillations. Stricker et al. implemented a transcriptional circuit in E. coli that integrates a positive and negative feedback loop in a common inducible promoter [Stricker 2008] (panel c), while Tigges et al., working in mammalian cells, used transcriptional positive feedback and negative feedback mediated by transcription of an antisense RNA [Tigges 2009]. Experimentally, Stricker et al. observe that the dual feedback oscillator is robust to a number of perturbations, including changes in inducer level and temperature; these features could not be adequately described by their initial modeling of this circuit [Hasty 2002]. It was only through the addition of various biological steps in the negative feedback, such as TF-DNA binding and multimerization, that the model could reproduce the robustness of the oscillator to parameter changes. The authors conclude that from the point of view of the oscillator's operation, what matters is not the details of what processes make up the negative feedback but instead that the negative feedback includes a delay; by contrast, the positive feedback only ensures robustness and tunability. The system built by Tigges et al. shares many of these details, with the delay in the negative feedback coming from post-transcriptional repression of the circuit's transcriptional activator, but the system itself is sensitive to molecular details such as the relative ratios of the circuit components - for some ratios of circuit components oscillations are abolished. (A) is taken from Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptional regulators. Nature 403(6767): 335-338 (2000) (B) is reproduced from Tsai TY, Choi YS, Ma W, Pomerening JR, Tang C, Ferrell JE. Robust, tunable biological oscillations from interlinked positive and negative feedback loops. 
Science 321(5885): 126-129 (2008). (C) is taken from Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J. A fast, robust and tunable synthetic gene oscillator. Nature 456(7221): 516-519 (2008) and Tigges M, Marquez-Lago TT, Stelling J, Fussenegger M. A tunable synthetic mammalian oscillator. Nature 457(7227): 309-312 (2009).

1.5.2 Using synthetic circuits as modeling benchmarks

One of the most important functions that synthetic circuits have served has been their use in building and refining analytic and computational models of biological systems. When modeling a gene or protein circuit, one must make a series of choices. The first choice concerns the scale at which one wishes to model the input/output relationship; typically this choice boils down to whether one wants to view the system as a Boolean logic operator or a dynamical system. The dynamical system framework can be further broken down along 2 dimensions, depending on whether spatial or stochastic effects need to be taken into account. Spatial effects can usually be ignored when the biochemical reactions that make up the system occur on timescales slower than the time it takes to mix the reactants by diffusion. Stochastic effects can usually be ignored if the dynamical variables of the system can be represented as continuous rather than discrete entities; that is, when we are interested in the concentrations of a molecule rather than the number of molecules. Synthetic circuits have been used to explore all of these issues in some detail. Until recently, the choice of modeling methodology was based on one's best guess for which effects were important to include, along with post-hoc comparison of the model with data. Detailed comparisons of different modeling paradigms have been lacking. Cantone et al. [Cantone 2009] and Ellis et al.
[Ellis 2009] have offered the field some guidance through the introduction of benchmark networks - that is, networks with a defined topology that interact only minimally with endogenous systems, against which proposed modeling methods can be tested. In particular, Cantone et al. create a relatively sophisticated synthetic transcriptional network of 5 genes that serves as an oracle that is queried by different perturbations, such as overexpression of the network genes and induction by transcriptional inducers. Finally, they test methods based on ordinary differential equations, Bayesian inference, and information theory to uncover the connectivity of the network; they find that differential equations and Bayesian inference were better at uncovering the functional relationships than the information theory-based approach, as expected for such a small network. Cantone et al. thus provide an example of how synthetic circuits can be helpful in refining our understanding of large-scale biological systems by improving the algorithms we use to analyse genomic and proteomic datasets.

1.6 The ultimate goal: spatiotemporal control

If there is one context in which all of the various biological processes tackled by synthetic biologists come together, it is in the engineering of spatiotemporal interactions, both intracellular and intercellular. Engineering cell-cell interactions in a rational manner requires us to master the rational manipulation of communication devices (signaling pathways), to use promoters to specify desired transcriptional responses to a given signal strength, and to arrange these elements in a circuit architecture that robustly encodes the function we are trying to implement. If we hope to systematically build up our understanding of functional compartments of the cell, development, and ecology, then it is imperative that we integrate lessons learned from diverse areas of synthetic biology.
1.6.1 Uncovering intra- and intercellular processes

Perhaps the most striking feature of the eukaryotic cell is its organization into functional subcompartments: the nucleus for genetic material, mitochondria for respiration, endoplasmic reticulum (ER) for protein production, etc. For the eukaryotic cell to accomplish its tasks, the behavior of these compartments must be coordinated in space and time. A recent study in S. cerevisiae has yielded new insight into how the mitochondrion and ER communicate, by using a genetic screen coupled with a synthetic construct that is designed to specifically tether the two organelles [Kornmann 2009]. Kornmann et al. find that the synthetic tether complements mutations in maintenance of mitochondrial morphology 1 (Mmm1), mitochondrial distribution and morphology 10 (Mdm10), 12 (Mdm12), and 34 (Mdm34), thus identifying these 4 proteins as constituents of a complex that ties the organelles together and allows the exchange of phospholipids (needed by the mitochondrial membranes) and calcium (which acts as a signaling molecule between the two).

Two properties that we still cannot reliably engineer are the dynamics of a circuit and spatial control. Both these behaviours have one major biological process in common: development. In anticipation of one day tackling developmental processes and other intercellular pathways, some groups have designed circuits to spatiotemporally control gene expression. Using a network mimicking naturally occurring feedforward circuits, for example, Basu et al. have designed cells that can respond to the signal acyl-homoserine lactone (AHL) from nearby cells but ignore equal concentrations of this signal from faraway cells [Basu 2004]. This feat is accomplished by a key property of the feedforward network in the signal receiving cells - it responds not only to the concentration of the signal but also to the rate of increase of that concentration.
Sending cells located near a receiving cell raise the local AHL concentration more rapidly than distant sending cells do. Basu et al. built on this work to create a circuit that could respond to only a narrow range of AHL signal, much like a band filter, thereby exhibiting another feature of developmental processes [Basu 2005]. The exquisite coordination that is a hallmark of development also almost certainly requires the use of networks that can act as genetic timers and counters. Friedland et al. have provided a design for a network that constitutively pumps out GFP mRNA transcripts that are translationally inhibited but whose inhibition can be lifted by a transactivating RNA (taRNA) [Friedland 2009]; the transcription of the taRNA is inducible by arabinose and so the network output, in the form of discrete amounts of GFP, represents pulses of arabinose. Finally, Isalan et al. have gone as far as building a mock-up of a realistic D. melanogaster embryo, modeling the syncytium as a collection of paramagnetic beads coated with DNA, in which genetic networks analogous to the gap gene system can be placed [Isalan 2005]. Interestingly, this 'minimal embryo' leads the authors to suggest that pattern formation in the real embryo requires activator molecules to propagate faster than inhibitors, implying that the gap system is a reaction-diffusion system that uses a mechanism quite unlike Turing instabilities to lay down patterns. As the authors point out, this is hardly surprising given that the gap system uses nonhomogeneous initial conditions in the form of spatially localized components deposited in the insect egg and that the activator is not autocatalytic. Whether these lessons carry over to their natural settings remains to be seen.

1.6.2 Modelling ecological interactions
As is the case with the band filter circuits described above, most synthetic circuits involved in cell-cell communication make use of the quorum sensing pathway; one such circuit is highlighted in Figure 1.6.1 [Tanouchi 2009]. These circuits usually borrow components from organisms like Vibrio fischeri, although attempts at incorporating other systems have also been successful [Bulter 2004, Chen 2005]. Examples of using such systems to study natural phenomena are more limited. Balagadde et al., by adapting an earlier design [You 2004], used the quorum sensing proteins to drive expression of an antibiotic to create a synthetic predator-prey system [Balagadde 2008], while Brenner et al. used a similar system to study the ability of cells to signal in the context of a biofilm [Brenner 2007].

[Figure 1.6.1 image not recoverable from extraction. Panel title: Synthetic predator-prey system. x-axis: Time (hr).]

Figure 1.6.1 Building a synthetic predator-prey system from quorum sensing components: using synthetic circuits to engineer cell-cell interactions. Studies of ecology and evolution are often dependent on carefully characterizing the interactions of different organisms. In a natural setting, however, such data collection often proves to be noisy at best and impossible at worst. At the same time, mathematical models in theoretical ecology and evolutionary biology are among the most sophisticated in all of the life sciences. Laboratory-scale experiments on cellular interactions could quantitatively test some of the remarkable predictions and open the way to new theory. Among the most elementary interactions in nature is the predator-prey interaction. The prey in this case produces the quorum sensing pathway protein LuxI, which is engineered to drive a transcriptional cascade in the predator that produces CcdA, which inhibits the DNA replication inhibitor CcdB, thereby allowing the predators to replicate.
Meanwhile the predator produces the quorum sensing pathway protein LasI, which activates CcdB in a LasR-dependent manner in the prey. CcdB expression in the prey prevents it from replicating. The cyclic dynamic is very similar in style to genetic oscillators: high levels of prey lead to low levels of CcdB and thus high levels of predator; high levels of predator lead to high levels of CcdB and thus low levels of prey, which subsequently leads to high levels of CcdB in predators, etc. As shown in Balagadde et al., predator-prey interactions can thus result in limit cycle oscillations about an unstable fixed point of the dynamics, most commonly studied in the framework of the Lotka-Volterra model. Reproduced with permission from Balagadde FK, Song H, Ozaki J, Collins CH, Barnet M, Arnold FH, Quake SR, You L. A synthetic Escherichia coli predator-prey ecosystem. Mol Syst Biol 4: 187 (2008).

Chuang et al. recently have used the engineered circuits for cell-cell interactions shown in Figure 1.6.2 to study the evolutionary phenomenon of Simpson's paradox, in which the cells that provide a useful product to the population make up a diminishing fraction of the population but nevertheless increase in absolute number by promoting population growth [Chuang 2009]. Gore et al. provide another example of synthetic ecology in their study of the evolutionary game dynamics underlying sucrose metabolism in yeast [Gore 2009]. The study establishes that sucrose metabolism can be thought of as a snowdrift game, in which both cells that metabolize sucrose (cooperators) and those that do not (cheaters) stably coexist in a population, thereby opening an avenue to show how competition between different alleles can actually promote diversity in a population.
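The stable coexistence of cooperators and cheaters in a snowdrift game can be illustrated with replicator dynamics. The sketch below uses the generic textbook snowdrift payoffs (a shared benefit, with the cost split when two cooperators meet), which is an assumption made here for illustration; it is not the payoff structure measured by Gore et al., and the function names and the values of `benefit` and `cost` are hypothetical.

```python
def cooperator_fitness(x, benefit, cost):
    """Expected payoff of a cooperator when a fraction x of cells cooperate.

    Textbook snowdrift payoffs (assumed): benefit - cost/2 against another
    cooperator, benefit - cost against a cheater.
    """
    return x * (benefit - cost / 2.0) + (1.0 - x) * (benefit - cost)


def cheater_fitness(x, benefit):
    """Expected payoff of a cheater: benefit only when paired with a cooperator."""
    return x * benefit


def evolve_cooperator_fraction(x0, benefit=1.0, cost=0.5, dt=0.01, steps=20000):
    """Forward Euler integration of dx/dt = x (1 - x) (f_coop - f_cheat)."""
    x = x0
    for _ in range(steps):
        gap = cooperator_fitness(x, benefit, cost) - cheater_fitness(x, benefit)
        x += dt * x * (1.0 - x) * gap
    return x


if __name__ == "__main__":
    for start in (0.05, 0.95):
        print(start, round(evolve_cooperator_fraction(start), 4))
```

Under these assumed payoffs the two fitnesses are equal at a cooperator fraction of (benefit - cost) / (benefit - cost/2), and populations started with mostly cheaters or mostly cooperators converge to that same mixed state. This is the sense in which the game supports coexistence rather than fixation of either strategy.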
[Figure 1.6.2 image not recoverable from extraction. Panel title: Simpson's paradox in engineered interactions. x-axes: initial (p) and final (p' = p + Δp) producer proportions; initial (p) and final (p' = p + Δp) global producer proportions.]

Figure 1.6.2 Simpson's paradox observed in an engineered set of cell-cell interactions. Simpson's paradox is a statistical phenomenon that captures the fact that even if the producer of a common good grows at a slower rate in all given subpopulations than a nonproducer, it can nevertheless make up an increasing fraction of the population as a whole. While Simpson's paradox usually arises as a result of misinterpretation of data, natural populations can in fact display heterogeneities in sample size that often underlie the paradox. The particular implementation in Chuang et al. casts bacteria that generate the autoinducer Rhl as the producer. Both producers and nonproducers respond to this Rhl, which is rewired to activate synthesis of a chloramphenicol resistance gene catLVA. As shown in the middle panel, in each subpopulation the fraction of producers decreases, but as the bottom panel shows, in the global population the fraction of producers actually increases, thus satisfying Simpson's paradox. Reproduced from Chuang JS, Rivoire O, Leibler S. Simpson's paradox in a synthetic microbial system. Science 323(5911): 272-275 (2009).

Studies such as these on fundamental aspects of ecology and evolution are difficult to carry out in natural environments due to the multiplicity of confounding factors, but synthetically engineered populations provide a way to cleanly separate different effects. More generally, studies on engineered populations highlight not only the ability to connect the molecular details of a network to population level effects but also the utility of abstracting away from such details and focusing on the interactions between cells. Taking sucrose metabolism from Gore et al.
as an example, it was possible to predict population level responses to changes in the cost of cooperation just on the basis of the game theoretic characterization of the interaction between cheaters and cooperators, with no direct knowledge of the molecular details. Indeed, this approach of constructing synthetic systems dedicated to characterize how cells interact can be very useful in cases such as cancer dynamics, where the underlying molecular details are either poorly understood or exceedingly complicated but population level measurements are both feasible and relevant to understanding the phenomenon. 1.7 Perspectives The synthetic biology community has made great strides in working out some of the most basic features of regulatory networks and cellular pathways. We are exerting greater control over the process of gene expression, and we have a wealth of information regarding the effects of network topology on system function. Topological details such as connectivity, cascade length, feedback structure have been explored. But there is much work yet to do before we can treat biological circuits like we treat electronic ones. In the future, we can expect to see that the synthetic circuits deployed in cells will be of growing complexity, and should increasingly integrate diverse processes, as has been done for genetic regulation and metabolism [Fung 2005]. We should also expect to see increasing contact with large-scale cell biology, such as through the creation of synthetic organelles, whose in vivo construction will be guided by synthetic regulatory networks. 
Progress along these fronts is limited by many of the same obstacles found across the sub disciplines of biology: we are still in need of more ways to specifically modulate the expression level of genes of interest, the activity state of pathways of interest, and we require more sensitive techniques (ideally at single-molecule resolution) to measure the abundance of mRNAs, proteins and specifically modified proteins in live cells. One of the main ways in which methodological advances will be useful is in tightly constraining models of biological networks. Obstacles to rapidly moving synthetic circuits from the blackboard to the cell can often be traced to the fact that the system under study does not behave as initial modeling indicates. This, in turn, is usually due to the fact that the systems are underdetermined, meaning that many different models can usually describe the circuit data. Higher resolution data, both in terms of abundances of the relevant molecules and as a function of time, will constrain the space of possible models significantly and should allow for more rational, predictable design processes. Assuming these technical obstacles are overcome, in a future where man-made circuits increasingly look like their byzantine natural counterparts, it is not unreasonable to expect nearly synthetic or fully synthetic cells to make their appearance. At these extreme levels of complexity, it may prove difficult or even unhelpful to mechanistically model the relevant systems. It is likely, however, to prove useful to compare the performance of natural and synthetic circuits and cells in a rigorous fashion - perhaps through the formulation of a Turing test for synthetic biology - as differences in performance can point to possible design principles. Looking back on the various examples of circuits and processes that synthetic biologists have examined, we can see that the utility of synthetic circuits can be measured along 3 different dimensions. 
First, synthetic circuits can serve as easily manipulable toy models that we can characterize in exacting quantitative detail and thereby build intuition for how similarly structured natural networks operate. Second, synthetic circuits can be used to allow us control over natural networks and so make discoveries about the molecular and cell biology underlying important physiological processes. Third, on a more conceptual level, synthetic systems provide clear evidence that one can generate complexity by rearranging even well-known parts, thus bolstering claims on the evolvability of natural systems. While we are still very far from rationally assembling a living organism from scratch, and far from understanding all the design principles according to which biological networks operate, the first generation of synthetically designed systems has offered us a glimpse of the need to weave together tools from disparate processes, from transcriptional regulation to signal transduction, in order to approach fundamental questions in modern biology.

Chapter 2 Robust yet tunable regulatory networks: the case of the yeast osmosensing pathway

2.1 Summary

Genetic variation underlies much of the phenotypic diversity observed in nature. However, the functional robustness of cellular networks to coding sequence variations of their component genes is often difficult to quantify. Here, we challenged the osmosensing signaling pathway in the budding yeast Saccharomyces cerevisiae by systematically swapping each component gene, except for its terminal MAPK gene HOG1, with its orthologs from various yeast species, and measured their abilities to recapitulate wild-type signaling. We found that signaling was significantly altered by sequence variation in the downstream MAPK cascade genes, but remained relatively robust to changes in the upstream phosphorelay components.
These experimental findings are consistent with a computational sensitivity analysis that predicts that HOG signaling is most sensitive to kinetic parameter changes involving the MAPK cascade genes. We then performed evolution experiments on yeast cells with hyperactive HOG signaling, and found that they rapidly adapted and restored wild-type fitness and signaling predominantly due to point mutations in the MAPK genes. Our results suggest that the skewed sensitivities of signaling dynamics to underlying component variations is a direct consequence of its biochemical circuitry, and might impact the evolvability of this network. 2.2 Introduction Remarkably, organisms can exhibit phenotypic robustness against a diverse array of stochastic, environmental and genetic variation. Genetic robustness in particular, can endow organisms with the ability to maintain phenotypic stability against genetic perturbations [de Visser et al., 2003], thus making them less vulnerable to mutations. Despite its apparent prevalence in nature, understanding of genetic robustness and its consequence for evolvability remain elusive. Mathematical modeling has played a pivotal role in advancing the study of robustness [Barkai 1997, Alon 1999, von Dassow 2000, Kitano 2004]. Using quantitative modeling, genetic variation can be simulated by varying the kinetic parameters in the dynamical model, and the system's robustness to genetic perturbations can be predicted. Despite the widespread utility of such computational approaches, there has been a dearth of experimental studies that either directly or comprehensively test these predictions. Here, we take a three-pronged strategy combining experimental, computational and evolutionary approaches to investigate the robustness of cellular signaling to genetic perturbations of its underlying molecular network. 
We employ the high osmolarity glycerol (HOG) pathway in the budding yeast Saccharomyces cerevisiae, which forms a core module of the hyperosmotic shock response [Hohmann 2002]. This pathway is especially well suited for robustness analysis because its molecular components and interactions have been well characterized [Brewster 1993, Maeda 1994, Posas 1996, Krantz 2009]. Moreover, its network input (extracellular osmolyte concentration) and output (Hog1 activity) can be quantitatively measured and manipulated. The HOG pathway consists of a phosphorelay chain of proteins (Sln1, Ypd1 and Ssk1) that acts on a downstream MAP kinase cascade (Ssk2, Pbs2 and Hog1) to ultimately modulate Hog1 activity, as shown in schematic form in Figure 2.2.1. When the cell faces a hyperosmotic shock, the turgor pressure on the cell membrane drops [Posas 1996], reducing the autophosphorylation of Sln1. Lack of phosphorylated Sln1 limits the phosphate current in the direction of Ssk1, ultimately resulting in build-up of dephosphorylated Ssk1. Dephosphorylated Ssk1 catalyzes the phosphorylation of the MAPKKK Ssk2 and results in activation of the remainder of the MAPK pathway. Upon activation, Hog1 translocates into the nucleus [Ferrigno 1998] to initiate transcriptional changes in response to the osmotic shock [O'Rourke 2004].

[Figure 2.2.1 image not recoverable from extraction. Visible labels: Sln1, Hog1.]

Figure 2.2.1 Schematic depiction of the Saccharomyces cerevisiae HOG pathway. The regulatory arrows depicted in grey indicate [remainder of caption missing in source]

The study outlined in this chapter consists of three stages. First, we systematically generate mutant strains in which each component gene upstream of HOG1 is replaced by its orthologs from various yeast species, and then we measure their corresponding signaling dynamics. We find that while signaling is tolerant of coding sequence variation in the upstream phosphorelay genes, it is significantly less robust against changes in the MAPK cascade components.
Second, we show that the experimental results are consistent with a computational robustness analysis of the pathway, which predicts that HOG signaling is most sensitive to kinetic parameter changes involving the MAPK cascade genes. Third, we underexpress the YPD1 gene in the HOG pathway to induce hyperactive signaling, and we evolve nine independent lines of the yeast strain underexpressing YPD1 using turbidostats [Acar 2008]. Strikingly, we find that mutations in the MAPK cascade genes, i.e. PBS2 and SSK2, dominate the genetic changes among the pathway genes across the majority of the independently adapted populations, consistent with the computational robustness analysis. Importantly, we show that these mutations are largely responsible for the down-regulation of the hyperactive signaling and the improved fitness in the evolved strains.

2.3 HOG signaling displays varied sensitivity to ortholog substitutions

To characterize the effects of sequence variations in the genes of the HOG pathway on signaling, we utilized the natural variation in the HOG pathway genes across different yeast species and systematically generated mutant strains in which each pathway gene except HOG1 was replaced with its ortholog from two evolutionarily diverged yeast species, i.e. Candida glabrata and Candida albicans. Then, we measured their abilities to recapitulate wild-type signal propagation under a hyperosmotic shock. By using presumably functional orthologs rather than randomly mutated sequences, we more efficiently searched the space of sequences that had a reasonable chance of complementing wild-type behavior. Compared with S. cerevisiae, all C. glabrata pathway proteins had sequence ClustalW similarity scores between 50 and 60, except Ssk1, which scored 37. C. albicans, being evolutionarily more distant from S. cerevisiae than C. glabrata, displayed lower sequence conservation for all the pathway proteins, ranging from 22 to 46.
To estimate the degree of protein functional changes manifested by the sequence divergence of the orthologs, we computed for each ortholog the percentage of amino acid changes at highly conserved residues ("functional score") identified from comparative genomic analyses of the HOG pathway proteins across various fungal species (Supplemental Data, Krantz et al., 2006). We calculated the "functional score" as the percentage of amino acid changes in the orthologous sequence compared to the S. cerevisiae sequence at conserved residues identified through multiple sequence alignment of orthologous genes from twenty fungal species (Krantz et al., 2006). Here, we consider a residue as being conserved if either all the residues at that position are identical across all sequences in the alignment, or if conserved or semiconserved substitutions are observed.

In order to assay for pathway activity, we made use of the fact that pathway activation leads to increased localization of the Hog1 protein to the nucleus. To this end we fused the yellow fluorescent protein (YFP) to the C-terminal domain of Hog1 to track its subcellular localization, and we labeled the nucleus of each single cell by using strains that contained a fusion of the nuclear pore protein Nrd1 to the red fluorescent protein (RFP). To drive pathway activity, we exposed the cells to standard dropout media containing 0.4 M sodium chloride (NaCl); the media used to culture the cells was controlled by means of a flow-cell device which allowed for computer-controlled switching between media with and without NaCl as shown in Figure 2.2.2.

[Figure 2.3.1 image not recoverable from extraction. Labels: top view, side view, slide, gasket, inlet, cells, coverslip with ConA, computer control.]

Figure 2.3.1 Schematic diagram of flow-cell setup. Reproduced from [Mettetal 2008].
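The "functional score" defined earlier in this section can be sketched as a short calculation over aligned sequences. This is a simplified illustration under stated assumptions, not the analysis pipeline used in the study: a position is treated as conserved only if it is identical across a reference panel of aligned sequences (the conserved and semiconserved substitution classes mentioned in the text are ignored), the ortholog being scored is assumed not to be part of that panel, and the function names and toy sequences are hypothetical.

```python
def conserved_positions(panel):
    """Return alignment columns that are identical (and not gaps) across the panel."""
    length = len(panel[0])
    if any(len(seq) != length for seq in panel):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i for i in range(length)
        if panel[0][i] != "-" and all(seq[i] == panel[0][i] for seq in panel)
    ]


def functional_score(reference, ortholog, panel):
    """Percentage of conserved positions at which the ortholog differs from the reference.

    All sequences are assumed to come from one multiple sequence alignment,
    so they share the same column coordinates.
    """
    if not (len(reference) == len(ortholog) == len(panel[0])):
        raise ValueError("reference, ortholog and panel must share alignment length")
    positions = conserved_positions(panel)
    if not positions:
        raise ValueError("no conserved positions found in the panel")
    changed = sum(1 for i in positions if ortholog[i] != reference[i])
    return 100.0 * changed / len(positions)


if __name__ == "__main__":
    # Toy six-residue alignment; column 2 varies in the panel, so it is not conserved.
    panel = ["MKTAYL", "MKSAYL", "MKTAYL"]
    print(functional_score("MKTAYL", "MRTAFL", panel))
```

In the toy example five of the six columns are conserved in the panel, and the hypothetical ortholog differs from the reference at two of them. A fuller version would substitute a substitution-class table for the strict identity test.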
2.3.1 Systematic complementation study

As shown in Figure 2.3.2, we found that upon a hyperosmotic shock, the Hog1 nuclear enrichment dynamics of the Sln1- and Ypd1-ortholog hybrid pathways (from both yeast species) were indistinguishable from that of the wild-type response despite their high functional scores. By contrast, the majority of Ssk2- and Pbs2-ortholog hybrid pathways displayed grossly defective signaling. While the hybrid pathways comprising the C. glabrata and C. albicans phosphotransfer module proteins exhibited almost identical initial Hog1 phosphorylation rate, peak Hog1 nuclear enrichment, and adaptation time compared to wild-type, their protein kinase counterparts displayed significant signaling changes such as decreased initial Hog1 phosphorylation rate and peak Hog1 nuclear enrichment. Importantly, the ability of the hybrid pathways to approach wild-type signaling did not correlate in any simple way to sequence conservation. For example, C. albicans Ssk2 and Sln1 have similar functional scores, indicating that each protein has a similar fraction of highly conserved amino acid residues changed, but clearly C. albicans Sln1 can complement its S. cerevisiae counterpart, while Ssk2 cannot.

[Figure 2.3.2 image not recoverable from extraction. Panel titles: SLN1, WT; YPD1, WT; SSK2, WT; PBS2, WT. x-axis: Time [minutes].]

Figure 2.3.2 Hog1 nuclear enrichment dynamics in response to a 0.4 M NaCl hyperosmotic shock measured in the wild-type strain (in gray) and mutant strains with each of the pathway proteins (denoted by the different colors), except Hog1, replaced with its orthologs from C. glabrata and C. albicans. Shown in the upper right corner of each plot is the ClustalW score of the ortholog when aligned to the S. cerevisiae sequence.
Right below the ClustalW score is the "functional score", which, for each ortholog, represents the percentage of amino acid changes at highly conserved residues identified via comparative genomics. The traces show the average response, obtained by taking the average of population averages from independent experiments (n = 3) ± SEM. The systematic complementation study suggests that the HOG pathway can tolerate large-scale protein sequence variations in phosphotransfer proteins and maintain endogenous pathway dynamics, while changes in MAPK protein sequences tune the pathway dynamics to the point of eliminating activity altogether in the case of C. albicans Ssk2.

2.3.2 Focusing on network architecture: Pbs2 versus Ypd1

To further substantiate this finding, and in order to examine the role of network context in producing the observed sensitivities, we focused on the proteins Ypd1 and Pbs2, which belong to the phosphorelay and MAPK modules respectively. Ypd1 is a phosphotransfer protein sandwiched between phosphotransfer proteins, while Pbs2 is a kinase protein sandwiched between kinase proteins; the significance of this difference will be highlighted in section 2.4.2. We generated strains with PBS2 and YPD1 ortholog substitutions from three evolutionarily more distant yeast species including Neurospora crassa, Debaryomyces hansenii, and Kluyveromyces lactis. Despite the higher sequence divergence and functional scores of these Ypd1 proteins, the data in Figure 2.3.3 show that all of them still fully mimicked wild-type Hog1 signaling. In contrast, signaling performance decreased with increasing sequence divergence and functional scores of the Pbs2 protein.
[Figure 2.3.3 image not recoverable from extraction. Legend: PBS2 orthologs, YPD1 orthologs. x-axis: % of amino acid changes at well-conserved residues.]

Figure 2.3.3 Maximum Hog1 nuclear enrichment of mutant strains with orthologous YPD1 and PBS2 genes of varying degrees of "functional scores" under a 0.4 M NaCl hyperosmotic shock normalized against the wild-type response. Data point at 0 percentage change represents the wild-type response. Data depicts mean (n = 3) ± SEM. A broad range of orthologous versions of Ypd1 are able to fully complement the endogenous version of Ypd1, while pathway activity decays as the substituted ortholog of Pbs2 becomes increasingly dissimilar from the native Pbs2.

The absence of observed changes in Hog1 dynamics for the Sln1- and Ypd1-ortholog hybrid pathways, however, did not preclude the possibility that they possessed kinetic constants similar to those of the S. cerevisiae Ypd1 protein. Therefore, we took advantage of a previous in vitro study that had characterized Ypd1 mutants with drastic changes in either the phosphotransfer rate $k_{\text{Sln1P-Ypd1}}$ or the binding constant $K_d^{\text{Sln1P-Ypd1}}$ between phosphorylated Sln1 and Ypd1 (Janiak-Spens et al., 2005). We transformed these Ypd1 alleles into our wild-type strain. We then measured their signaling abilities under the same hyperosmotic shock. None of the strains with kinetically defective alleles displayed significant changes in Hog1 signaling dynamics compared to wild-type, even in the case where $k_{\text{Sln1P-Ypd1}}$ was reduced by 17-fold; this is evident from the flat curve in Figure 2.3.4. Using these kinetically defective Ypd1 alleles, we were able to directly rule out the possibility that, in the case of Ypd1, the ortholog complementation tests were leading us to a false conclusion that Ypd1 rates seem not to affect cascade dynamics.
Taken together, these complementation experiments led us to hypothesize that HOG signaling is likely to be more robust to variations in parameters affecting the upstream phosphotransfer relay than to variations in those affecting the downstream MAPK cascade.

[Figure 2.3.4 plot. Legend: mutants. x-axis: $k_{\mathrm{Sln1P-Ypd1}}$ (s⁻¹) or $K_d^{\mathrm{Sln1P-Ypd1}}$ (µM) fold change (relative to WT-YPD1).]

Figure 2.3.4 Maximum Hog1 nuclear enrichment of mutant strains with characterized YPD1 alleles under a 0.4 M NaCl hyperosmotic shock, normalized against the wild-type response. Two of the alleles exhibit a three- and a seventeen-fold reduction in the Sln1-to-Ypd1 phosphotransfer rate $k_{\mathrm{Sln1P-Ypd1}}$, while another has a three-fold increase in the binding constant $K_d^{\mathrm{Sln1P-Ypd1}}$ compared to wild-type Ypd1 [Janiak-Spens 2004]. The data point at 0 fold change represents the wild-type response. Data depict mean (n = 3) ± SEM. Kinetic mutants of Ypd1, much like orthologous versions of Ypd1, were readily able to complement the wild-type Ypd1 in terms of pathway dynamics, demonstrating that the HOG pathway is robust to variations in the kinetic rates associated with the internal details of the phosphotransfer module.

2.4 Computational analysis of HOG pathway

To computationally investigate the effects of coding sequence variations in the HOG pathway genes on signaling dynamics, we performed sensitivity analyses on key dynamical properties of the signaling module, i.e. the peak Hog1 phosphorylation level $M_{\mathrm{Hog1}}$ and the initial Hog1 phosphorylation rate $r_{\mathrm{Hog1}}$, using a simplified biochemical network model [Klipp 2005]. Similar to an approach used to study the segment polarity network in fruit flies [Dassow 2000], we modeled changes in coding sequences as changes in the kinetic rate constants parametrizing the dynamical model.
The effects of coding sequence variations in each pathway gene, except HOG1, were simulated by simultaneously varying all three rate constants associated with the protein over two orders of magnitude about wild-type levels and computing the corresponding signaling outputs, i.e. $M_{\mathrm{Hog1}}$ and $r_{\mathrm{Hog1}}$. Figure 2.4.1 illustrates how $M_{\mathrm{Hog1}}$ changes as two of the three rate constants associated with either Pbs2 (Figure 2.4.1, left panel) or Ypd1 (Figure 2.4.1, right panel) are varied about wild-type parameters. We observed a strikingly flat surface for Ypd1-associated parameter changes, indicating that $M_{\mathrm{Hog1}}$ remains almost unchanged over a wide range of parameter space. In contrast, Pbs2-associated parameter changes significantly altered the $M_{\mathrm{Hog1}}$ landscape. To systematically compare the effects of parameter variations across individual pathway proteins on signaling, we computed the local logarithmic gradient of the landscape evaluated at wild-type levels and defined this metric as our sensitivity measure.

[Figure 2.4.1 surface plots. Panel titles: Phosphorelay intermediate protein Ypd1; MAPKK Pbs2. Horizontal axes: log10 of the rate constants associated with each protein.]

Figure 2.4.1 Surface representations of peak Hog1 phosphorylation as a function of the rates associated with a given protein. While Ypd1 exhibits a very flat surface, indicating that the peak Hog1 phosphorylation level is relatively constant over a wide range of Ypd1-associated rates, the Pbs2 surface exhibits greater curvature. In subsequent analyses, we summarize these surfaces by plotting the magnitude of their gradients evaluated at given base points, such as the estimated wild-type parameter set.

2.4.1 Sensitivity analysis of a model of the HOG pathway

We implemented the following steps: i) model changes in sequence as changes in kinetic rate constants, and ii) define a sensitivity metric that captures how HOG signaling changes as kinetic rate constants are varied, using the model shown below in Equation 2.8.
We examined several methods to execute step (ii). The first analysis involved computing the magnitude of the local logarithmic gradient about the wild-type parameter set from the model outputs, namely the initial Hog1 phosphorylation rate and the steady state Hog1 phosphorylation level. To directly compare the different model outputs, we used logarithmic gradients to render our analysis dimensionless:

\[
\sqrt{\sum_{i}\left(\left.\frac{\partial \ln \phi}{\partial \ln k_i}\right|_{\mathrm{wt}}\right)^{2}} \tag{2.1}
\]

where $\phi$ is the model output whose sensitivity we are computing, and the $k_i$ represent the rate constants that are being varied. The wild-type parameters are obtained from Klipp et al., although similar results are obtained in a model with wild-type rate constants set equal to one another. The results of this analysis for $\phi$ are summarized in Figures 2.4.2 and 2.4.3.

[Figure 2.4.2 bar chart. Title: Sensitivity of peak Hog1 phosphorylation level. Bars: Sln1, Ypd1, Ssk1, Ssk2, Pbs2.]

Figure 2.4.2 Local logarithmic gradients calculated for peak Hog1 phosphorylation surfaces for sensitivity analysis of each intermediate pathway protein. As expected from the complementation tests, the MAPK proteins Ssk2 and Pbs2 show the highest curvature, implying the greatest sensitivities.

[Figure 2.4.3 bar chart. Title: Sensitivity of initial Hog1 phosphorylation rate. Bars: Sln1, Ypd1, Ssk1, Ssk2, Pbs2.]

Figure 2.4.3 Local logarithmic gradients calculated for initial Hog1 phosphorylation rate surfaces for sensitivity analysis of each intermediate pathway protein. Just as for Figure 2.3.2, the complementation studies qualitatively match the trend of the results of the computational analysis here, with the MAPK pathway proteins Pbs2 and Ssk2 displaying much higher curvatures, i.e. higher sensitivities, than the phosphorelay proteins Sln1, Ypd1, and Ssk1.

To overcome the uncertainty in the wild-type parameters used in Equation 2.1, we formulated a second metric that is less dependent on the choice of the particular wild-type parameters.
This method uses the full distribution of $\phi$, instead of only the region around the wild-type level, and measures the relative spread of this distribution to determine the effects of variations in rate constants on $\phi$. Using the same model outputs, we computed the following modified deviation metric:

\[
\sqrt{\frac{1}{V}\int_{V}\left(\phi(\mathbf{k}) - \phi_{\mathrm{wildtype}}\right)^{2}\,d\mathbf{k}} \tag{2.2}
\]

where $V$ is the phase space volume over which the parameters are swept. Similar to the local logarithmic gradient, large values of the modified standard deviation indicate greater sensitivity to parameter variations, while smaller values indicate greater robustness to parameter variations. The results of this analysis are shown in Figure 2.4.4.

In summary, both analyses highlighted above yielded the same qualitative answer, i.e. HOG signaling is most affected by changes in the rate constants of the downstream MAPK proteins and least by those of the upstream phosphorelay proteins. Figures 2.4.2 and 2.4.3 summarize the sensitivities of $M_{\mathrm{Hog1}}$ and $r_{\mathrm{Hog1}}$ respectively for all pathway genes upstream of HOG1. Consistent with the complementation results, both $M_{\mathrm{Hog1}}$ and $r_{\mathrm{Hog1}}$ are most sensitive to kinetic rate constant changes involving the MAPK cascade genes, and are least affected by variations in the phosphorelay components. This theoretical prediction is qualitatively reproduced in multiple analyses that use different sensitivity measures (Figure 2.4.4).

[Figure 2.4.4 bar charts. Panel titles: Sensitivity of steady state Hog1 phosphorylation level; Sensitivity of initial Hog1 phosphorylation rate. Bars: Sln1, Ypd1, Ssk1, Ssk2, Pbs2.]

Figure 2.4.4 Sensitivity metrics calculated from surface plots using the modified standard deviation metric shown in Equation 2.2.
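The two sensitivity measures described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in this work: the function names, the finite-difference step, the grid resolution, the choice to sweep each rate constant uniformly in log space, and the two toy outputs (`toy_cycle`, `toy_flat`) are all introduced here for illustration and stand in for the full pathway model.

```python
import itertools
import math


def log_gradient_sensitivity(model, k_wt, step=1e-4):
    """Magnitude of d ln(output) / d ln(k_i), evaluated at k_wt (local measure).

    Uses central finite differences in ln(k); `step` is an assumed step size.
    """
    total = 0.0
    for i in range(len(k_wt)):
        k_up = list(k_wt)
        k_down = list(k_wt)
        k_up[i] = k_wt[i] * math.exp(step)
        k_down[i] = k_wt[i] * math.exp(-step)
        slope = (math.log(model(k_up)) - math.log(model(k_down))) / (2.0 * step)
        total += slope ** 2
    return math.sqrt(total)


def deviation_sensitivity(model, k_wt, decades=1.0, points=11):
    """Root-mean-square deviation of the output from its wild-type value
    over a parameter sweep (global measure).

    Assumption: each rate constant is swept on a uniform log10 grid spanning
    +/- `decades` around wild type, and the grid average stands in for the
    volume-normalised integral.
    """
    exponents = [-decades + 2.0 * decades * j / (points - 1) for j in range(points)]
    phi_wt = model(k_wt)
    squared = 0.0
    count = 0
    for combo in itertools.product(exponents, repeat=len(k_wt)):
        k = [k0 * 10.0 ** e for k0, e in zip(k_wt, combo)]
        squared += (model(k) - phi_wt) ** 2
        count += 1
    return math.sqrt(squared / count)


def toy_cycle(k):
    """Hypothetical tunable output: phosphorylated fraction k_on / (k_on + k_off)."""
    k_on, k_off = k
    return k_on / (k_on + k_off)


def toy_flat(k):
    """Hypothetical robust output: independent of the swept rate constants."""
    return 0.5


wild_type = [1.0, 1.0]
tunable_local = log_gradient_sensitivity(toy_cycle, wild_type)
robust_local = log_gradient_sensitivity(toy_flat, wild_type)
tunable_global = deviation_sensitivity(toy_cycle, wild_type)
robust_global = deviation_sensitivity(toy_flat, wild_type)
```

In this toy setting the robust output scores zero on both measures, while the tunable output scores sqrt(1/2) on the local measure and a strictly positive value on the global one. The default of one decade on either side of wild type mirrors the two-orders-of-magnitude sweep described in the text.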
2.4.2 Relating mutational robustness to local biochemistry

One possible mechanism that could explain the pattern of mutational robustness we observe experimentally is that the biochemical circuitry of the phosphorelay network renders the terminal phosphorelay protein insensitive to changes in kinetic parameters of its upstream pathway components. To mathematically determine the contribution of this effect, consider a chain of signaling proteins where the steady state phosphorylation level of any cascade protein consists of a basal phosphorylation level independent of pathway activity, and an additional component that is inducible by the steady state phosphorylation level of its immediate upstream activator:

\[
x_2' = x_2 + \frac{\partial x_2}{\partial x_1}\,(x_1' - x_1) \tag{2.3}
\]

Here, $x_1$ and $x_2$ are the basal phosphorylation levels of the 1st and 2nd proteins in the cascade, and primed symbols represent the total protein phosphorylation levels, while the partial derivative denotes the amount of phosphorylated 2nd protein derived from every phosphorylated 1st protein. Extending these equations for the 3rd protein in the cascade yields:

\[
x_3' = x_3 + \frac{\partial x_3}{\partial x_2}\,(x_2' - x_2) \tag{2.4}
\]

Substituting [2.3] into [2.4] we obtain:

\[
x_3' = x_3 + \frac{\partial x_3}{\partial x_2}\frac{\partial x_2}{\partial x_1}\,(x_1' - x_1) \tag{2.5}
\]

Extending the analysis for the jth protein in the cascade, we obtain:

\[
x_j' = x_j + \left(\prod_{i=2}^{j}\frac{\partial x_i}{\partial x_{i-1}}\right)(x_1' - x_1) \tag{2.6}
\]

From [2.6] it is clear that the biochemical details of signal transmission are buried mathematically in the chain of derivatives, i.e. they represent how the activity of the cascade protein furthest upstream is transduced into changing the activity of the jth cascade protein. For example, the contribution of the kth protein to the chain of derivatives arises from two factors, i.e. the effect of the (k-1)th protein on the activation of the kth protein and the effect of the kth protein on the activation of the (k+1)th protein:

\[
\frac{\partial x_{k+1}}{\partial x_k}\frac{\partial x_k}{\partial x_{k-1}} = \frac{\partial x_{k+1}}{\partial x_{k-1}} = \xi_k \tag{2.7}
\]

where we term $\xi_k$ the steady state throughput of the kth protein.
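The chain-rule structure of Equations 2.6 and 2.7 can be checked numerically on a toy cascade. The sketch below is illustrative only: it assumes each stage obeys a steady-state balance of the form k_act * x_up * (total - x) = k_deact * x, and the function names and parameter values are hypothetical rather than taken from this work.

```python
def stage(x_up, k_act, k_deact, total=1.0):
    """Steady-state phosphorylated level of a protein activated by x_up."""
    return total * k_act * x_up / (k_act * x_up + k_deact)


def stage_gain(x_up, k_act, k_deact, total=1.0):
    """Analytic derivative of stage() with respect to x_up."""
    return total * k_act * k_deact / (k_act * x_up + k_deact) ** 2


def cascade(x1, params):
    """Phosphorylation levels [x1, x2, ..., xj] along the chain."""
    levels = [x1]
    for k_act, k_deact in params:
        levels.append(stage(levels[-1], k_act, k_deact))
    return levels


def chain_gain(x1, params):
    """Product of the local derivatives dx_i/dx_(i-1) along the chain."""
    levels = cascade(x1, params)
    gain = 1.0
    # zip pairs each stage with the level of its immediate upstream activator
    for x_up, (k_act, k_deact) in zip(levels, params):
        gain *= stage_gain(x_up, k_act, k_deact)
    return gain


# Hypothetical (k_act, k_deact) pairs for a three-stage chain
params = [(1.0, 1.0), (2.0, 0.5), (0.5, 1.5)]
x1 = 0.3
h = 1e-6

# End-to-end response of the last protein to the first, by central difference
numeric_gain = (cascade(x1 + h, params)[-1] - cascade(x1 - h, params)[-1]) / (2.0 * h)
analytic_gain = chain_gain(x1, params)
```

The two values agree to within finite-difference error, which is the statement that the end-to-end response factors into the product of local derivatives. Changing the rate constants of any single stage changes `analytic_gain` only through the factors that involve that stage.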
The central claim of the throughput analysis is that the sensitivity of the throughput $\xi_k$ to changes in parameters describing the kth protein can predict to what extent sequence changes in the kth protein will be tolerated by the system. An important corollary to this claim is that if $\xi_k$ is invariant under parameter variations, then sequence changes in the kth protein will not affect signaling unless the sequence changes inactivate the protein altogether. To put this analysis into effect, we used a simplified model of the HOG pathway [Klipp 2005]:

\[
\begin{aligned}
\frac{d[\mathrm{Sln1P}]}{dt} &= k_1\left(\frac{\Pi(t)}{\Pi_0(t)}\right)^{2}[\mathrm{Sln1}] + k_{-2}[\mathrm{Ypd1P}][\mathrm{Sln1}] - k_2[\mathrm{Ypd1}][\mathrm{Sln1P}] \\
\frac{d[\mathrm{Ypd1P}]}{dt} &= k_2[\mathrm{Ypd1}][\mathrm{Sln1P}] - k_{-2}[\mathrm{Ypd1P}][\mathrm{Sln1}] - k_3[\mathrm{Ypd1P}][\mathrm{Ssk1}] \\
\frac{d[\mathrm{Ssk1P}]}{dt} &= k_3[\mathrm{Ypd1P}][\mathrm{Ssk1}] - k_{-3}[\mathrm{Ssk1P}] \\
\frac{d[\mathrm{Ssk2P}]}{dt} &= k_4[\mathrm{Ssk2}][\mathrm{Ssk1}] - k_{-4}[\mathrm{Ssk2P}] \\
\frac{d[\mathrm{Pbs2P}]}{dt} &= k_5[\mathrm{Pbs2}][\mathrm{Ssk2P}] - k_{-5}[\mathrm{Pbs2P}] \\
\frac{d[\mathrm{Hog1P}]}{dt} &= k_6[\mathrm{Hog1}][\mathrm{Pbs2P}] - k_{-6}[\mathrm{Hog1P}]
\end{aligned}
\tag{2.8}
\]

Since the signaling dynamics are fast relative to the osmotic pressure variable, separation of timescales allows one to treat the signaling system as if it were in steady state at every moment in time (the signaling pathway adiabatically follows the osmotic pressure dynamics, readjusting itself to the osmotic pressure variable at every point in time). To determine the effect of local biochemistry on $\xi_k$, we examined the two most architecturally distinct proteins, i.e. Pbs2 and Ypd1: Pbs2 is a kinase sandwiched between similar kinase proteins, while Ypd1 is a phosphotransfer protein sandwiched between similar phosphotransfer proteins. The throughput of Pbs2 is:

\[
\xi_{\mathrm{Pbs2}} = \frac{\partial[\mathrm{Hog1P}]}{\partial[\mathrm{Ssk2P}]} = \frac{k_5 k_{-5} k_6 k_{-6}\,\mathrm{Pbs2}_T\,\mathrm{Hog1}_T}{\left[k_5 k_6 [\mathrm{Ssk2P}]\,\mathrm{Pbs2}_T + k_{-6}\left(k_5[\mathrm{Ssk2P}] + k_{-5}\right)\right]^{2}} \tag{2.9}
\]

From this expression, we observe that $\xi_{\mathrm{Pbs2}}$ depends on Pbs2 interaction parameters, i.e. the phosphorylation rates of Pbs2 and Hog1, etc. Changes in Pbs2 sequence can alter these rates, affect the steady state throughput, and thereby impact Hog1 phosphorylation levels. On the other hand, the throughput of Ypd1 is:

\[
\xi_{\mathrm{Ypd1}} = \frac{\partial[\mathrm{Ssk1}]}{\partial[\mathrm{Sln1}]} \propto k_1\left(\frac{\Pi(t)}{\Pi_0(t)}\right)^{2} \tag{2.10}
\]
Remarkably, the throughput of Ypd1, $\xi_{\mathrm{Ypd1}}$, is independent of Ypd1 parameters. This implies that, as a direct consequence of the local architecture of the network of biochemical reactions, Hog1 phosphorylation is shielded from potential changes in Ypd1 rate constants.

2.5 Experimental evolution design

To test our theoretical result, we devised an experimental evolution strategy in which we harnessed naturally occurring genetic variation and imposed a strong selective pressure on HOG signaling. We then compared the adaptive genetic variants found in the evolution experiments with our theoretical predictions.

[Figure 2.5.1 schematic. Labels: doxycycline, rtTA, PTetO7, YFP, NRD1, HOG1.]

Figure 2.5.1 Artificially hyperactivating the HOG pathway by underexpressing YPD1. a) In media containing no doxycycline, Ypd1 is greatly underexpressed, leading to constitutive activation of the pathway and thus a greatly reduced growth rate. b) By placing the endogenous YPD1 gene under the transcriptional control of the Tet promoter, we could tune its expression level.

To design our selection experiment, we took advantage of the knowledge that deletion of the YPD1 gene in the HOG pathway leads to hyperactivation of the pathway and subsequent cell lethality [Posas 1996]; the logic of the experimental intervention is shown in Figure 2.5.1A. Thus, as shown in Figure 2.5.1B, we placed YPD1 under the control of a TetO7 promoter, so that we could induce its expression with doxycycline and control the degree of activation of the pathway. In addition, a cyan fluorescent protein was placed under a second TetO7 promoter to serve as an indirect readout for YPD1 expression. From growth rate measurements at different doxycycline concentrations, we found that the cells suffered a severe growth defect at low doxycycline concentrations, where YPD1 expression was repressed (Figure 2.5.2).
To determine if the HOG pathway was activated under YPD1 underexpression, we imaged the cells under the microscope to measure their Hog1-YFP and Nrd1-RFP signals. We found that Hog1 was predominantly localized in the nucleus, confirming that the pathway was indeed hyperactivated under YPD1 underexpression (Figure 2.5.3). In contrast, Hog1 was uniformly distributed throughout the cytoplasm in cells with high YPD1 expression and in wild-type cells.

[Figure 2.5.2 plot. x-axis: log10(doxycycline) [µg/mL].]

Figure 2.5.2 Growth rate of ancestor strain as a function of doxycycline present in the media. The experiment shows that as long as the doxycycline concentration remains below 0.01 µg/mL, the strain exhibits a severe growth rate deficit of roughly 70%.

Because Hog1 activation induces the expression of GPD1 and GPP2, which encode proteins responsible for glycerol synthesis [Albertyn 1994], we assessed the transcriptional readout of the signaling activity by measuring intracellular glycerol. We found that cells underexpressing YPD1 had at least two-fold higher intracellular glycerol concentration than cells with high YPD1 expression (Figure 2.5.4), which was consistent with our observation that the pathway was hyperactivated under YPD1 underexpression.

[Figure 2.5.3 micrographs. Panels: Ancestor with doxycycline (Nrd1-RFP, Hog1-YFP); Ancestor without doxycycline (Nrd1-RFP, Hog1-YFP).]

Figure 2.5.3 Hog1 nuclear enrichment in ancestral strain with and without doxycycline. Fluorescence microscopy confirms that, consistent with the idea that underexpression of Ypd1 leads to constitutive pathway activation, in our strains grown in the absence of doxycycline there is a notable basal enrichment of nuclear Hog1.

[Figure 2.5.4 bar chart. Bars: Ancestor with doxycycline; Ancestor without doxycycline. x-axis: intracellular glycerol per cell (OD540/OD600).]

Figure 2.5.4 Glycerol production in ancestor strain with and without doxycycline.
In the "ancestor" strain (which contains the TetO7-Ypd1 construct), intracellular glycerol levels show a nearly 3-fold increase when doxycycline is withheld from the media. This is consistent with the idea that withholding doxycycline hyperactivates the HOG pathway, which is known to upregulate glycerol production in the cell upon activation.

Finally, as shown in Figure 2.5.5, by measuring Hog1 nuclear enrichment at different doxycycline levels, we further established that the growth rate was inversely correlated with Hog1 nuclear accumulation.

[Figure 2.5.5 plot. x-axis: log10[doxycycline] [µg/mL].]

Figure 2.5.5 Hog1 nuclear enrichment as a function of doxycycline in the "ancestor" strain.

2.6 Rapid adaptive evolution of yeast cells underexpressing YPD1

We evolved nine independent lines of the yeast strain underexpressing YPD1, each with a population size on the order of 10^7 cells, and monitored their mean population growth rates using turbidostats [Acar 2008]. Rapid adaptation occurred after merely five days, as shown in Figure 2.6.1, and qualitatively similar adaptation dynamics were observed in the nine experiments. The dynamics revealed three distinct regimes. During the first 14 hours (phase I), we observed a transient decrease in the growth rate as a consequence of the dilution of the Ypd1 proteins due to cell division and degradation. Phase II exhibited the lowest growth rate (~0.20 ± 0.05 hr⁻¹), and this quasi-steady state lasted for about 26 hours. The growth rate rapidly recovered within the next 36 hours (phase III) before eventually reaching a steady-state level similar to that of the ancestor under the unstressed condition. At the end of the evolution experiments, a small aliquot of the turbidostat culture was plated on selective plates, and five randomly selected single colonies were isolated from each of the nine adapted populations for further analyses.
Importantly, the evolved populations maintained their growth advantage when placed under the same selective pressure in media without doxycycline (after transfer from media without doxycycline to media with doxycycline and subsequently back to media without doxycycline), indicating that their phenotypic changes were stable (Figure 2.6.2).

[Figure 2.6.1 plot. x-axis: Time after doxycycline removal [days].]

Figure 2.6.1 Time course of evolutionary dynamics.

2.6.1 Restoration of basal wild-type Hog1 activity

To determine if the hyperactivation of the HOG pathway had been resolved, we measured the evolved strains' Hog1 nuclear enrichment and intracellular glycerol in two randomly selected colonies out of five from each of the nine adapted populations. In 17 out of 18 evolved strains, both Hog1 nuclear enrichment and intracellular glycerol content had been restored to levels comparable with the ancestor in the unstressed condition (Figures 2.6.3 and 2.6.4). Thus, we established that the hyperactivation of the HOG pathway had been alleviated in almost all evolved strains.

[Figure 2.6.2 bar chart. Bars: Evolved strains without doxycycline; Ancestor with doxycycline; Ancestor without doxycycline. x-axis: Growth rate (hr⁻¹).]

Figure 2.6.2 Restoration of growth rate levels to ancestral state. As can be seen by comparing the pink bars to the green bar and yellow bar, the evolved cells match the growth rate of the ancestral strain in the presence of doxycycline, well above the rate in the absence of doxycycline.

[Figure 2.6.3 bar chart. Bars: Evolved strains without doxycycline; Ancestor with doxycycline; Ancestor without doxycycline. x-axis: intracellular glycerol per cell (OD540/OD600).]

Figure 2.6.3 Restoration of glycerol levels to ancestral state. Consistent with the hypothesis that growth rate restoration followed downregulation of HOG pathway activity, we see a drop in the amount of intracellular glycerol.
To further assess if the evolved strains were capable of mediating hyperosmotic shock recovery, we simultaneously measured the dynamics of both their Hog1 nuclear enrichment and their cellular volume, a proxy for turgor pressure, in response to hyperosmotic shock after adding 0.6 M NaCl. Interestingly, for a majority of the evolved strains, the amplitude and dynamic changes in Hog1 nuclear accumulation, shown in Figure 2.6.4, and cellular volume, shown in Figure 2.6.5, were similar to those of a wild-type response.

[Figure 2.6.4 plot. x-axis: Time [minutes].]

Figure 2.6.4 Restoration of pathway dynamics to near ancestral behavior. With the exception of a single trace from an evolved strain, all evolved strains (data shown in light pink) exhibit step response trajectories that are very similar to the ancestral trajectory in the presence of doxycycline.

In contrast, 4 of the evolved strains displayed drastically different dynamical Hog1 signaling behaviors. The amplitude and activation rate of Hog1 in these strains were significantly reduced compared to wild-type (Figure 2.6.4). Despite these gross differences in signaling dynamics, however, we still observed restoration of turgor pressure, and the rate of volume recovery was not significantly different from wild-type (Figure 2.6.5). These data together indicated that both signal transduction and Hog1-mediated transcriptional regulation of glycerol-producing factors responsible for turgor pressure recovery were at least partially functional in most of the evolved strains.

[Figure 2.6.5 plot. x-axis: Time [minutes].]

Figure 2.6.5 Restoration of volume recovery dynamics for the majority of the evolved strains, further bolstering the data from Figure 2.6.4.
2.6.2 Transcriptional regulation of YPD1 is not upregulated in the evolved strains

Mutations in the synthetic transcriptional activator rtTA have been found to be responsible for the rapid adaptation of a synthetic gene circuit (Pando and van Oudenaarden, unpublished results). To determine if YPD1 expression had increased in the evolved strains, we measured CFP levels, which served as a proxy for Ypd1 protein levels. All eighteen evolved strains showed low CFP intensities similar to that of the ancestor under no doxycycline conditions (Figure 2.6.6), indicating that the adaptive molecular changes most likely did not occur in the rtTA gene.

[Figure 2.6.6 plot. x-axis: Mean CFP intensity (~Ypd1) (a.u.).]

Figure 2.6.6 CFP data suggest that the evolved strains did not alter the properties of rtTA in order to effect their recovery from pathway hyperactivation.

2.6.3 PBS2 and SSK2 are preferentially mutated in independent evolution experiments

To identify the candidate molecular changes that led to the evolved phenotypes, we performed a genome-wide screen for point mutations in the evolved strains. Full genomic sequences of the ancestral strain and five of the evolved strains were obtained using the Illumina sequencing platform. Data on the quality of the sequencing results, such as the number of reads and the genome coverage, can be found in Table 2.6.1. A total of 517 single-nucleotide polymorphisms (SNPs) were found in the ancestral strain compared with the published Saccharomyces cerevisiae genome. 508 of these SNPs (~98%) appeared in all 5 evolved strains and were thus likely to be present in the ancestral strain (data not shown). At most three statistically significant mutations were identified per evolved strain (p-value < 0.05, χ² test), which are highlighted in Table 2.6.2. We found that 5 of the confirmed mutations mapped to the coding sequence of the genes in the MAPK cascade, i.e. PBS2 and SSK2.
Strain | Total number of filtered reads (×10^6) (a) | % of filtered reads aligning to the genome (b) | Mean genome coverage (c) | % of genome coverage (< 5 reads)
Ancestor | 8.5 | 95.6 | 24 | 6
E1 | 9 | 91.2 | 27 | 4
E2 | 8.4 | 90.64 | 22 | 3
E3 | 8.2 | 85.2 | 20 | 8
E4 | 7.7 | 79.1 | 21 | 5
E5 | 8.9 | 86 | 27 | 6

(a) Reads that passed the Illumina ELAND pipeline filter. (b) A maximum of two mismatches per read was set in the alignment process using the MAQ software. (c) The average number of reads aligned to each nucleotide position in the genome.

Table 2.6.1 Summary of sequencing depth and coverage for both the ancestral strain and 5 evolved strains sent for Illumina sequencing.

Whole-genome sequenced evolved strain | Chromosome | Genome position; strand | Ancestral nucleotide | Evolved nucleotide | ORF | Impact (a)
E1 or E2 | 2 | 533725; 1 | T | C | - | -
E1 or E2 | 4 | 247140; -1 | C | A | YDL119C | A158E
E1 or E2 | 10 | 179973; -1 | T | G | PBS2 | Y43D
E3 | 10 | 178832; -1 | G | A | PBS2 | G423D
E3 | 3 | 49444; -1 | G | A | PDI1 | G260E
E3 | 12 | 162383; -1 | A | T | SSK1 | I504F
E4 | 14 | 681259; -1 | C | T | SSK2 | P1393L
E5 | 14 | 681040; -1 | C | T | SSK2 | P1466L

(a) Impact represents either synonymous or non-synonymous mutation. The notation A158E indicates that amino acid A at gene position 158 in the ancestral strain has been changed to E in the corresponding evolved strain.

Table 2.6.2 Single nucleotide polymorphisms detected from whole genome sequencing of evolved strains. In virtually all cases, the detected SNP was found to affect a HOG pathway gene.

To determine how prevalent mutations in HOG pathway genes were in the evolved strains, we sequenced all six genes in the pathway, including their promoter regions, for the remaining 40 strains. Unexpectedly, all of the 45 strains except 5 contained a single point mutation in one of the genes in the pathway (Figure 2.6.7).

[Figure 2.6.7 chart. Legend: No. of evolved strains with mutations in the Hog network; No. of evolved strains with no mutations in the Hog network.]

Figure 2.6.7 Mutations in evolved strains are predominantly in HOG pathway genes.
Remarkably, 40 out of 45 evolved strains contained mutations within the HOG pathway proteins themselves. Evolved strains isolated from the same population harbored different genetic changes, confirming that the populations at the end of the evolution experiments were not isogenic (Figure 2.6.8). We found that, among the pathway genes, PBS2 and SSK2 mutations were the most highly represented across a majority of the nine independently adapted populations, which was largely consistent with the computational robustness analysis. In addition, we observed two incidences in which identical PBS2 mutations were found in independent experiments. To account for the different mutational target sizes across pathway genes, we normalized, for every gene, the number of unique mutations against gene length. The trend was robustly reproduced (Figure 2.6.9), with the MAPK cascade genes PBS2 and SSK2 being the most frequently mutated.

[Figure 2.6.8 stacked bar chart. Legend: Unknown, SLN1, SSK1, YPD1, SSK2, HOG1, PBS2. x-axis: Evolution experiment.]

Figure 2.6.8 Distribution of genetic changes in evolved strains corresponding to (A) across the nine experiments. Indicated inside bars are the fractions of unique gene mutations observed in individual experiments. Notations * and ** represent two particular mutations which were found in independent experiments.

[Figure 2.6.9 bar chart. Bars: SLN1, YPD1, SSK1, SSK2, PBS2, HOG1.]

Figure 2.6.9 Characterizing the spectrum of mutations according to target identity. Remarkably, we find that the MAPK proteins Ssk2 and Pbs2 are the most readily targeted proteins in the pathway, precisely as expected from the systematic complementation and computational sensitivity analyses.
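The normalization for mutational target size amounts to dividing each gene's count of unique mutations by its length. A minimal sketch follows; the function name is hypothetical, and the gene names, counts, and lengths are placeholders chosen for illustration, not values from this study.

```python
def mutation_density(unique_mutations, gene_lengths):
    """Unique mutations per unit gene length for every gene in gene_lengths.

    Genes with no observed mutations are reported with a density of zero.
    """
    return {
        gene: unique_mutations.get(gene, 0) / length
        for gene, length in gene_lengths.items()
    }


# Placeholder values for illustration only; not the counts or lengths from this study.
example_counts = {"GENE_A": 6, "GENE_B": 1}
example_lengths = {"GENE_A": 2000, "GENE_B": 500, "GENE_C": 1200}

density = mutation_density(example_counts, example_lengths)
ranked = sorted(density, key=density.get, reverse=True)
```

Ranking genes by density rather than by raw count prevents a long gene from appearing to be preferentially targeted simply because it offers more sites at which a mutation can occur.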
We identified a total of 25 unique mutations, all except one of which were missense mutations, and more than half of them were in the protein kinase domains, which are highly conserved. The mutations are catalogued in Table 2.6.3. When examined more closely, the mutations typically occur in protein domains relevant to signaling. For example, 4 of the 7 mutations in the SSK2 gene occur in the protein kinase domain, while 2 of the remaining mutations take place near or in the domain that binds the upstream activator Ssk1. Taken together, the above findings revealed that PBS2 and SSK2 were preferentially and repeatedly mutated in independent evolution experiments, and suggested that the observed phenotypic changes most likely arose from genetic changes within the HOG pathway.

Gene | Mutation | Domain
SSK1 | I504F | Close to response regulator receiver domain
SSK2 | V402A | Ssk1 binding domain, essential for Ssk2 activation
SSK2 | W427C | Essential for Ssk2 activation
SSK2 | C1172Y | Unknown
SSK2 | P1393L | Kinase domain
SSK2 | P1466L | Kinase domain
SSK2 | G1471V | Kinase domain
SSK2 | W1557C | Kinase domain
PBS2 | Y43D | Docking site for Ssk2
PBS2 | R61L | Docking site for Ssk2
PBS2 | G423D | Kinase domain
PBS2 | G509S | Kinase domain
PBS2 | M526R | Kinase domain
PBS2 | R640 to STOP | NLS?

A = Evolved strain. B = Ancestral strain with endogenous gene replaced with mutant allele.

Table 2.6.3 Cataloguing the mutations in molecular detail, including the domains the mutations occur in.

2.6.4 Mutations in PBS2 and SSK2 are mainly responsible for the down-regulation of the hyperactive signaling and improved fitness

To test whether PBS2 and SSK2 mutations account for the adaptive phenotype, the endogenous gene in the ancestral strain was replaced with 13 of the unique mutant alleles ("transformed strains"), and their growth dynamics were compared to those of the ancestor and of either a PBS2Δ or an SSK2Δ strain. These 13 mutant alleles were randomly selected to broadly represent mutations across various protein domains.
Unlike the ancestral allele, almost all the mutations conferred a significant growth advantage when the cells were transferred from media with doxycycline to media without doxycycline (Table 2.6.3). The growth increase conferred by the single mutations matched the fitness advantage of most of the evolved strains, confirming that PBS2 and SSK2 mutations were primarily responsible for the improved fitness. For a majority of the transformed strains, the growth rates were similar to those of their respective gene deletion strains, i.e. PBS2Δ or SSK2Δ, under no doxycycline conditions, i.e. (0.38 ± 0.07) hr⁻¹ and (0.42 ± 0.05) hr⁻¹. Since HOG signaling was not completely abolished in the evolved strains, these data further supported the idea that the mutations cause a partial loss of function of PBS2 and SSK2, thereby mitigating signaling hyperactivation.

2.6 Discussion

Studies on the robustness of cellular phenotypes to gene expression changes have been greatly facilitated by experimental techniques that allow quantitative manipulation of gene expression [Batchelor 2003] and [Moriya 2006]. In contrast, there is no simple experimental strategy to comprehensively assess the robustness of network function to coding sequence variation of its component genes. For instance, exploring amino acid substitutions at only a few positions of a protein would involve generating thousands of variant proteins and measuring their effects on system outputs. Alternatively, a comparative method uses existing natural genetic variation and infers genetic robustness by comparing the structure and function of cellular networks across closely related species [Tanay 2005]. But this approach is limited by the inexact knowledge of the environments to which the organisms adapted. Here, we show how a quantitative understanding of genetic robustness can be achieved using a combination of theoretical and experimental approaches.
In particular, our work demonstrates the feasibility and promise of applying experimental evolution to the study of genetic robustness. By manipulating either the environmental conditions or the genotype of the organisms, specific hypotheses can be tested based on the evolutionary outcomes. Through a computational analysis, we find that signaling is most affected by kinetic parameter changes in the MAPK cascade genes, and yet it is highly robust to changes in the upstream phosphorelay components. To test this, we induce hyperactive signaling in yeast cells, and harness the combined forces of evolution and natural selection to sieve out adaptive genetic variants that can significantly affect signaling and restore fitness. The model predicts that mutations in the upstream phosphorelay genes have a minimal effect on network behavior, suggesting that genetic changes in these genes would be effectively neutral. The theoretical results are largely consistent with the evolutionary outcomes, where none of the evolved strains out of a total of forty from nine independent experiments had mutations in the phosphorelay genes SLN1 and YPD1, and instead almost all the mutations were in either PBS2 or SSK2.

As with many studies attempting to predict the outcome of evolutionary experiments, some important caveats apply in assessing these results, owing to the inherently contingent nature of changing allele frequencies via natural selection. Our results have shown that the HOG pathway is less sensitive to kinetic rate changes in the phosphorelay system than in the MAPK system, and that the MAPK system is preferentially targeted by natural selection to downregulate the activity of the pathway when a population of budding yeast cells is challenged to do so; it is nevertheless possible that these two observations are unrelated.
The MAPK system proteins, being positive regulators of Hog1 translocation to the nucleus, need to suffer loss-of-function mutations to downregulate pathway activity. Such mutations are intuitively thought to be easier to obtain than the gain-of-function mutations that the phosphorelay system proteins would need to similarly downregulate the pathway. It should be noted, however, that there is a loss-of-function path for the phosphorelay system to downregulate Hog1 translocation: Sln1 could fail to accept phosphate groups from Ypd1, which would increase the phosphorylated form of Ssk1 and lower pathway activity. This is a mutation we do not observe. Furthermore, although mutations in HOG1 can significantly affect signaling, these were rarely found. Because Hog1 genetically and physically interacts with a large number of genes and proteins, it is quite likely that the pleiotropic costs outweigh the benefits of the signaling changes, thereby rendering these "solutions" much less probable.

Notably, we observe that rapid restoration of signaling and fitness can be achieved solely via a single-nucleotide mutation. Interestingly, all the mutations were found in the protein-coding regions, which may be a consequence of the large selective pressure in this system. Because cis-regulatory regions tend to exhibit greater plasticity than coding sequences [Borneman 2007] and [Odom 2005], it appears more likely that a single point mutation would affect protein function than drastically alter gene expression.

Finally, our results show how a relatively simple eukaryotic signal transduction pathway could have evolved its biochemical circuitry to allow for genetic robustness and evolvability. One can rationalize the theoretical result by recognizing that the MAPK cascade and the phosphorelay chain operate via distinctly different modes of signaling.
The MAPK cascade utilizes a catalytic signaling mechanism: every phosphorylated molecule goes on to phosphorylate multiple downstream molecules, and thus the cascade output (the amount of phosphorylated Hog1) ultimately depends on all the MAPK cascade component parameters. By contrast, the phosphorelay chain operates via stoichiometric signaling, in which the output of the chain is determined entirely by the influx and efflux of phosphate groups [Shinar 2007] rather than by the rate constants of components within the phosphotransfer relay. Thus, changing those parameters should not affect signaling, consistent with what we have observed experimentally. Our results suggest that the nature of biochemical interactions within a network can significantly shape the space of targets that natural selection can act upon.

2.7 Methods

Strain background and construction

Our haploid ancestor strain (DMY028) was derived from the DMY017 strain [Muzzey 2009], the only difference being that it contained a plasmid bearing two TetO7 promoters, one of which drives the expression of CFP, while the other controls YPD1 expression. The mutant strains referred to in this study were similarly derived from the DMY017 strain, except that the endogenous genes in the Sln1 branch of the HOG pathway were singly knocked out and replaced with their corresponding orthologs from various yeast species. First, the endogenous genes were singly knocked out and replaced with the Candida albicans URA3 gene using the pAG60 plasmid (Euroscarf). SLN1 and YPD1 gene deletions are lethal due to the hyperactivation of the pathway. To circumvent this, we knocked out these genes using a cassette containing both the C. albicans URA3 gene and the Hog1 phosphatase PTP2 placed under the control of the ADH1 promoter. The orthologous genes from various yeast species were stitched to the 500-bp S. cerevisiae upstream and downstream gene flanking sequences using overlap extension PCR.
These final constructs were then transformed into the endogenous gene knockout strains described earlier, and single colonies were selected for the absence of URA3 expression on 5-FOA plates. All integrations were subsequently confirmed by sequencing.

Growth and media conditions

Unless otherwise stated, all experiments were performed on exponentially growing cell cultures in synthetic dropout media with the appropriate amino acid supplements at 30 °C. The ancestral and evolved strains were grown consistently in 0.4 M NaCl for all experiments, except when their signaling abilities were analyzed upon a hyperosmotic shock of 1 M NaCl. In addition, all experiments involving the evolved strains were performed in the absence of doxycycline. Prior to the evolution experiment, the ancestral strain was grown overnight with doxycycline, and the culture media was replaced with media without doxycycline before the cells were propagated in the turbidostat [Acar 2008]. In experiments where cells were treated with doxycycline, a concentration of 5 µg/ml was used.

Glycerol assays

Intracellular glycerol levels were measured using the Free Glycerol Reagent Kit (Sigma) as described [Muzzey 2009]. For details regarding the method and cell preparations, see the Supplemental Data.

Fluorescence microscopy and image analysis

Cell preparation and immobilization, and image acquisition and segmentation, were performed as described [Mettetal 2008]. For our signaling experiments involving mutant strains with the orthologous pathway proteins, we corrected for any possible effects from outside the HOG pathway by measuring signaling in the respective pathway gene knockout strains in response to the same hyperosmotic shock ("basal signal"), and we subtracted this basal signal from the mutant strain's mean Hog1 trace. In addition, the Hog1 nuclear enrichment reported here represents the measured signal minus the nuclear enrichment level prior to hyperosmotic shock.
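The two corrections described above amount to simple arithmetic on the traces. The sketch below is illustrative only: the function name `corrected_hog1_trace`, the list-based representation of a trace, and the `n_preshock` argument (the number of time points recorded before the shock) are assumptions made for this example and are not taken from the analysis code used in this work.

```python
def corrected_hog1_trace(mutant, knockout, n_preshock):
    """Apply the two corrections to a mean Hog1 nuclear-enrichment trace.

    mutant:     mean trace of the strain carrying the orthologous gene
    knockout:   mean trace of the matching gene-knockout strain (basal signal)
    n_preshock: number of leading time points recorded before the shock
                (hypothetical argument, assumed >= 1)
    """
    if len(mutant) != len(knockout):
        raise ValueError("traces must be sampled at the same time points")
    if n_preshock < 1:
        raise ValueError("need at least one pre-shock time point")
    # Step 1: remove signal that does not originate from the HOG pathway.
    basal_subtracted = [m - k for m, k in zip(mutant, knockout)]
    # Step 2: subtract the mean pre-shock level so that the trace reports
    # enrichment relative to the unstimulated state.
    baseline = sum(basal_subtracted[:n_preshock]) / n_preshock
    return [x - baseline for x in basal_subtracted]
```

Because both steps are subtractions, the order in which they are applied does not change the result.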
Whole-genome sequencing

Genomic DNA (gDNA) was extracted from 10 ml of stationary phase cultures using a standard protocol [Hoffman 1987] with a few modifications noted below. Three consecutive phenol/chloroform/isoamyl-alcohol extractions were performed to reduce protein contamination. RNA contamination was reduced by treating the gDNA samples with 70 ng/µl affinity-purified RNAse A (Ambion) for 1 hour at 37 °C. A final chloroform extraction was performed to remove phenol contamination prior to ethanol precipitation. The final gDNA yield was quantified using a ND-1000 spectrophotometer (NanoDrop). Genomic libraries for whole-genome sequencing were prepared as directed (Illumina). Image analysis, base calling and sequence alignment were performed according to the Illumina Genome Analyzer pipeline. The Illumina pipeline tool "eland" was used to uniquely align 36 bp reads to the S288c yeast reference genome [Cherry 1998], allowing a maximum of 2 mismatches per read. Subsequent data analysis was carried out using the MAQ software (http://maq.sourceforge.net/), with which the filtered reads were mapped to the reference genome and assembled to create the consensus genomic sequence and to detect SNPs. The same procedure was carried out for the synthetic constructs, i.e. P_MYO2-rtTA and P_TetO7-YPD1. The MAQ-generated SNPs were further filtered using custom-written scripts on criteria such as average chromosomal depth and repetitiveness. Finally, we determined the statistical significance of the SNPs by performing a χ² test on the distributions of nucleotide bases of the reads obtained at each SNP position for the ancestral and the evolved strain. A randomization test for goodness-of-fit was carried out in cases where there were fewer than five reads. We obtained between seven and nine million Illumina-pipeline filtered reads for each sequencing attempt, 80-95 % of which aligned to the S288c reference genome.
Within the non-repetitive genomic regions, an overall mean depth of around 20 reads per nucleotide position was obtained, with less than 10 % of nucleotide positions having fewer than five reads (Table 2.6.1).

Chapter 3
Robust yet tunable regulatory elements: the case of microRNA

3.1 Summary

MicroRNAs (miRNAs) are short, highly conserved non-coding RNA molecules that repress gene expression in a sequence-dependent manner. Each miRNA is predicted to target hundreds of genes [Lewis 2005, Selbach 2008, Baek 2008, Friedman 2009], and a majority of protein-coding genes are predicted to be miRNA targets [Friedman 2009, John 2004]. Bulk measurements on populations of cells have indicated that, although pervasive, repression due to miRNAs is on average quite modest (~2-fold) [Selbach 2008, Baek 2008, Bartel 2004]. Information on the magnitude of repression in single cells, however, has been lacking. Here we perform single-cell measurements using quantitative fluorescence microscopy and flow cytometry to monitor a target gene's protein expression in the presence and absence of regulation by miRNA. We find that while the average level of repression is modest and in agreement with previous population-based measurements, the repression among individual cells varies dramatically. In particular, we show that regulation by miRNAs establishes a threshold level of target mRNA below which protein production is highly repressed. Beyond this threshold, there is a regime in which expression responds ultrasensitively to target mRNA input until mRNA levels become high enough to almost escape repression by miRNA. We constructed a mathematical model describing repression of target gene expression by both non-catalytic and catalytic activity of miRNA.
The model predicted, and experiments confirmed, that the ultrasensitive regime could be shifted to higher target mRNA levels by transfecting additional miRNA or by increasing the number of miRNA binding sites in the 3' UTR of the target mRNA. The ultrasensitive transition is not observed when the miRNA targets a perfectly complementary site that can undergo catalytic cleavage. These results demonstrate that even a single species of miRNA can act both as a switch to effectively silence gene expression and as a fine-tuner of gene expression.

3.2 microRNA background

MicroRNAs regulate protein synthesis in the cell cytoplasm by promoting the degradation of target mRNAs or inhibiting their translation. Their importance is suggested by their abundance, with some miRNAs expressed at as many as 50,000 copies per cell [Lim 2003]; by their sequence conservation, with some miRNAs conserved from sea urchins to humans [Grimson 2008]; and by the number of their targets, which include the majority of protein-coding genes [John 2004]. miRNAs can regulate a large variety of cellular processes, from differentiation and proliferation to apoptosis [Chen 2004, Yi 2008, Sluijter 2010, Esau 2004, Le 2009, Cimmino 2005, Song 2009, Sood 2006, Li 2005, Makeyev 2007, Bernstein 2003]. Further, miRNAs also confer robustness to systems by stabilizing gene expression during stress and in developmental transitions [Li 2009, Li 2006].

3.3 Two-color assay to measure regulation via microRNA

Despite the evidence for the importance of gene regulation by miRNAs, the typical magnitude of observed repression by miRNAs is relatively small [Friedman 2009], with some notable exceptions such as the switch-like transitions due to the miRNAs lin-4 and let-7 targeting the heterochronic genes lin-14 and lin-41, respectively, in Caenorhabditis elegans [Bagga 2005].
Importantly, however, most of the previous studies of regulation by miRNAs in mammalian cells have measured population averages, which often obscure how individual cells respond to signals [Raj 2008]. To assay for miRNA activity in single mammalian cells, we cloned a two-color fluorescent reporter construct that permits simultaneous monitoring of protein levels in the presence and absence of regulation by miRNA, depicted in Figure 3.3.1. The construct consists of a bidirectional Tet-inducible promoter driving two genes expressing the fluorescent proteins mCherry and eYFP tagged with nuclear localization sequences. The 3' UTR of mCherry is engineered to contain N binding sites for miRNA regulation. In the initial experiments, the inserted sites are recognized by miR-20, which is expressed endogenously in HeLa cells along with its seed family members miR-17-5p and miR-106b. The 3' UTR of eYFP is left unchanged so that it can serve as a reporter of the transcriptional activity in a single cell.

[Figure 3.3.1 schematic: bidirectional pTRE-Tight promoter driving NLS-eYFP with an unmodified 3'-UTR and NLS-mCherry with a 3'-UTR containing N miR-20 binding site(s), (TACCTGCACTCGCGCACTTTA)N.]

Figure 3.3.1 A synthetic two-color reporter construct for measuring miRNA-mediated gene regulation in single cells. The construct consists of a bidirectional tetracycline-responsive promoter that drives the transcription of two fluorescent reporter proteins: eYFP and mCherry. Nuclear localization signals (NLS) are fused to the fluorescent reporters to facilitate image processing. We fuse N miR-20 binding sites to the 3'-UTR of mCherry to measure the effects of miRNA-mediated regulation.

3.3.1 Control experiments establishing eYFP as a transcriptional readout

In order to confirm that the bidirectional Tet promoter was indeed driving transcription of both fluorescent reporters symmetrically, we performed a series of control measurements with the N = 0 construct.
First, as shown in Figure 3.3.2a, we performed quantitative RT-PCR to measure the mRNA levels of mCherry and eYFP in bulk populations and observed that the two reporter transcript abundances were roughly the same. However, this bulk method could obscure the picture for an individual cell: in principle, the two reporters could be expressed at the same level on average even if any given cell expressed either eYFP or mCherry but not both. In this case eYFP would not be a faithful reporter of mCherry transcript levels; indeed, it could be the opposite. Second, therefore, as shown in Figure 3.3.2b, we examined the raw joint mCherry-eYFP distributions for N = 0 to ensure that the single cells clustered around a line of slope 1.

[Figure 3.3.2 plots: a) RT-PCR signals for the 0x eYFP and 0x mCherry samples; b) single-cell fluorescence data, mCherry versus eYFP (arb. units).]

Figure 3.3.2 Control experiments used to confirm that eYFP can act as a faithful reporter of mCherry transcriptional activity in individual cells. a) RT-PCR signals from N = 0 eYFP and mCherry samples are normalized to the N = 0 eYFP value, showing that the mCherry transcript level is very similar to the eYFP transcript level. b) The joint mCherry-eYFP single cell distribution agrees with the result of panel a for the case of individual cells as well.

3.4 microRNA mediated repression generates gene expression thresholds

We constructed cell lines that stably expressed the fluorescent reporter construct with either a single bulged miR-20 binding site or no site in the mCherry 3' UTR. The levels of eYFP and mCherry protein were measured for single cells using quantitative fluorescence microscopy. Arranging individual cells according to their eYFP expression level, as shown in Figure 3.4.1, we observed that cells whose mCherry 3' UTR lacks miRNA binding sites had a concomitant increase in mCherry expression. This indicates
that in the absence of miRNA targeting of the mCherry mRNA, the level of expression of eYFP is directly related to the level of expression of mCherry.

[Figure 3.4.1 images: eYFP and mCherry channels for representative single cells. mCherry/eYFP ratios, upper panel: 1.3, 0.8, 1.2, 1.2, 1.7, 2.0, 1.1, 1.0, 1.9; lower panel: 0.0, 0.0, 0.0, 0.1, 0.1, 0.1, 0.6, 1.0, 0.9.]

Figure 3.4.1 Arranging single cells according to eYFP expression level reveals gene expression thresholding by miRNA. Cell outlines are shown in yellow. Below each representative single cell is the ratio of the mean pixel intensity in the mCherry channel to that in the eYFP channel.

However, in cells with a miR-20 site in the mCherry 3' UTR, the eYFP fluorescence initially increases with no corresponding increase in mCherry expression level, as seen in Figure 3.4.1 (lower panel). To capture this behavior quantitatively, we measured joint distributions of mCherry and eYFP levels in single cells and binned the single cell data according to their eYFP levels (see Figure 3.9.1 for a more detailed outline of the binning procedure). Within each eYFP bin, we calculated the mean mCherry level; this process is outlined in Figure 3.4.2. We refer to the binned joint distribution as the transfer function. As suggested by the representative single cells shown in Figure 3.4.1, the transfer function in Figure 3.4.3 shows a threshold-linear behavior in which the mCherry level, which represents the target protein production, does not appreciably rise until the curve reaches a threshold level of eYFP.

[Figure 3.4.2 plot: transfer functions for N = 0 and N = 1; x-axis: eYFP (a.u.), 0 to 100.]

Figure 3.4.2 Transfer function relating eYFP to mCherry levels. As expected from the representative single cells shown in Figure 3.4.1, the Tet promoter must operate above a threshold transcriptional activity in order to escape from robust silencing due to miRNA-mediated repression.
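The binning step that produces a transfer function can be sketched as follows. This is a minimal illustration that assumes equal-width eYFP bins; the binning actually used is the one outlined in Figure 3.9.1, and the function name and arguments here are hypothetical.

```python
def transfer_function(eyfp, mcherry, n_bins=10):
    """Bin single cells by eYFP level and return (bin center, mean mCherry).

    eyfp, mcherry: per-cell fluorescence values of equal length
    n_bins:        number of equal-width eYFP bins (an assumption; the
                   thesis's own binning scheme may differ)
    Empty bins are skipped.
    """
    if len(eyfp) != len(mcherry):
        raise ValueError("need one eYFP and one mCherry value per cell")
    lo, hi = min(eyfp), max(eyfp)
    if hi <= lo:
        raise ValueError("eYFP values must span a non-zero range")
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for y, c in zip(eyfp, mcherry):
        # The cell with the maximal eYFP value is assigned to the last bin.
        b = min(int((y - lo) / width), n_bins - 1)
        sums[b] += c
        counts[b] += 1
    return [(lo + (b + 0.5) * width, sums[b] / counts[b])
            for b in range(n_bins) if counts[b] > 0]
```

Plotting the returned pairs for the N = 0 and N = 1 reporters on common axes gives curves of the kind described in Figure 3.4.2.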
3.5 Generating thresholds without feedback

We developed a simple mathematical model of miRNA-mediated regulation that could reproduce the nonlinearity in the above transfer function. This model, depicted in cartoon form in Figure 3.5.1, is similar to previous models [Elf 2003] used to describe protein-protein titration [Buchler 2008] and small RNA (sRNA) regulation in bacterial systems [Levine 2007]. It describes the concentration of free target mRNA (r) subject to regulation by miRNA (m). We assume that only r can be translated into protein. Experimentally, r corresponds to the mCherry signal, while r_untargeted corresponds to the eYFP signal. The core of the model involves the binding of r to m to form an mRNA-miRNA complex and the release of m from the complex back into the pool of active miRNA molecules, either with or without the accompanying destruction of r. We assume that the total amount of miRNA is fixed; experimentally we observe no decrease in the miR-20 level beyond experimental uncertainty as a function of eYFP (Figure 3.5.2).

3.5.1 Mathematical framework

In order to describe our data, we devised a simple mathematical model of the biochemistry of miRNA-mediated gene regulation. The model is largely similar to models of protein-protein interactions proposed by Buchler and Louis as well as models of sRNA regulation of expression proposed by Levine et al. The model describes the time evolution of the target mRNA free of miRNA (r) and the target mRNA bound by miRNA (r*), and assumes that the turnover of miRNA is slow compared to the timescale of gene expression so that the total miRNA level can be held constant.
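The model consists of the following set of coupled, first-order, ordinary differential equations and the conservation relation for miRNA:

$$\frac{dr}{dt} = k_R - k_{on}\, r\, [\mathrm{miRNA}] + k_{off}\, r^* - \gamma_R\, r \qquad [3.1]$$

$$\frac{dr^*}{dt} = k_{on}\, r\, [\mathrm{miRNA}] - k_{off}\, r^* - \gamma_{R^*}\, r^* \qquad [3.2]$$

$$[\mathrm{miRNA}]_T = [\mathrm{miRNA}] + r^* \qquad [3.3]$$

For the sake of simplicity, we assume that no translation can occur from the miRNA-bound target mRNA such that for the purposes of protein production it is sufficient to track only the free target mRNA (r). Solving for the steady-state level of r yields:

$$r = \frac{1}{2}\left[ r_{untargeted} - \lambda - \theta + \sqrt{\left(r_{untargeted} - \lambda - \theta\right)^2 + 4\lambda\, r_{untargeted}} \right] \qquad [3.4]$$

where:

$$r_{untargeted} = \frac{k_R}{\gamma_R}, \qquad \lambda = \frac{\gamma_{R^*} + k_{off}}{k_{on}}, \qquad \theta = \frac{\gamma_{R^*}}{\gamma_R}\,[\mathrm{miRNA}]_T$$

Just as in the Buchler and Louis and Levine et al. cases, when the dissociation constant (here denoted by λ) is small, meaning that the interaction strength is high between the miRNA and its target, it is possible to achieve a threshold-linear relationship between the free target mRNA and the total amount of mRNA (denoted by r_untargeted, which in the experiments is reported by the eYFP signal). In our case, because we allow recycling of the miRNA following destruction of its bound target mRNA, the titration effect only becomes apparent when the rate at which free miRNAs are removed from the system (k_on) is much larger than the rate at which they reappear in the system, which itself consists of two parts: unbinding of the miRNA from its target (k_off) and destruction of the target (γ_R*). In the most extreme case, for example, where k_on >> k_off + γ_R* such that λ → 0, one obtains:

$$r = \frac{1}{2}\left[ \left(r_{untargeted} - \theta\right) + \sqrt{\left(r_{untargeted} - \theta\right)^2} \right] \qquad [3.5]$$

$$\phantom{r} = \frac{1}{2}\left[ \left(r_{untargeted} - \theta\right) + \left| r_{untargeted} - \theta \right| \right] \qquad [3.6]$$

$$\phantom{r} = \begin{cases} 0 & \text{if } r_{untargeted} < \theta \\ r_{untargeted} - \theta & \text{if } r_{untargeted} > \theta \end{cases} \qquad [3.7]$$

In this limit, we see that the constant θ sets the level of expression at which the threshold takes place.

[Figure 3.5.1 schematic: the gene is transcribed at rate k_R into free mRNA (r), which is translated into protein or degraded at rate γ_R; free mRNA binds miRNA at rate k_on to form the miRNA-mRNA complex (r*), which dissociates at rate k_off or is degraded at rate γ_R*.]

Figure 3.5.1 Biochemistry of the miRNA-mediated gene regulatory system.

[Figure 3.5.2 images: northern blot with a standard curve labeled in copies/cell, probed for miR-20a and tRNA.]

Figure 3.5.2 miR-20 expression in Tet-On HeLa cells. a.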
Absolute miR-20 expression measured by northern blot. Total RNA from Tet-On HeLa cells transfected with various reporter constructs was probed for miR-20 expression compared to a standard curve of miR-20 mimic spiked into yeast RNA. tRNAgln serves as a loading control. b. Relative miR-20 expression above and below the threshold measured by RT-PCR. Cells transfected with the N = 7 target reporter or the N = 0 control reporter were sorted into low and high fractions. Total RNA was assayed for miR-20 and normalized to miR-31 as a loading control. Bar height and error bars represent the average relative normalized miR-20 value in the high fraction compared to the low fraction and the s.e.m. of three RT-PCR assays.

3.5.2 Tuning the dissociation constant λ

The qualitative shape of the transfer functions generated by the model depends on two key lumped parameters. The dissociation constant λ governs the sharpness of the threshold, as seen in Figure 3.5.3. On a log-log plot relating r to r_untargeted, as seen in Figure 3.5.3 (right panel), the increased sharpness manifests itself as a line with slope greater than 1 (we refer to this slope as the logarithmic gain). This line marks an ultrasensitive transition connecting the two branches of the transfer function that have slope 1: one indicating little protein expression (below the ultrasensitive transition) and one indicating nearly maximal protein production (above the ultrasensitive transition). λ is inversely proportional to the rate at which miRNA binds the target mRNA (k_on); as k_on increases at a constant k_off, λ decreases and thus sharpens the transition.

[Figure 3.5.3 plots: model transfer functions for low and high k_on; left panel, r versus r_untargeted; right panel, log(r) versus log(r_untargeted).]

Figure 3.5.3 Tuning the sharpness of the ultrasensitive switch by changing the rate at which miRNA bind their target mRNA, k_on.
As shown especially strikingly in the right panel, increasing k_on has a dramatic effect on the strength of repression below the transition to escape from miRNA repression, thus sharpening the transition, but does not change the transcript level needed to encounter the transition.

3.5.3 Tuning the threshold constant θ

The threshold constant θ plays a role in the placement of the threshold and also in the sharpness of the transition between the threshold and escape regimes, as seen in Figure 3.5.4. θ is proportional to the concentration of miRNA available within the cell; as the total concentration of miRNA increases, θ increases and pushes the ultrasensitive transition to higher values of r_untargeted, as depicted in Figure 3.5.4b.

[Figure 3.5.4 plots: model transfer functions for low and high [miRNA]; left panel, r versus r_untargeted; right panel, log(r) versus log(r_untargeted).]

Figure 3.5.4 Tuning both the placement and sharpness of the ultrasensitive transition by titrating different total amounts of miRNA into the system. As with Figure 3.5.3, the key features of the effects of adding miRNA into the system are best shown in the log-log transfer function (right panel). As with increasing k_on, increasing [miRNA] increases the strength of repression below the ultrasensitive transition. But unlike changes in k_on, increasing [miRNA] can also increase the threshold transcript level needed to escape beyond maximum fold-repression.

3.6 Experimentally tuning the ultrasensitive transitions

The mathematical model thus suggests experiments that could be performed to modulate the ultrasensitive transitions generated by miRNA-mediated regulation. As our stable cell lines could not achieve high enough levels of reporter expression to capture the complete ultrasensitive transition to escape from miRNA-mediated repression, we carried out the remainder of our experiments by transiently transfecting HeLa cells with reporter constructs and measuring fluorescence via flow cytometry to increase the number of cells in our datasets.
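The model predictions of Sections 3.5.2 and 3.5.3 can be explored numerically. The sketch below assumes that Equation 3.4 is the positive root of the steady-state quadratic obtained from Equations 3.1 to 3.3, that is, r = ½[(r_untargeted − λ − θ) + sqrt((r_untargeted − λ − θ)² + 4 λ r_untargeted)]. The function name and the parameter values are illustrative assumptions, not fitted values from this work.

```python
import math


def free_mrna(r_untargeted, lam, theta):
    """Steady-state free target mRNA r (assumed form of Equation 3.4).

    r_untargeted: target mRNA level without miRNA regulation, k_R / gamma_R
    lam:          lumped dissociation constant, (gamma_R* + k_off) / k_on
    theta:        threshold constant, proportional to the total miRNA level
    """
    b = r_untargeted - lam - theta
    return 0.5 * (b + math.sqrt(b * b + 4.0 * lam * r_untargeted))


if __name__ == "__main__":
    inputs = [10, 50, 100, 200, 400]
    # Illustrative (lambda, theta) pairs: a soft threshold, a sharper
    # threshold (smaller lambda), and a shifted threshold (larger theta).
    for lam, theta in [(50.0, 100.0), (1.0, 100.0), (1.0, 200.0)]:
        outputs = [round(free_mrna(x, lam, theta), 2) for x in inputs]
        print(f"lambda={lam}, theta={theta}: {outputs}")
```

In this sketch, lowering `lam` at fixed `theta` deepens repression below the threshold, whereas raising `theta` moves the threshold to higher `r_untargeted`, which is the qualitative behavior summarized in Figures 3.5.3 and 3.5.4.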
3.6.1 Increasing N in the mCherry 3'-UTR

To sharpen the transitions by increasing k_on, we engineered the 3' UTR of mCherry to increase N, the number of miRNA target sites. The maximum logarithmic gain increases smoothly from ~1 when N = 1 to 1.8 when N = 7, as shown in Figure 3.6.1; as expected from the model, the effect is stronger going from 1 to 4 binding sites than from 4 to 7 sites. We were also able to recapitulate the transfer function with N = 7 in the 3' UTR of eYFP, shown in Figure 3.6.2, thus isolating the effect to miR-20 mediated regulation rather than any property intrinsic to the mCherry reporter. Interestingly, unlike previous experiments using bacterial sRNA [Levine 2007], we can also directly test the importance of titration for generating thresholds by using miR-20 binding sites that are perfectly complementary to the endogenous miR-20, thus converting the interaction between target and miRNA into a strongly catalytic, RNAi-type repression. We observe in Figure 3.6.1 (grey points) that when the miR-20 bulged binding sites are replaced by a perfectly complementary binding site that yields the same maximum repression as N = 7 bulged sites, the ultrasensitive transition is abolished altogether.

[Figure 3.6.1 plot: log-log transfer functions for different numbers of miR-20 binding sites, including a single perfectly complementary site ("1 perfect"); x-axis: log10(eYFP).]

Figure 3.6.1 Experimentally sharpening the ultrasensitive transition by engineering differing numbers of miR-20 binding sites into the 3'-UTR of mCherry. The angular symbols denote the derivative of the transfer function at the location indicated; this derivative is referred to as the logarithmic gain, which is a key system parameter characterizing the regulatory interaction. Additionally, we can abolish the ultrasensitive response by using a miR-20 binding site that is perfectly complementary to miR-20, as shown in the dark grey points.

[Figure 3.6.2 plot: log-log transfer function for N = 7 with a slope = +1 guide line; x-axis: log10(eYFP).]

Figure 3.6.2 Dye-swap control experiment.
We observe a quantitatively similar logarithmic transfer function with an N = 7 construct engineered to contain the miR-20 binding sites in the 3'-UTR of eYFP rather than mCherry, except that the curve is reflected about the y = x line as expected. This suggests that the thresholding with ultrasensitivity can be attributed to miR-20 mediated regulation, not to any property intrinsic to mCherry.

3.6.2 Calculating ratio transfer functions to measure fold repression

To measure the fold repression as a function of target expression level, we measure the transfer function in the absence of miR-20 binding sites and calculate the ratio of this control transfer function to transfer functions in the presence of 1, 4, and 7 miR-20 sites; the results are plotted in Figure 3.6.3. As expected from Figure 3.6.1, increasing the number of binding sites increases the fold repression at lower eYFP levels, from just over 2-fold repression with a single miR-20 site to ~10-fold repression with seven miR-20 sites, while not significantly changing the fold repression at high eYFP (Figure 3.6.3). Seen this way, regulation by miR-20 is not only the subtle effect suggested by population-based averages, which in this case amount to at most 2.5-fold repression with seven binding sites (see Figure 3.6.4); it can exert very strong repression of protein production at low target transcript levels. Moreover, the boundary of the regime of strongest repression is marked by the ultrasensitive transition, so shifting this transition to lower or higher target mRNA levels can be of functional significance.

[Figure 3.6.3 plot: fold repression versus eYFP (×10⁴) for N = 1, N = 4, and N = 7.]

Figure 3.6.3 Calculating the fold repression due to miRNA as a function of target expression level. We obtain this plot by taking the ratio of the N = 0 curve to the N = 1, 4, and 7 curves.
The magnitude of the fold repression stands in stark contrast to estimates from bulk measurements as in Figure 3.6.4. The potential functional significance of the ultrasensitive transition is also highlighted in this plot, as it is precisely this transition that allows the regulatory system to tune through virtually all magnitudes of fold-repression just by regulating the target expression level.

[Figure 3.6.4 plot: bulk fold repression for N = 1, N = 4, and N = 7.]

Figure 3.6.4 Bulk level measurements of miR-20 mediated repression. The levels of repression observed using our fluorescent reporter system are on average similar to those previously observed. These data were calculated from flow cytometry: we compute the ratio of the mean eYFP level to the mean mCherry level for N = 1, 4, and 7. We then normalize this ratio by the mean eYFP to mean mCherry ratio for N = 0; we refer to this normalized ratio as the fold repression. Error bars are estimated by bootstrapping from the single cell flow cytometry data.

3.6.3 Changing [miR-20]total by transfecting mimic siRNA

Consistent with the model, the ultrasensitive transition can be shifted to either higher or lower eYFP levels by transfecting either miR-20 mimic oligonucleotides (siRNAs) or miRNA sponges that inhibit miR-20 activity [Ebert 2007] (Figure 3.6.5; Figure 3.6.6). Increasing the level of miRNA increased the fold-repression below the threshold, the threshold mRNA level needed for protein expression, and the sharpness of the transition. In the extreme case of 7 miR-20 binding sites with 30 nM miR-20 mimic transfected (Figure 3.6.5, right panel), miRNA-mediated repression can achieve ~40-fold repression compared to a target with no miRNA binding site; the threshold is shifted to a 10-fold higher eYFP level; and the transition between repressed and unrepressed expression is quite sharp, with a maximum logarithmic gain of ~5.4 (Figure 3.6.5, right panel), compared to ~1.8 without the transfected miR-20 mimic, i.e. at endogenous levels (Figure 3.6.1).
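Both quantities used in this section can be computed directly from binned transfer functions: the fold repression is the ratio of the N = 0 transfer function to the transfer function with binding sites, and the logarithmic gain is the slope of the log-log transfer function. The sketch below is a minimal illustration. The function names are hypothetical, a finite difference between adjacent bins stands in for whichever derivative estimate was actually used, and both transfer functions are assumed to share the same eYFP bins and to contain only positive values.

```python
import math


def fold_repression(control, targeted):
    """Ratio of the control (N = 0) transfer function to a targeted one.

    control, targeted: mean mCherry levels in matching eYFP bins.
    """
    if len(control) != len(targeted):
        raise ValueError("transfer functions must share the same bins")
    return [c / t for c, t in zip(control, targeted)]


def logarithmic_gain(eyfp, mcherry):
    """Finite-difference estimate of d log(mCherry) / d log(eYFP).

    Returns one slope per pair of adjacent bins; all values must be
    positive and the eYFP bin centers strictly increasing.
    """
    gains = []
    for i in range(len(eyfp) - 1):
        dlog_x = math.log(eyfp[i + 1]) - math.log(eyfp[i])
        dlog_y = math.log(mcherry[i + 1]) - math.log(mcherry[i])
        gains.append(dlog_y / dlog_x)
    return gains


def max_logarithmic_gain(eyfp, mcherry):
    """Largest local slope of the log-log transfer function."""
    return max(logarithmic_gain(eyfp, mcherry))
```

A maximum gain near 1 corresponds to a proportional response, while values well above 1 indicate an ultrasensitive transition.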
[Figure 3.6.5 plots: log-log transfer functions with model fits (x-axis: log10(eYFP)). Legend entries include N = 0, N = 7, and reporters with added miR-20 mimic at concentrations up to 30 nM; annotated maximum logarithmic gains are 3.2 and 5.4.]

Figure 3.6.5 Tuning the placement and the sharpness of the threshold by titrating the amount of miR-20 available to the gene regulatory system. As expected from the theoretical results, increasing the amount of miR-20 molecules in the system greatly increases both the maximal fold repression (which reaches 40-fold in the case of N = 7 with 30 nM miR-20 added) and the sharpness with which the transfer function snaps onto the unrepressed regime (with a maximal logarithmic gain of 5.4 in the case of N = 7).

[Figure 3.6.6 plots, panels a-e: log-log transfer functions for reporters cotransfected with a control sponge or a miR-20 sponge; x-axis: log10(eYFP).]

Figure 3.6.6 miR-20 sponge experiments shift the ultrasensitive regime to lower eYFP levels, as expected from the mathematical model. a-e) Transfer functions resulting from cotransfection of the indicated reporter system (N = 0, 1, 1 perfect, 4, and 7 respectively) with the indicated sponge construct.

To quantitatively compare the data to the model, we simultaneously fit all the datasets, holding λ constant across the fits for a particular N and holding θ constant for a particular amount of transfected siRNA mimic. Interestingly, we see that the fit parameter θ increases with increasing siRNA mimic (Figure 3.6.7, left panel), but in a saturable fashion, while 1/λ increases linearly with N (Figure 3.6.7, right panel). This suggests that the amount of transfected miRNA entering functional complexes is limited by entry into the cytoplasm or availability of miRNP components.
Figure 3.6.7 Results from simultaneous fitting of model to experimental data (left panel x-axis: [miRNA]transfected in nM; right panel x-axis: N). The fitting results suggest that our manipulations to tune the regulatory system behaved as we expected: titrating in miR-20 mimicking siRNA increased the threshold constant, while increasing N increased 1/λ. It is possible that the deviation of fitted 1/λ from the straight line is a signature of cooperative binding of miR-20 molecules to target mRNAs, but further study is needed to explore this.

3.6.4 eYFP mRNA abundance at the threshold

To gauge whether the transitions occur at physiologically relevant target mRNA levels, we sought to measure the eYFP transcript abundance at which the ultrasensitive transition begins. Using cell sorting via flow cytometry, we could isolate cellular subpopulations that exhibit only below-threshold and only above-threshold gene expression, as seen in Figure 3.6.8. We can then perform RT-PCR on eYFP mRNA transcripts from these two subpopulations to estimate the threshold mRNA abundance. The data suggest that the threshold transition occurs at approximately 100 target mRNAs per cell with seven typical sites in the 3' UTR at an endogenous level of approximately 2,000 miR-20 per cell, shown in Figure 3.6.8 (lower panel).

[Figure 3.6.8 panels: FACS sort gates plotted against YFP (log scale), and estimated mRNAs per cell by sorted fraction: N=0 low, 56 +/- 36; N=0 high, 1066 +/- 472.]

Figure 3.6.8 Estimating the mRNA abundance at the threshold generated by miR-20 mediated regulation.
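The conversion from an RT-PCR readout to transcripts per cell can be sketched with a standard curve. This is a minimal sketch under stated assumptions: the threshold cycle (Ct) is taken to be linear in log10 of the standard copy number, the function names are hypothetical, the numbers are synthetic, and losses in RNA recovery or reverse transcription are ignored. It is not a description of the exact calculation used for Figure 3.6.8.

```python
import math


def fit_standard_curve(copies, ct):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_per_cell(ct_sample, slope, intercept, n_cells):
    """Invert the standard curve and divide by the number of sorted cells."""
    copies = 10 ** ((ct_sample - intercept) / slope)
    return copies / n_cells


# Synthetic standards lying exactly on Ct = 40 - 3.32 * log10(copies).
std_copies = [1.0e3, 1.0e4, 1.0e5, 1.0e6]
std_ct = [40.0 - 3.32 * math.log10(c) for c in std_copies]
slope_demo, intercept_demo = fit_standard_curve(std_copies, std_ct)

# A hypothetical sample with Ct = 23.4 from 1000 sorted cells.
per_cell_demo = copies_per_cell(23.4, slope_demo, intercept_demo, n_cells=1000)
```

With these synthetic inputs the sample corresponds to about 1e5 copies in total. Real standards scatter around the line, so the uncertainty of the fit would need to be propagated into the per-cell estimate.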
3.7 Observing ultrasensitivity in physiological contexts

In order to test the generality of these findings, namely that the strength of repression of a miRNA target depends strongly on the relative amounts of the miRNA and its target, we sought to recapitulate the results in more physiological settings.

3.7.1 Fusing natural 3'-UTRs to mCherry

First, we tested whether similar ultrasensitive transitions would be observed when the reporter construct incorporated naturally occurring miRNA binding sequences, by fusing the 3' UTRs of the oncogene HMGA2 and the major GABA transporter gene SLC6A1 to the mCherry reporter and performing dual-color FACS. The HMGA2 3' UTR contains seven binding sites for the miRNA family let-7, which is abundant in HeLa cells, while SLC6A1 contains three binding sites for the neuronal miRNA miR-218, which we supplied exogenously. The experiments, whose results are depicted in Figure 3.7.1, showed that we could indeed observe ultrasensitive transitions with these constructs; for HMGA2, we increased the ultrasensitive threshold incrementally by transfecting higher doses of let-7 siRNA mimic (Figure 3.7.1).

[Figure 3.7.1 panels a and b; x-axis: log10(eYFP). Legend entries include mutant HMGA2 3' UTR, HMGA2 3' UTR with and without let-7 mimic, and SLC6A1 3' UTR + 30 nM miR-218 mimic.]

Figure 3.7.1 Detecting ultrasensitive transitions with natural UTRs. a) Using the 3'-UTR from the gene HMGA2, which contains 7 binding sites for the miRNA let-7, we observe that we can tune the system to display a clear signature of ultrasensitivity (the slope of the log-log transfer function exceeds the slope = 1 guide to the eye shown in grey) when adding increasing amounts of let-7 mimic siRNA. b) The 3'-UTR from the neuronal gene SLC6A1 also shows a clear ultrasensitive transition when we add its targeting miRNA miR-218.
3.7.2 Luciferase assays in mouse embryonic stem cells

Finally, we used a standard dual-luciferase assay, shown in schematic form in Figure 3.7.2, to measure target expression in mouse embryonic stem cells (ES cells) using only their endogenous pool of miRNA to retain physiological relevance. Furthermore, we measured a transfer function complementary to that in the experiments with HeLa cells: the mRNA target level remained fixed while the miRNA concentration varied. To test varying miRNA concentrations, we exploited the fact that different miRNA species are present at different abundances in ES cells. Finally, to gauge the strength of miRNA repression, target expression in wild-type ES cells was normalized to target expression in ES cells that lack the enzyme Dicer and thus contain no miRNAs.

[Figure 3.7.2 schematic: R-luc reporters carry N = 2 bulged miRNA sites or CXCR4 control sites in the 3' UTR, with F-luc as the loading control. Expression of the construct with miRNA sites is measured relative to the construct with CXCR4 control sites in Dcr+/+ and Dcr-/- cells, and fold repression = (relative expression in Dcr-/-) / (relative expression in Dcr+/+).]

Figure 3.7.2 Dual luciferase assay system used to measure miRNA mediated repression in populations of mouse embryonic stem cells.

We observe a similar threshold-linear curve in Figure 3.7.3, except that here it reflects the level of miRNA: at high miRNA abundances, fold-repression is 5-fold, but it decreases with decreasing miRNA abundance until, at the lowest miRNA abundances, target expression in wild-type cells is virtually indistinguishable from that in the miRNA-free Dcr-/- cells.

Figure 3.7.3 Fold repression increases as a function of miRNA abundance in mouse embryonic stem cells (x-axis: [miRNA] x 10^3 per cell). The miRNA abundance is estimated from semi-quantitative Northern blots, while the fold repression is measured using the dual luciferase assay detailed in Figure 3.7.2.
See section 3.9 for more detailed methods. Data courtesy of Grace Zheng.

The threshold in regulation by miRNA is determined by the level of the miRNA and by the number and affinity of the target sites. Many of these miRNAs, as miRNP complexes, could be bound to the endogenous miR-20 target mRNAs in the cell, leaving a limited pool for binding to the reporter mRNAs. Since these experiments are done at steady-state conditions, this suggests that the miRNA system has very limited capacity to accommodate increases in target populations. These results are consistent with previous observations using "sponges" to suppress the activities of a family of miRNAs. Here, expression of high levels of miR-20 target sites from an exogenously added sponge construct strongly suppressed miR-20 regulation of the target reporter; if endogenous miR-20 target mRNA production were to increase substantially, some escape from miR-20 repression would also be expected (as seen in Figure 3.6.6). The sponge phenomenon has been observed in multiple mammalian and non-mammalian organisms, indicating the general nature of this threshold behavior for miRNA regulation.

3.8 Discussion

Our analysis of miRNA-mediated gene regulation at high target expression levels is consistent with previous bulk results, but measuring single cells offers a level of detail inaccessible to population-based assays. The detailed picture, which revealed an ultrasensitive response bounded by a high degree of repression at low target mRNA levels and little repression at high levels of target mRNA, may have important implications for miRNA-mediated regulation. There has been a disparity between the concept of miRNAs as switches, exemplified by the lin-14 developmental switch in Caenorhabditis elegans, where there is a high degree of repression by the miRNA lin-4, and many observations of miRNA-mediated regulation in mammalian cells, where miRNAs are best considered fine-tuners of gene expression.
These results show that for any given miRNA-target interaction, the miRNA behaves both as a switch, in the target expression regime below the threshold, and as a fine-tuner, in the ultrasensitive transition between the threshold and the minimal repression regime at high mRNA levels. This model is consistent with miRNAs providing robustness to systems. Target mRNAs that were transcribed at low levels and/or only transiently would be strongly repressed, but upon increased and sustained expression the system could produce a rapid and irreversible transition to a stable state expressing high levels of the target protein.

The target expression thresholds generated by miRNAs could be important in development. Ultrasensitivity characterizes developmental switches such as cell fate decisions. To maintain their identity, differentiated cells must be able to distinguish between leaky and legitimate transcripts. Consistent with this, miRNAs are known to participate in feedback and feed-forward networks [Tsang 2007], and miRNA-mediated feedback networks have been implicated in imparting robustness in developing embryos [Stark 2005]. Molecular titration by tissue-specific miRNAs could set a threshold below which transcripts would be treated as leaky. Such a phenomenon is consistent with the observed tendency of mammalian miRNAs induced upon differentiation to target mRNAs that were highly expressed in the previous developmental stage [Farh 2005], and with the reported tendency of Drosophila miRNAs to target mRNAs that are highly expressed in neighboring tissues derived from a common progenitor [Stark 2005]. The ultrasensitive transition would minimize the range of uncertainty between leaky and legitimate. Decisive on-off regulation of gene expression is necessary in differentiation and in the continual reinforcement of cell/tissue identity throughout the life of the animal.

3.9 Methods

Fluorescent reporters were cloned into pTRE-Tight-BI (Clontech).
NLS sequences were appended to the N-terminus of the eYFP and mCherry ORFs by PCR. The NLS-eYFP was inserted with EcoRI and NdeI. The NLS-mCherry was inserted with BamHI and ClaI. Regulatory elements were placed into the eYFP 3' UTR with NdeI and XbaI; they were placed into the mCherry 3' UTR with ClaI and EcoRV. 4x and 7x miR-20 sites were PCR-amplified from miR-20 sponge constructs. All constructs were sequence-confirmed. HMGA2 w.t. and seed-mutant UTRs were a gift from Christine Mayr, David Bartel lab. The SLC6A1 3' UTR fragment was PCR-amplified from human genomic DNA.

Generation of stable lines

Reporter plasmids were linearized with AseI and cotransfected at a 20:1 ratio with linear puromycin marker (Clontech). Transfected cells were selected in 2.5 ug/ml puromycin with 200 ug/ml G418. Individual eYFP-positive colonies were isolated, grown, and sorted for eYFP-positivity upon dox induction (MoFlo instrument).

Fluorescence microscopy

Cells were plated on glass-bottomed Nunc chambers (#1), induced with dox for 4 days, and imaged on a Nikon TEI-2000 inverted fluorescence microscope with a Princeton Instruments Pixis back-cooled CCD camera. Images were processed using custom software in MATLAB. Briefly, following subtraction of camera background and any cellular autofluorescence, pixel values in both eYFP and mCherry channels corresponding to cells expressing the construct were extracted. The single-cell data were then binned along the eYFP axis. Figure 1d reports the result of this binning procedure; the error bars are the standard errors of the mean within the corresponding bin.

[Figure 3.9.1 panels: subtract autofluorescent background; calculate mean mCherry in each eYFP bin. x-axis: log10(eYFP).]

Figure 3.9.1 Binning procedure used to convert joint mCherry-eYFP single cell distributions into transfer functions. The example shown here is from a typical flow cytometry experiment.
The raw data in the first panel are first subjected to background subtraction. Then all the cells between two particular values of eYFP, as the schematic red column indicates in the second panel, are analyzed for their mCherry intensity value. The mean mCherry intensity is then plotted as a point in the third panel; the collection of all such points makes up the full transfer function depicted in the third panel.

Transient transfection

Tet-On HeLa cells (Clontech) below passage 10 were plated in media containing 200 ug/ml G418 (Gibco) and 1 ug/ml doxycycline (Sigma) in 12-well dishes the day before transfection. Reporter plasmids were diluted 1/50 in pUC18b carrier plasmid (Qiagen HiSpeed maxipreps) and mixed with DreamFect Gold (Oz Biosciences) at 4:1 ul reagent: ug DNA. miR-20a, let-7b, and miR-218 mimics (Dharmacon) were cotransfected at the indicated concentrations. For U6 sponge assays, reporter plasmids were diluted 1/50 in sponge plasmid. Media was changed 24hr post-transfection. Assays were performed 48hr post-transfection.

Flow cytometry

Cells were run on an LSRII analyzer (Becton Dickinson) with FACSDiva software. The raw FACS data were analyzed with FlowJo to gate cells according to their forward (FSC-A) and side (SSC-A) scatter profiles; specifically, we chose cells near the peak of the (FSC-A, SSC-A) distribution. Untransfected cells were used to characterize the cellular autofluorescence in the LSRII analyzer, from which we obtained the mean and standard deviation of the autofluorescence distribution. The mean autofluorescence plus twice the standard deviation was subtracted from each cell's eYFP and mCherry fluorescence values. Following background subtraction, cells with eYFP fluorescence levels less than 0 (i.e. indistinguishable from background) were excluded from further analysis, and mCherry fluorescence levels less than 0 were set equal to 0. The single-cell data were then binned in the same manner as described above.
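The background subtraction and binning steps described above can be sketched as follows. This is a minimal illustration rather than the MATLAB or FlowJo workflow actually used: the function names, the bin width, the use of the sample standard deviation, and the choice to skip bins with fewer than two cells are all assumptions, and the demonstration values are synthetic.

```python
import math
import statistics


def background_level(untransfected):
    """Mean plus twice the standard deviation of the autofluorescence signal.

    statistics.stdev is the sample standard deviation; which form was used
    originally is an assumption.
    """
    return statistics.mean(untransfected) + 2 * statistics.stdev(untransfected)


def subtract_background(cells, eyfp_bg, mcherry_bg):
    """Apply the exclusion and clipping rules to (eYFP, mCherry) pairs."""
    cleaned = []
    for eyfp, mcherry in cells:
        e = eyfp - eyfp_bg
        if e < 0:  # indistinguishable from background: drop the cell
            continue
        m = max(mcherry - mcherry_bg, 0.0)  # negative mCherry is set to 0
        cleaned.append((e, m))
    return cleaned


def bin_transfer_function(cleaned, bin_width=0.1):
    """Mean mCherry (with standard error) in bins of log10(eYFP).

    Returns a list of (bin_center, mean_mcherry, sem) sorted by bin center.
    Cells with eYFP exactly 0 cannot be placed on a log axis and are skipped,
    as are bins holding fewer than two cells.
    """
    bins = {}
    for e, m in cleaned:
        if e <= 0:
            continue
        index = math.floor(math.log10(e) / bin_width)
        bins.setdefault(index, []).append(m)
    curve = []
    for index in sorted(bins):
        values = bins[index]
        if len(values) < 2:
            continue
        sem = statistics.stdev(values) / math.sqrt(len(values))
        curve.append(((index + 0.5) * bin_width, statistics.mean(values), sem))
    return curve


# Synthetic example values (not measured data).
demo_cells = [(160.0, 60.0), (210.0, 40.0), (5.0, 70.0),
              (1510.0, 550.0), (2010.0, 750.0)]
demo_cleaned = subtract_background(demo_cells, eyfp_bg=10.0, mcherry_bg=50.0)
demo_curve = bin_transfer_function(demo_cleaned, bin_width=0.5)
```

Each entry of `demo_curve` is one point of the transfer function: the center of a log10(eYFP) bin, the mean mCherry in that bin, and its standard error. Binning on a logarithmic eYFP axis is a choice made here to match the log-log plots; bins of equal width on a linear axis would be an equally valid reading of the procedure.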
Fluorescence-activated cell sorting

Cells transfected with the N=0 or N=7 reporter were sorted 48hr post-transfection into low and high fractions using a MoFlo high-speed sorting instrument (DAKOCytomation). Cell pellets were washed and snap-frozen before RNA isolation.

RT-PCR

Total RNA was harvested using the RNeasy Micro Plus kit with the protocol modified for inclusion of small RNAs (Qiagen). RNA was treated with DNaseI (Ambion) and reverse-transcribed with oligo-dT primer using MMLV RTase (Ambion). qPCR for mCherry and eYFP was performed in triplicate reactions using SYBRGreen mix (Applied Biosystems), run on an Applied Biosystems 7500 Real-Time PCR instrument. Single-stranded DNA standards spiked into untransfected cell cDNAs were used for estimation of mCherry mRNAs per cell. miR-20 was measured with the miScript RT-PCR assay (Qiagen) in quadruplicate reactions using miR-31 and snoRNA as controls.

Small RNA Northern blot

24 ug of total RNA from transfected cells was run on a 12% polyacrylamide gel (UreaGel system, National Diagnostics), with miR-20 mimic as a standard, spiked into yeast sheared total RNA (Ambion). The blot was probed for miR-20a and tRNA(Gln) as loading control.

mES cell luciferase assays

Reporters were constructed by insertion of two bulged binding sites into the 3' UTR of CMV Renilla luciferase. Cells were transfected in triplicate in 24-well plates with 2 ul Lipofectamine 2000 (Invitrogen), 0.01 ug of CMV-Renilla plasmid, 0.1 ug of pGL3 (Promega), and 0.69 ug of pWS (carrier plasmid). Cells were lysed and assayed 24hr post-transfection by Dual Luciferase reporter assay (Promega) on a Glomax 20/20 luminometer (Promega).

Chapter 4 Conclusion and Perspectives

In this thesis, I have attempted to address a simple question that confronts any living cell: how does it resolve the tension between remaining unchanged in the face of the random forces constantly confronting it and changing when circumstances demand?
The tension exists because either extreme of behavior would lead to suboptimal outcomes for any given challenge the cell faces. If the cell changed rapidly with every fluctuating input, it would likely devote excessive resources to responding to minor challenges or, worse, switch on a gene expression program in response to an aberrant signal generated by stochastic fluctuations, a particular danger during the development of a multicellular organism. On the other hand, if the cell robustly maintained its dynamical behavior despite real changes in its environment, changes that cannot be averaged out and must instead be responded to, then its existence could be threatened. Clearly the cell must be able to balance these conflicting goals.

The case of microRNA mediated gene regulation provides a simple, intuitive example of how the cell can use simple regulatory components - specifically the nonlinearities in the biochemistry of these components - to balance these goals. By examining how single cells employ microRNA mediated regulation, we were able to observe that microRNAs establish gene expression thresholds: transcriptional activity must reach a critical threshold level before appreciable protein product is generated for the cell. The thresholding phenomenon thus establishes two regimes in the gene expression system: below the threshold, expression is robustly switched off, while above the threshold the system can explore every possible degree of repression until escaping from microRNA mediated regulation altogether. In this case, the balance between robustness and tunability is achieved directly by biochemical details: the affinity of the microRNA for its target and the relative abundance of the microRNA and its target determine the strength of the nonlinearities in the interaction, which in turn mark the decision boundary between robust and tunable expression regimes.
Very strong binding and very abundant levels of microRNA compared to target mRNA levels most clearly delineate when the cell will robustly keep expression silenced and when it will allow proteins to be produced.

While microRNA mediated regulation provided an example of how to be robust yet tunable in the face of biochemical fluctuations, the case of the yeast osmosensing pathway provides a concrete example of how simple networks can achieve this balance against genetic perturbations. By conducting a systematic complementation study in which we substituted orthologs for endogenous HOG pathway genes, we observed that the effects of the large-scale genetic changes introduced into the pathway were readily observable only in the downstream MAPK module rather than the upstream phosphorelay module. In this case the balance between robustness and tunability is achieved by decomposition: in the context of the HOG pathway, the input/output relationship of the phosphorelay module is relatively robust to changes in the sequence of its constituent genes, while that of the MAPK module can be tuned genetically. As in the microRNA system, our modeling suggests that the decision boundary in this case is due to biochemistry. According to our modeling, stoichiometric signaling, as exemplified by the phosphorelay module, is more robust to changes in kinetic rates than catalytic signaling because its steady states are entirely determined by the inflow and outflow of the signaling molecule (phosphate in our case), with all internal details suppressed. Catalytic signaling, as exemplified by the MAPK module, with its potential for signal amplification, can be tuned by each rate constant in the reaction scheme and thus by any change in protein sequence that can affect any of those rates.

There are clearly many other routes toward achieving this balance, some of which have been explored but some of which are in need of attention.
Arguably the best explored of these other routes, at least in the systems biology literature, is positive feedback, especially when it generates bistability. Taking an analogy from statistical mechanics, as is commonly done in visualizing bistable systems, imagine the system has two states separated by a potential energy barrier. The most natural way to construct a robust yet tunable system from this picture is for the high state to have a relatively flat profile:

[Sketch: potential energy landscape plotted against output (e.g. gene expression).]

In such a system, the potential energy barrier marks the decision boundary for robustly keeping the output of the system in its low state, while the flat landscape above the barrier allows the system to be tuned to a variety of values in the high state. Positive feedback can even be used in conjunction with other regulatory schemes, such as delayed negative feedback, to help make the dominant dynamics of those other regulatory schemes, such as oscillation, more robust and tunable (see Section 1.5 for examples).

Another way to achieve this balance is to exploit the potential for combinatorial control of a given process. Downstream gene expression systems can ignore the presence of a particular signaling molecule unless sensitized by the presence of a separate signaling molecule, after which the amount of the first signaling molecule can dictate the magnitude of the transcriptional response. Such a principle could underlie the concept of "danger" signals in the immune system, which is very clearly an example of a system that must be robustly silenced when its functions are not needed by the organism but must then be tuned to give an appropriate response, neither too vigorous nor too weak.

One aspect of this balance between robustness and tunability that has been clearly underexplored experimentally is the case of spatially structured systems. These issues have begun to be explored in some studies in developmental biology [Gregor 2007, Ben-Zvi 2010].
For example, spatial averaging could play a role in determining the robustness of pattern formation systems to spatial fluctuations: the number of cells over which a particular molecular abundance is averaged can determine the length scale over which a spatial fluctuation will persist. Robust averaging below this length scale can smooth out unwanted fluctuations but still allow patterns to be sculpted at longer length scales.

The tension between robustly ignoring perturbations and tunably responding to them when necessary is among the core tensions in the life of the cell. The studies presented here add two new examples of how cells exhibit both robust and tunable behaviors. We hope this and the work of our colleagues will inspire future studies addressing these two aspects of biology.

References

Acar M, Becskei A, van Oudenaarden A. Enhancement of cellular memory by reducing stochastic transitions. Nature 435(7039): 228-232 (2005)
Acar M, Mettetal JT, van Oudenaarden A. Stochastic switching as a survival strategy in fluctuating environments. Nat. Genet. 40: 471-475 (2008)
Albertyn J, Hohmann S, Thevelein JM, Prior BA. GPD1, which encodes glycerol-3-phosphate dehydrogenase, is essential for growth under osmotic stress in Saccharomyces cerevisiae, and its expression is regulated by the high-osmolarity glycerol response pathway. Mol Cell Biol. 14: 4135-4144 (1994)
Alon U, Surette MG, Barkai N and Leibler S. Robustness in bacterial chemotaxis. Nature 397: 168-171 (1999)
Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet 8: 450-461 (2007)
Anderson JC, Clarke EJ, Arkin AP, Voigt CA. Environmentally controlled invasion of cancer cells by engineered bacteria. J Mol Biol. 355, 619-627 (2006)
Antunes MS, Morey KJ, Tewari-Singh N, Bowen TA, Smith JJ, Webb CT, Hellinga HW, Medford JI. Engineering key components in a synthetic eukaryotic signal transduction pathway. Mol Syst Bio 5: 270 (2009)
Armbruster BN, Li X, Pausch MH, Herlitze S, Roth BL.
Evolving the lock to fit the key to create a family of G protein-coupled receptors potently activated by an inert ligand. PNAS 104, 5163-5168 (2007)
Atkinson MR, Savageau MA, Myers JT, Ninfa AJ. Development of genetic circuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113(5): 597-607 (2003)
Baek D, Villen J, Shin C, Camargo FD, Gygi SP, Bartel DP. The impact of microRNAs on protein output. Nature 455: 64-71 (2008)
Bagga S, Bracht J, Hunter S, Massirer K, Holtz J, Eachus R, Pasquinelli AE. Regulation by let-7 and lin-4 miRNAs results in target mRNA degradation. Cell 122: 553-563 (2005)
Balagadde FK, Song H, Ozaki J, Collins CH, Barnet M, Arnold FH, Quake SR, You L. A synthetic Escherichia coli predator-prey ecosystem. Mol Syst Biol 4: 187 (2008)
Barkai N, Leibler S. Robustness in simple biochemical networks. Nature 387, 913-917 (1997)
Bartel DP, Chen CZ. Micromanagers of gene expression: the potentially widespread influence of metazoan microRNAs. Nat Rev Genet. 5(5): 396-400 (2004)
Bashor CJ, Helman NC, Yan S, Lim WA. Using engineered scaffold interactions to reshape MAP kinase pathway signaling dynamics. Science 319(5869): 1539-1543 (2008)
Basu S, Mehreja R, Thiberge S, Chen MT, Weiss R. Spatiotemporal control of gene expression with pulse-generating networks. PNAS 101(17): 6355-6360 (2004)
Basu S, Gerchman Y, Collins CH, Arnold FH, Weiss R. A synthetic multicellular system for programmed pattern formation. Nature 434(7037): 1130-1134 (2005)
Batchelor E, Goulian M. Robustness and the cycle of phosphorylation and dephosphorylation in a two-component regulatory system. Proc. Natl. Acad. Sci. USA 100: 691-696 (2002)
Becskei A and Serrano L. Engineering stability in gene networks by autoregulation. Nature 405: 590-593 (2000)
Becskei A and Serrano L. Positive feedback in eukaryotic gene networks: cell differentiation by graded to binary response conversion. EMBO J.
20(10): 2528-2535 (2001)
Becskei A, Kaufmann BB, van Oudenaarden A. Contributions of low molecule number and chromosomal positioning to stochastic gene expression. Nat. Genet. 37, 937-944 (2005)
Behar M, Dohlman HG, Elston TC. Kinetic insulation as an effective mechanism for achieving pathway specificity in intracellular signaling networks. PNAS 104, 16146-16151 (2007)
Beisel CL, Bayer TS, Hoff KG, Smolke CD. Model-guided design of ligand-regulated RNAi for programmable control of gene expression. Mol Syst Biol 4: 224 (2008)
Ben-Zvi D, Barkai N. Scaling of morphogen gradients by an expansion-repression integral feedback control. Proc Natl Acad Sci 107(15): 6924-6929 (2010)
Bernstein E, Kim SY, Carmell MA, Murchison EP, Alcorn H, Li MZ, Mills AA, Elledge SJ, Anderson KV, Hannon GJ. Dicer is essential for mouse development. Nat Genet 35(3): 215-217 (2003)
Bhattacharyya RP, Remenyi A, Yeh BJ, Lim WA. Domains, motifs, and scaffolds: the role of modular interactions in the evolution and wiring of cell signaling circuits. Ann Rev Biochem 75: 655-680 (2006)
Borneman AR, Gianoulis TA, Zhang ZD, Yu H, Rozowsky J, Seringhaus MR, Wang LY, Gerstein M, Snyder M. Divergence of transcription factor binding sites across related yeast species. Science 317: 815-819 (2007)
Brenner K, Karig DK, Weiss R, Arnold FH. Engineered bidirectional communication mediates a consensus in a microbial biofilm consortium. PNAS 104(44): 17300-17304 (2007)
Brewster JL, de Valoir T, Dwyer ND, Winter E, Gustin MC. An osmosensing signal transduction pathway in yeast. Science 259: 1760-1763 (1993)
Bridgham JT, Carroll SM, Thornton JW. Evolution of hormone-receptor complexity by molecular exploitation. Science 312, 97-101 (2006)
Buchler NE, Gerland U, Hwa T. On schemes of combinatorial transcription logic. PNAS 100, 5136-5141 (2003)
Buchler N and Louis M. Molecular titration and ultrasensitivity in regulatory networks. J Mol Biol. 384(5): 1106-1119 (2008)
Bulter, T. et al.
From the Cover: Design of artificial cell-cell communication using gene and metabolic networks. Proc. Natl. Acad. Sci. U. S. A. 101, 2299-2304 (2004)
Cai L, Friedman N, Xie XS. Stochastic protein expression in individual cells at the single molecule level. Nature 440, 358-362 (2006)
Cambridge SB, Geissler D, Keller S, Curten B. A caged doxycycline analogue for photoactivatable gene expression. Angew Chem Int Ed. 45: 2229-2231 (2006)
Cantone I, Marucci L, Iorio F, Ricci MA, Belcastro V, Bansal M, Santini S, di Bernardo M, di Bernardo D, Cosma MP. A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches. Cell 137(1): 172-181 (2009)
Chen C-Z, Li L, Lodish HF, Bartel DP. MicroRNAs modulate hematopoietic lineage differentiation. Science 303(5654): 83-86 (2004)
Chen M-T, Weiss R. Artificial cell-cell communication in yeast Saccharomyces cerevisiae using signaling elements from Arabidopsis thaliana. Nat. Biotech. 23: 1551-1555 (2005)
Cherry JM, Adler C, Ball C, Chervitz SA, Dwight SS, Hester ET, Jia Y, Juvik G, Roe T, Schroeder M, et al. SGD: Saccharomyces Genome Database. Nucleic Acids Research 26: 73-79 (1998)
Chuang JS, Rivoire O, Leibler S. Simpson's paradox in a synthetic microbial system. Science 323(5911): 272-275 (2009)
Cimmino A, Calin GA, Fabbri M, Iorio MV, Ferracin M, Shimizu M, et al. miR-15 and miR-16 induce apoptosis by targeting Bcl2. Proc Natl Acad Sci. 102(39): 13944-13949 (2005)
Cironi P, Swinburne IA, Silver PA. Enhancement of cell type specificity by quantitative modulation of a chimeric ligand. J. Biol. Chem. 283: 8469-8476 (2008)
Conrad ED, Tyson JT. Modeling molecular interaction networks with nonlinear ordinary differential equations. System Modeling in Cellular Biology, ed. Z. Szallasi, J. Stelling and V. Periwal: 116-118, MIT Press (2006)
Cox RS 3rd, Surette MG, Elowitz MB. Programming gene expression with combinatorial promoters. Mol. Syst. Bio. 3: 145 (2007)
Cruz FG, Koh JT, Link KH.
Light activated gene expression. JACS 122: 8777-8778 (2000)
Davidson EA. The regulatory genome: gene regulatory networks in development and evolution. Academic Press. (2006)
Davidson EA, Ellington AD. Synthetic RNA circuits. Nat Chem Biol 3, 23-28 (2007)
Deans TL, Cantor CR, Collins JJ. A tunable genetic switch based on RNAi and repressor proteins for regulating gene expression in mammalian cells. Cell 130, 363-372 (2007)
Desai SK, Gallivan JP. Genetic screens and selections for small molecules based on a synthetic riboswitch that activates protein translation. JACS 126: 13247-13254 (2004)
de Visser JAGM, Hermisson J, Wagner GP, Ancel Meyers L, Bagheri-Chaichian H, Blanchard JL, Chao L, Cheverud JM, Elena SF, Fontana W, et al. Perspective: Evolution and detection of genetic robustness. Evolution 57: 1959-1972 (2003)
Dueber JE, et al. Synthetic protein scaffolds provide modular control over metabolic flux. Nature Biotechnology 27: 753-759 (2009)
Dugave C and Demange L. Cis-trans isomerization of organic molecules and biomolecules: implications and applications. Chem Rev 103: 2475-2532 (2003)
Ebert MS, Neilson JR, Sharp PA. MicroRNA sponges: competitive inhibitors of small RNAs in mammalian cells. Nature Methods 4(9): 721-726 (2007)
Elf J, Paulsson J, Berg OG, Ehrenberg M. Near-critical phenomena in intracellular metabolite pools. Biophys J 84: 154-170 (2003)
Ellis T, Wang X, Collins JJ. Diversity-based, model-guided construction of synthetic gene networks with predicted functions. Nat Biotech 27(5): 465-471 (2009)
Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptional regulators. Nature 403(6767): 335-338 (2000)
Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science 297(5584): 1183-1186 (2002)
Esau C, Kang X, Peralta E, Hanson E, Marcusson EG, et al. MicroRNA-143 regulates adipocyte differentiation. J. Biol Chem. 279: 52361-52365 (2004)
Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP.
The widespread impact of mammalian microRNAs on mRNA repression and evolution. Science. 310(5755): 1817-21 (2005)
Ferrigno P, Posas F, Koepp D, Saito H, Silver P. Regulated nucleo/cytoplasmic exchange of HOG1 MAPK requires the importin beta homologs NMD5 and XPO1. EMBO J. 17: 5606-5614 (1998)
Friedland A, Lu TK, Wang X, Shi D, Church G, Collins JJ. Synthetic gene networks that count. Science 324(5931): 1199-1202 (2009)
Friedman RC, Farh KK, Burge CB, Bartel DP. Most Mammalian mRNAs are conserved targets of microRNAs. Genome Research 19: 92-105 (2009)
Fung E, Wong WW, Suen JK, Bulter T, Lee S, Liao JC. A synthetic gene-metabolic oscillator. Nature 435: 118-122 (2005)
Gerhart J, Kirschner M. The theory of facilitated variation. PNAS 104 Suppl 1: 8582-8589 (2007)
Gertz J, Siggia ED, Cohen BA. Analysis of combinatorial cis-regulation in synthetic and genomic promoters. Nature 457, 215-218 (2009)
Geva-Zatorsky N, Rosenfeld N, Itzkovitz S, Milo R, Sigal A, Dekel E, Yarnitzky T, Liron Y, Polak P, Lahav G, Alon U. Oscillations and variability in the p53 system. Mol Syst Biol 2: 2006.0033 (2006)
Gilbert ES, Walker AW, Keasling JD. A constructed microbial consortium for biodegradation of the organophosphorus insecticide parathion. Appl. Microbiol. Biotechnol. 61: 77-81 (2003)
Golding I, Paulsson J, Zawilski SM, Cox EC. Real-time kinetics of gene activity in individual bacteria. Cell 123, 1025-1036 (2005)
Gore J, Youk H, van Oudenaarden A. Snowdrift game dynamics and facultative cheating in yeast. Nature 459(7244): 253-256 (2009)
Grate D, Wilson C. Inducible regulation of the S. cerevisiae cell cycle mediated by an RNA aptamer-ligand complex. Bioorg Med Chem 9: 2565-2570 (2001)
Gregor T, Wieschaus EF, McGregor AP, Bialek W, Tank DW. Stability and nuclear dynamics of the bicoid morphogen gradient. Cell 130(1): 141-152 (2007)
Grilly C, Stricker J, Pang WL, Bennett MR, Hasty J. A synthetic gene network for tuning protein degradation in Saccharomyces cerevisiae.
Mol Syst Biol 3: 127 (2007)

Grimson A, Srivastava M, Fahey B, Woodcroft BJ, Chiang HR, King N, Degnan BM, Rokhsar DS, Bartel DP. Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature 455: 1193-1197 (2008)

Guet CC, Elowitz MB, Hsing W, Leibler S. Combinatorial synthesis of genetic networks. Science 296, 1466-1470 (2002)

Hammer K, Mijakovic I, Jensen PR. Synthetic promoter libraries - tuning of gene expression. Trends Biotechnol. 24, 53-55 (2006)

Harris K, Lamson RE, Nelson B, Hughes TR, Marton MJ, Roberts CJ, Boone C, Pryciak PM. Role of scaffolds in MAP kinase pathway specificity revealed by custom design of pathway-dedicated signaling proteins. Curr. Biol. 11, 1815-1824 (2001)

Hasty J, Dolnik M, Rottschafer V, Collins JJ. Synthetic gene network for entraining and amplifying cellular oscillations. Phys Rev Lett 88(14): 148101 (2002)

Hersen P, McClean MN, Mahadevan L, Ramanathan S. Signal processing by the HOG MAP kinase pathway. PNAS 105, 7165-7170 (2008)

Hoffman CS. Unit 13.11: Preparation of Yeast DNA. In Current Protocols in Molecular Biology: Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, and Struhl K, eds. John Wiley and Sons (2003)

Hohmann S. Osmotic stress signaling and osmoadaptation in yeasts. Microbiol. Mol. Biol. Rev. 66: 300-372 (2002)

Hooshangi S, Thiberge S, Weiss R. Ultrasensitivity and noise propagation in a synthetic transcriptional cascade. PNAS 102(10): 3581-3586 (2005)

Howard PL, Chia MC, Del Rizzo S, Liu FF, Pawson T. Redirecting tyrosine kinase signaling to an apoptotic caspase pathway through chimeric adaptor proteins. PNAS 100: 11267-11272 (2003)

Isaacs FJ, Hasty J, Collins JJ. Prediction and measurement of an autoregulatory genetic module. PNAS 100(13): 7714-7719 (2003)

Isaacs FJ, Dwyer DJ, Ding C, Pervouchine DD, Cantor CR, Collins JJ. Engineered riboregulators enable post-transcriptional control of gene expression.
Nat Biotech 22, 841-847 (2004)

Isalan M, Lemerle C, Serrano L. Engineering gene networks to emulate Drosophila embryonic pattern formation. PLoS Biol. 3, e64 (2005)

Isalan M, Lemerle C, Michalodimitrakis K, Horn C, Beltrao P, Raineri E, Garriga-Canut M, Serrano L. Evolvability and hierarchy in rewired bacterial gene networks. Nature 452, 840-845 (2008)

Jacob F and Monod J. Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol 3: 318-356 (1961)

Janiak-Spens F, Cook PF, West AH. Kinetic analysis of YPD1-dependent phosphotransfer reactions in the yeast osmoregulatory phosphorelay system. Biochemistry 44: 377-386 (2005)

Janin J, Chothia C. Domains in proteins: definitions, location, and structural principles. Methods Enzymol 115: 420-430 (1985)

John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. Human microRNA targets. PLoS Biology 2(11): e363 (2004)

Kaplan J and DeGrado WF. De novo design of catalytic proteins. PNAS 101, 11566-11570 (2004)

Khosla C, Keasling JD. Metabolic engineering for drug discovery and development. Nature Reviews Drug Discovery 2, 1019-1025 (2003)

Kinhabwala A and Guet CC. Uncovering cis regulatory codes using synthetic promoter shuffling. PLoS One 3, e2030 (2008)

Kitano H. Biological robustness. Nat Rev Genet 5: 826-837 (2004)

Klipp E, Nordlander B, Kruger R, Gennemark P, Hohmann S. Integrative model of the response of yeast to osmotic shock. Nat. Biotechnol. 23: 975-982 (2005)

Kornmann B, Currie E, Collins SR, Schuldiner M, Nunnari J, Weissman JS, Walter P. An ER-mitochondria tethering complex revealed by a synthetic biology screen. Science 325(5939): 477-481 (2009)

Krantz M, Becit E, Hohmann S. Comparative genomics of the HOG-signaling system in fungi. Curr. Genet. 49: 137-151 (2006)

Kwon O, Georgellis D, Lin ECC. Rotational on-off switching of a hybrid membrane sensor kinase Tar-ArcB in Escherichia coli. Journal of Biological Chemistry 278(15): 13192-13195 (2003)

Le MTN, Xie H, et al.
MicroRNA-125b promotes neuronal differentiation in human cells by repressing multiple targets. Mol and Cell Biol 29(19): 5290-5305 (2009)

Levine E, Zhang Z, Kuhlman T, Hwa T. Quantitative characteristics of gene regulation by small RNA. PLoS Biol 5(9): e229 (2007)

Levskaya A, Chevalier AA, Tabor JJ, Simpson ZB, Lavery LA, Levy M, Davidson EA, Scouras A, Ellington AD, Marcotte EM, Voigt CA. Engineering bacteria to see light. Nature 438: 441-442 (2005)

Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120: 15-20 (2005)

Li X, Carthew RW. A microRNA mediates EGF receptor signaling and promotes photoreceptor differentiation in the Drosophila eye. Cell 123(7): 1267-1277 (2005)

Li X, Cassidy JJ, Reinke CA, Fischboeck S, Carthew RW. A microRNA imparts robustness against environmental fluctuation during development. Cell 137(2): 273-282 (2009)

Li Y, Wang F, Lee JA, Gao FB. MicroRNA-9a ensures the precise specification of sensory organ precursors in Drosophila. Genes Dev. 20(20): 2793-2805 (2006)

Lim LP, Lau NC, Weinstein EG, Abdelhakim A, Yekta S, Rhoades MW, Burge CB, Bartel DP. The microRNAs of Caenorhabditis elegans. Genes Dev 17(8): 991-1008 (2003)

Maeda T, Wurgler-Murphy SM, and Saito H. A two-component system that regulates an osmosensing MAP kinase cascade in yeast. Nature 369: 242-245 (1994)

Maeda YT, Sano M. Regulatory dynamics of synthetic gene networks with positive feedback. J Mol Biol 359(4): 1107-1124 (2006)

Makeyev EV, Zhang J, Carrasco MA, Maniatis T. The microRNA miR-124 promotes neuronal differentiation by triggering brain-specific alternative pre-mRNA splicing. Mol. Cell 27(3): 435-448 (2007)

Martin CH, Nielsen DR, Solomon KV, Prather KL. Synthetic metabolism: engineering biology at the protein and pathway scales. Chemistry and Biology 16, 277-286 (2009)

Mattick JS. RNA regulation: a new genetics? Nat Rev Genet.
5, 316-323 (2004)

McClean MN, Mody A, Broach JR, Ramanathan S. Cross-talk and decision making in MAP kinase pathways. Nat Genet 39, 409-414 (2007)

Mettetal JT, Muzzey D, Gomez-Uribe C, van Oudenaarden A. The frequency dependence of osmo-adaptation in Saccharomyces cerevisiae. Science 319, 482-484 (2008)

Moriya H, Shimizu-Yoshida Y, Kitano H. In vivo robustness analysis of cell division cycle genes in Saccharomyces cerevisiae. PLoS Genet. 2(7): e111. doi:10.1371/journal.pgen.0020111 (2006)

Muzzey D, Gomez-Uribe C, Mettetal J, van Oudenaarden A. A systems-level analysis of perfect adaptation in yeast osmoregulation. Cell 138: 160-171 (2009)

Nelson DE, et al. Oscillations in NF-kB signaling control the dynamics of gene expression. Science 306(5696): 704-708 (2004)

Odom DT, Dowell RD, Jacobsen ES, Gordon W, Danford TW, MacIsaac KD, Rolfe PA, Conboy CM, Gifford DK, Fraenkel E. Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nat. Genet. 39: 730-732 (2007)

O'Rourke S, Herskowitz I. Unique and redundant roles for HOG MAPK pathway components as revealed by whole-genome expression analysis. Mol. Biol. Cell 15: 532-542 (2004)

Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A. Regulation of noise in the expression of a single gene. Nature Genetics 31, 69-73 (2002)

Ozbudak E, Thattai M, Lim HN, Shraiman BI, van Oudenaarden A. Multistability in the lactose utilization network of Escherichia coli. Nature 427(6976): 737-740 (2004)

Pedraza J, van Oudenaarden A. Noise propagation in gene networks. Science 307, 1965-1968 (2005)

Pokharel S and Beal PA. High-throughput screening for functional adenosine to inosine RNA editing systems. ACS Chem Biol 1, 761-765 (2006)

Pomerening JR, Sontag ED, Ferrell JE. Building a cell cycle oscillator: hysteresis and bistability in the activation of Cdc2. Nat Cell Biol 5(4): 346-351 (2003)

Pomerening JR, Kim SY, Ferrell JE.
Systems-level dissection of the cell-cycle oscillator: bypassing positive feedback produces damped oscillations. Cell 122(4): 565-578 (2005)

Posas F, Wurgler-Murphy SM, Witten EA, Thai TC and Saito H. Yeast HOG1 MAP kinase cascade is regulated by a multistep phosphorelay mechanism in the SLN1-YPD1-SSK1 "two-component" osmosensor. Cell 86: 865-875 (1996)

Proft M, Struhl K. MAP Kinase-mediated stress relief that precedes and regulates the timing of transcriptional induction. Cell 118, 351-361 (2004)

Prud'homme BP, Gompel N, Carroll SB. Emerging principles of regulatory evolution. Proc. Natl. Acad. Sci. USA 104 Suppl 1: 8605-8612 (2007)

Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. Stochastic mRNA synthesis in mammalian cells. PLoS Biology 4, e309 (2006)

Raj A and van Oudenaarden A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135: 216-226 (2008)

Rajendran M, Ellington AD. Selection of fluorescent aptamer beacons that light up in the presence of zinc. Anal. Bioanal. Chem. 390, 1067-1075 (2008)

Rapp M, Seppala S, Granseth E, von Heijne G. Emulating membrane protein evolution by rational design. Science 315, 1282-1284 (2007)

Remenyi A, Good MC, Lim WA. Docking interactions in protein kinase and phosphatase networks. Curr Opin in Struct Biol 16: 676-685 (2006)

Ro D, et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature 440, 940-943 (2006)

Rosenfeld N, Elowitz MB, Alon U. Negative autoregulation speeds the response times of transcription networks. J Mol Biol 323: 785-793 (2002)

Rosenfeld N, Young JW, Alon U, Swain PS, Elowitz MB. Gene regulation at the single-cell level. Science 307(5717): 1962-1965 (2005)

Rothlisberger D, Khersonsky O, Wollacott AM, Jian L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O, Albeck S, Houk KN, Tawfik DS, Baker D. Kemp elimination catalysts by computational enzyme design. Nature 453, 190-195 (2008)

Segal E, Widom J.
From DNA sequence to transcriptional behaviour: a quantitative approach. Nat. Rev. Genetics 10, 443-456 (2009)

Selbach M, Schwanhausser B, Thierfelder N, Fang Z, Khanin R, Rajewsky N. Widespread changes in protein synthesis induced by microRNAs. Nature 455: 58-63 (2008)

Setty Y, Mayo AE, Surette MG, Alon U. Detailed map of a cis-regulatory input function. PNAS 100, 7702-7707 (2003)

Shimizu-Sato S, Huq E, Tepperman JM, Quail PH. A light-switchable gene promoter system. Nat Biotech 20: 1041-1044 (2002)

Shinar G, Milo R, Martinez R, Alon U. Input-output robustness in simple bacterial signaling systems. Proc. Natl. Acad. Sci. USA 104: 19931-19935 (2007)

Skerker JM, Perchuk BS, Siryaporn A, Lubin EA, Ashenberg O, Goulian M, Laub MT. Rewiring the specificity of two component signal transduction systems. Cell 133: 1043-1054 (2008)

Sluijter JPG, van Mil A, van Vliet P, Metz CGH, Liu J, Doevendans PA, Goumans M-J. MicroRNA-1 and -499 regulate differentiation and proliferation in human-derived cardiomyocyte progenitor cells. Arteriosclerosis, Thrombosis, and Vascular Biology (2010)

Song G, Zhang Y, Wang L. MicroRNA-206 targets notch3, activates apoptosis, and inhibits tumor cell migration and focus formation. J. Biol. Chem. 284: 31921-31927 (2009)

Sood P, Krek A, Zavolan M, Macino G, Rajewsky N. Cell-type-specific signatures of microRNAs on target mRNA expression. Proc Natl Acad Sci 103(8): 2746-2751 (2006)

Stark A, Brennecke J, Bushati N, Russell RB, Cohen SM. Animal microRNAs confer robustness to gene expression and have a significant impact on 3' UTR evolution. Cell 123(6): 1133-1146 (2005)

Steen EJ, Chan R, Prasad N, Myers S, Petzold CJ, Redding A, Ouellet M, Keasling JD. Metabolic engineering of Saccharomyces cerevisiae for the production of n-butanol. Microb Cell Fact 7: 36 (2008)

Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J. A fast, robust and tunable synthetic gene oscillator.
Nature 456(7221): 516-519 (2008)

Suess B, Fink B, Berens C, Stentz R, Hillen WA. A theophylline responsive riboswitch based on helix slipping controls gene expression in vivo. Nuc Acids Res 32: 1610-1614 (2004)

Swinburne IA, Miguez DG, Landgraf D, Silver PA. Intron length increases oscillatory periods of gene expression in animal cells. Genes Dev 22(17): 2342-2346 (2008)

Tanay A, Regev A, Shamir R. Conservation and evolvability in regulatory network: The evolution of ribosomal regulation in yeast. Proc. Natl. Acad. Sci. USA 102: 7203-7208 (2005)

Tanouchi Y, Pai A, You L. Decoding biological principles using gene circuits. Mol. BioSyst. 5: 695-703 (2009)

Tatebayashi K, Takekawa M, Saito H. A docking site determining specificity of Pbs2 MAPKK for Ssk2/Ssk22 MAPKKKs in the yeast HOG pathway. EMBO J 22, 3624-3634 (2003)

Taylor RJ, Falconnet D, Niemisto A, Ramsey SA, Prinz S, Shmulevich I, Galitski T, Hansen CL. Dynamic analysis of MAPK signaling using a high-throughput microfluidic single-cell imaging platform. PNAS 106, 3758-3763 (2009)

Tigges M, Marquez-Lago TT, Stelling J, Fussenegger M. A tunable synthetic mammalian oscillator. Nature 457(7227): 309-312 (2009)

Tsai TY, Choi YS, Ma W, Pomerening JR, Tang C, Ferrell JE. Robust, tunable biological oscillations from interlinked positive and negative feedback loops. Science 321(5885): 126-129 (2008)

Tsang J, Zhu J, van Oudenaarden A. MicroRNA-mediated feedback and feedforward loops are recurrent network motifs in mammals. Mol Cell 26(5): 753-767 (2007)

Tsong AE, Tuch BB, Li H, Johnson AD. Evolution of alternative transcriptional circuits with identical logic. Nature 443, 415-420 (2006)

Ubersax JA, Ferrell JE. Mechanisms of specificity in protein phosphorylation. Nature Reviews Molecular Cell Biology 8: 530-541 (2007)

von Dassow G, Munro EM, Odell GM. The segment polarity network is a robust developmental module. Nature 406: 188-192 (2000)

Wagner A. Robustness and evolvability in living systems.
Princeton University Press (2007)

Waks Z and Silver PA. Engineering a synthetic dual organism system for hydrogen production. Applied and Environmental Microbiology 75: 1867-1875 (2009)

Werstuck G, Green MR. Controlling gene expression in living cells through small molecule-RNA interactions. Science 282: 296-298 (1998)

Win MN, Smolke CD. Higher-order cellular information processing with synthetic RNA devices. Science 322, 456-460 (2008)

Yeh BJ, Rutigliano RJ, Deb A, Bar-Sagi D, Lim WA. Rewiring cellular morphology pathways with synthetic guanine nucleotide exchange factors. Nature 447: 596-600 (2007)

Yi R, Poy MN, Stoffel M, Fuchs E. A skin microRNA promotes differentiation by repressing 'stemness'. Nature 452: 225-229 (2008)

You L, Cox RS III, Weiss R, Arnold FH. Programmed population control by cell-cell communication and regulated killing. Nature 428(6985): 868-871 (2004)

Young DD and Deiters A. Photochemical control of biological processes. Org Biomol Chem 5: 999-1005 (2007)