Mitigating the Effects of Ribosome Limitations on Synthetic Circuits via High-Gain sRNA-mediated Negative Feedback

by John Elias Yazbek

B.S., Massachusetts Institute of Technology (2013)

Submitted to the Department of Biological Engineering in partial fulfillment of the requirements for the degree of Master of Science in Biological Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2015

© Massachusetts Institute of Technology 2015. All rights reserved.

Author: Signature redacted, Department of Biological Engineering, May 8, 2015
Certified by: Signature redacted, Domitilla Del Vecchio, Associate Professor, Thesis Supervisor
Certified by: Signature redacted, Timothy K. Lu, Associate Professor, Thesis Supervisor
Accepted by: Signature redacted, Forest White, Associate Professor of Biological Engineering, Chair of Graduate Program

Mitigating the Effects of Ribosome Limitations on Synthetic Circuits via High-Gain sRNA-mediated Negative Feedback

by John Elias Yazbek

Submitted to the Department of Biological Engineering on May 8, 2015, in partial fulfillment of the requirements for the degree of Master of Science in Biological Engineering

Abstract

Resource limitations in bacterial cells can present significant hurdles that preclude correct synthetic circuit behavior. In a simple circuit with one constitutively expressed protein and one protein whose expression is inducible, it has been shown that inducing the expression of the second protein causes a significant decrease in the level of the first. In this thesis, we explore the possibility of reducing the effects of resource limitations by adding a high-gain negative feedback loop to one of the circuits. The loop includes an sRNA construct. We explore different implementations of this circuit and model them mechanistically. Furthermore, we begin physically implementing one of the circuit designs by testing intermediate constructs.
Finally, we also explore the hypothesis that exogenous circuits on plasmids compete for a pool of resources that is spatially separated from the resources that the genome utilizes. Through our work, we show results that support the spatial separation hypothesis.

Thesis Supervisor: Domitilla Del Vecchio
Title: Associate Professor

Thesis Supervisor: Timothy K. Lu
Title: Associate Professor

Acknowledgments

Over time, I have come to appreciate the importance of community in a person's life. As I think of whom to acknowledge in this thesis, my mind immediately drifts to all of the communities that I have come across and that have influenced who I am today. First and foremost, I would like to acknowledge and thank the first community that I ever came across: my family. My parents sacrificed so much for us to live a comfortable life and to get the best education possible. I can only aspire to emulate them. Moreover, I would like to thank my siblings for always being supportive of who I am and what I do. I consider myself lucky to have shared a big part of my life with them. Furthermore, I would like to thank the communities that I have come across here at MIT and who became my family away from home. Whether it's my close friends group, the Number Six Club, the Graduate Students Council or any other MIT group I was part of, I always appreciated the opportunity to be around people who inspired me daily with their dreams and aspirations and encouraged me to dream of my own. As an undergraduate student and a graduate student, I have come across brilliant people who were mentors and role models. Prof. Del Vecchio is one of the brightest yet most humble people I have ever met and the best graduate supervisor I could have asked for. I can never be thankful enough for her kind actions and understanding when I was sick during my first year as a graduate student. Prof.
Tim Lu was an excellent undergraduate research supervisor and I am forever grateful to him for giving me the opportunity to work in his lab. Also, I would like to thank the Del Vecchio lab and the Lu lab for their guidance during my research career. Dr. Piro Siuti, whom I worked with in the Lu lab, is the big brother that I never had and the best mentor I could have asked for. He is one of the kindest and nicest people that I have met at MIT and has left a big impression on my life. Furthermore, I would like to thank Andras Gyorgy, whom I worked with on the resource allocation project, and Hsin-Ho Huang, who was a great resource through my time at the Del Vecchio lab. Also, I would like to acknowledge Amar Ghodasara from the Voigt lab for allowing us to use his sRNA system in my project. Also, I would like to thank Prof. Wittrup for being a great undergraduate academic adviser and mentor and Dr. Shorn Goel for helping me start my career in research. Finally, this work wouldn't have been possible without the NIH P50 GM098792 grant.

Contents

1 Introduction and Background
  1.1 Synthetic Bio Introduction and Background
  1.2 Context Dependence in Synthetic Biology
    1.2.1 General Problem
    1.2.2 Compositional Context
    1.2.3 Environment Context
    1.2.4 Host Context
    1.2.5 Resource Allocation Problems in Synthetic Biology
  1.3 Small Regulatory RNAs in Bacteria
  1.4 Synthetic sRNA System
  1.5 Gene Knockdown and Integration in K-12 E. coli Cells
  1.6 DIAL strains
  1.7 Feedback and Its Properties
  1.8 Downfalls of Direct Negative Feedback
  1.9 Thesis Question
2 System to be built
  2.1 General System
  2.2 Iterations to arrive at the final design
    2.2.1 Design 1: GFP-TF fusion
    2.2.2 Design 2: Bicistronic Production of GFP and Transcription Factor
    2.2.3 Design 3: T7 Split Polymerase
  2.3 Final Design Choice
  2.4 Final Design Mechanistic Model
    2.4.1 Reactions
    2.4.2 Parameters
    2.4.3 Model Results
  2.5 Conclusions
3 Initial Testing and Test Constructs
  3.1 Methods
  3.2 GL construct
    3.2.1 Description
    3.2.2 Characterizing the construct
    3.2.3 Testing the construct
  3.3 RV2 construct
  3.4 S-test construct
  3.5 SV2 construct
  3.6 Conclusions
4 Constructs to be built
  4.1 sRNA plasmid
  4.2 GFP-RFP plasmid
  4.3 Testing and Results
    4.3.1 Effects of RFP induction on GFP
5 Resource Localization in Bacteria
  5.1 Problem Statement and Hypothesis
  5.2 Chromosomal Integration of GFP
  5.3 gapA-RFP
    5.3.1 Testing and Results
  5.4 TetR construct
    5.4.1 Testing and Results
  5.5 Conclusions
6 Conclusions and Future Directions

List of Figures

1-1 Trans Acting sRNA [1]
1-2 Gene knockout procedure [2]
1-3 pKD20 plasmid [2]
2-1 General system
2-2 GFP-Transcription factor fusion
2-3 GFP-Transcription factor fusion simulation results
2-4 Bicistronic Production of GFP and Transcription factor
2-5 Bicistronic Production Simulation Results
2-6 The Effects of Changing the cooperativity
2-7 Design 3
2-8 RFP production
2-9 GFP production
2-10 Low G
2-11 Medium G
2-12 High G
3-1 GL plasmid
3-2 Fluorescence Results of GN, GL, GS
3-3 Fluorescence results of GL in pZE with different levels of aTc
3-4 RV2 plasmid
3-5 GFP fluorescence of GL+RV2 plasmids
3-6 RFP fluorescence of GL+RV2 plasmids
3-7 RFP fluorescence of GL+RV2 plasmids in extended exponential growth
3-8 S-test plasmid
3-9 SV2 plasmid
3-10 Effect of AHL induction of the sRNA on GFP fluorescence
4-1 GFP-RFP plasmid
5-1 Linear DNA piece
5-2 Gel electrophoresis results of digestion reaction
5-3 Electroporation colonies plated on different antibiotic containing plates
5-4 Gel electrophoresis results of the PCR verification reactions
5-5 gapA-RFP plasmid
5-6 GFP and RFP fluorescence of gapA-RFP plasmid with chromosomal GFP
5-7 Results of aTc inducing tetR inhibited chromosomally integrated GFP

List of Tables

2.1 Mechanistic Model Parameters

Chapter 1

Introduction and Background

1.1 Synthetic Bio Introduction and Background

Synthetic Biology is an emerging discipline whose ultimate goal is to allow scientists to engineer cells with improved natural functions or completely novel ones. The discipline brings together the expertise of various well-established fields such as computer science, electrical engineering, mechanical engineering and chemical engineering, and applies that knowledge to the novel context of biological design. The standard synthetic biological design is composed of a circuit that takes in certain events as inputs and results in the desired output.
The inputs could be small molecule inducers [3], light [4], or changes in environmental or cellular conditions [5]. Moreover, the circuits are usually composed of standard characterized parts. There have been significant efforts to create a global directory that stores information about each part as well as the actual physical implementation of that part. The most commonly used one is the iGEM Registry of Standard Biological Parts [6]. While the most common output used is a reporter protein that is easily quantifiable, the output of those synthetic circuits could also be drug production [7], phenotypic changes [8], etc.

Most synthetic circuits are built on a circular plasmid which is termed a vector. A vector usually contains an origin of replication and an antibiotic resistance gene. The origin of replication is recognized by the cell's endogenous DNA replication machinery and dictates the copy number, that is, the concentration of the plasmid that is present in the cell. The resistance gene confers immunity to a certain antibiotic so that when the bacteria are transformed with a plasmid, cells that have taken up the plasmid can be selected for by adding the antibiotic, which kills all the cells that do not contain the plasmid. Moreover, the most common chassis used for those circuits is Escherichia coli (E. coli). This is one of the most studied model organisms and is fairly robust. Circuits on plasmids are transformed into the E. coli cells via heat shock or electroporation.

In its nascent years, synthetic biology has seen the creation of various circuits and circuit elements that mimic the behavior of numerous well-established electrical engineering circuit elements. The most common ones have been logic gates. Logic gates are circuits that take in multiple inputs and compute an output based on those inputs [9]. For example, a 2-input AND gate gives a specified outcome only if both inputs are present.
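To make the notion of a two-input gate concrete, the following minimal Python sketch enumerates every possible truth table over two binary inputs. Because each of the four input combinations can map to 0 or 1, there are 2^4 = 16 such gates, and AND is one of them. The function names are illustrative only and do not correspond to any biological part or published design.

```python
from itertools import product

# The four possible combinations of two binary inputs.
INPUTS = list(product([0, 1], repeat=2))  # (0,0), (0,1), (1,0), (1,1)


def all_two_input_gates():
    """Return every truth table that maps the four input pairs to 0 or 1."""
    return [
        dict(zip(INPUTS, outputs))
        for outputs in product([0, 1], repeat=len(INPUTS))
    ]


def and_gate(a, b):
    """Output is 1 only when both inputs are present."""
    return int(a and b)


gates = all_two_input_gates()
and_table = {pair: and_gate(*pair) for pair in INPUTS}

if __name__ == "__main__":
    print(len(gates), "two-input gates in total")
    print("AND truth table:", and_table)
```

In a biological setting the inputs would be inducer presence or absence and the output a reporter level; the sketch only captures the Boolean abstraction.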
The Lu lab at MIT showed that it is possible to construct all 16 two-input Boolean logic gates by varying the combination and arrangement of a promoter, a reporter gene and a terminator. Additionally, one of the more famous implementations of synthetic biology has been the toggle switch that was developed by the Collins lab [10]. In their design, they constructed a circuit composed of two repressors, where each repressor inhibited the expression of the other. By using specific inducers that bind the repressors and prevent them from blocking the promoter, they were able to toggle between two different states. Furthermore, scientists in the Elowitz lab expanded this design to create a three-inhibitor cycle that acts as an oscillator [11].

In the long run, synthetic biology aims to revolutionize various areas in the therapeutic, diagnostic, biotechnological and environmental fields. Within the therapeutics domain, synthetic biology holds the promise of rationally designing cells that could find, attack and kill cancer cells [12]. Moreover, scientists are also engineering human cells whose main function is to correct cellular defects that often result in disease. For instance, the Weiss lab is working on engineering stem cells that give rise to insulin-producing beta cells that aim to cure diabetes [13]. Additionally, synthetic bacteria could be used as diagnostic tools to detect certain antigens or chemicals in the body. Those detectors could also be used in the environment to detect toxins and even respond to those environmental triggers [14]. Furthermore, synthetic biology could be applied to more complex organisms such as plants in order to engineer smart plants that are robust to changes in the environment and can respond accordingly.
Finally, on the biotechnological side, cells in bioreactors can be engineered via synthetic biology to produce more products at the right times, to be more robust to the reactor conditions, or to make biofuel production more efficient [15].

While the promise of synthetic biology is potentially revolutionary, we are still a long way from seeing concrete practical applications. Most of the focus currently is on generating proof-of-principle circuits and circuit elements and addressing some of the key issues that arise when dealing with a system as complex as a cell. Moreover, as the repository of available circuit elements continues to expand, larger and more complex circuits can be designed. However, this leads to novel problems that need to be addressed in order to build properly functioning systems. While a breadboard that is used to prototype circuits in electrical engineering is a standard design that shows little variability across different boards, no two cells are ever the same. Cells have varying amounts of proteins and varying sizes and volumes. Therefore, in addition to characterizing a circuit and addressing all the problems that are related to the circuit elements, scientists need to be aware of the additional problems that the host introduces.

1.2 Context Dependence in Synthetic Biology

1.2.1 General Problem

Cellular growth and behavior are highly dependent on the context in which the cells are growing. This could be the medium in which the cells are found and its chemical composition, as well as the concentrations at which its components are present and how long the cells are incubated in it [16]. It could also be the temperature of the environment, the amount of light the cells are getting or the presence of certain toxins or antibiotics [17]. Additionally, cells can sense the concentration of cells around them and that could impact their growth rates as well as other bacterial activities such as biofilm formation [18].
In summary, the list of contextual factors that can influence cellular behavior is very large and this creates a whole set of novel problems in synthetic biology. While bacteria and other organisms have evolved over millions of years to optimize their endogenous circuits and networks in response to these context problems, heterologous synthetic circuits are often designed without taking the context into consideration. As synthetic circuits get more complex and require more parts than ever, these contextual problems emerge as critical issues that need to be addressed in order to achieve the desired system behavior. These issues were summarized in the review published by Cardinale and Arkin titled Contextualizing context for synthetic biology - identifying causes of failure of synthetic biological systems [19]. In this review, the authors divided the sources of context dependencies into three major groupings: 1) Compositional Context, 2) Environmental Context, 3) Host Context.

1.2.2 Compositional Context

Generally, synthetic designs consist of parts that, when grouped together, result in a particular behavior in the cell. These parts are usually DNA sequences that are themselves effectors (such as promoters or ribosome binding sites), are transcribed into RNA with cellular functions (such as RNAi or sRNA), or are eventually translated into effector proteins. These DNA sequences are arranged on one or several plasmids or DNA molecules in a specific order. This gives rise to several compositional context problems. First, the DNA sequences which have been studied and characterized in isolation are grouped together in a novel arrangement. This could lead to certain parts interfering with one another. For example, the regulatory sequences of a promoter could end up getting transcribed on the messenger RNA and affecting its translation. This problem was addressed by adding a ribozyme sequence that cuts off 5' sequences from the messenger RNA [20].
Furthermore, the binding of a transcription factor to a promoter could be highly susceptible to the surrounding DNA sequences, as they might alter its binding affinity. Additionally, the ordering of the parts has a significant effect on circuit behavior. In a recent publication, Cox et al. [21] showed that changing the ordering of promoter elements could result in a wide range of behaviors.

1.2.3 Environment Context

The environment in which the heterologous circuit is inserted could also generate many issues that interfere with the proper functioning of the circuit. For instance, cells are directly affected by the media or the external physical environment in which they are growing. Temperature is an important factor here. Many promoters operate in a narrow temperature range, and this has been used extensively to control gene expression by placing genes under the regulation of a temperature sensitive promoter or temperature sensitive expression systems [22]. However, this can be a problem when uncontrolled temperature fluctuations affect the promoter function. Additionally, the contents of the media could have a significant impact. For instance, certain promoters such as pBAD only induce expression in the presence of an inducer and in the absence of glucose in the media. Moreover, the environment context problems could result from the host cell itself. For instance, cell volume and shape as well as the concentrations of metabolites and other resources vary greatly depending on the cell growth phase as well as other factors. These changes could significantly alter the conditions for which the synthetic circuit was designed. Furthermore, additional problems arise when certain circuits, which are characterized well in isolation, are connected to a downstream circuit.
These problems are termed loading effects: the downstream binding of a circuit element causes unwanted upstream effects such as delays or, in the specific case when the upstream system is an oscillator, loss of oscillations \cite{del2015modularity}.

1.2.4 Host Context

Finally, the third group of contextual problems is related to all that is encompassed within the cell wall. When the exogenous synthetic circuit is inserted into the cell, the circuit elements are subject to both favorable interactions with cellular contents, such as interactions with RNA polymerases (RNAP) and ribosomes that are needed for effector protein expression, as well as non-favorable interactions such as the binding of endogenous proteins to the circuit. Moreover, effector proteins can bind to endogenous sites, which in turn causes load effects on the circuit. This phenomenon is termed retroactivity and has been shown to be a major cause of circuit failure [23]. For instance, when one component of an oscillator has other binding sites within the cell, the resulting load effects could preclude oscillations from taking place [24]. Furthermore, even the favorable interactions can generate context problems. When exogenous circuits are inserted into cells, they share a limited set of resources such as RNAP and ribosomes. While the cells have evolved to optimize their endogenous circuits to share those resources efficiently, resource allocation problems are not taken into account when designing synthetic circuits.

1.2.5 Resource Allocation Problems in Synthetic Biology

The problem of resource allocation in synthetic biology has been characterized experimentally and documented in several journal articles \cite{gyorgy2014limitations} [25] [26]. In the article A framework and model system to investigate linear system behavior in E. coli [27], the authors sought to study the coupling of synthetic circuits with each other and with the host system that resulted in non-linearities in behavior.
The authors make use of what they termed DNA devices, which are simple synthetic circuits consisting of a promoter, ribosome binding site, gene of interest and transcriptional terminator. Linear behavior occurs when the output of two devices present at the same time is simply the sum of the outputs observed for each device when it is present alone in the system. The output measured here is the concentration of RNA produced by the device. The authors then plotted the RNA levels versus the plasmid copy number (DNA level) of each device for different combinations of devices to probe whether linear behavior was observed or not. They found that linear behavior was observed when two devices (one expressing nptII and the other expressing cat) were present in the cell. However, when a Green Fluorescent Protein (GFP) device was added to make a total of three devices, the authors observed non-linear behavior for the GFP expression. When the authors analyzed the degradation rates of gfp and cat transcripts, they found that there was no appreciable decrease in those rates. This meant that the non-linearity resulted from the modulation of the synthesis rates. Therefore, although the three devices were not directly connected, they greatly influenced each other, especially at higher DNA copy numbers. The authors hypothesized that these non-linearities could be dealt with by exploring the effects of promoter strength and ribosome binding site (RBS) strength.

Consequently, Jimenez et al. [28] from the Del Vecchio lab built on those results and began exploring the effects of promoter and RBS strengths on non-linearities resulting from resource limitations. The system they used to do this consisted of two circuits that were inserted into E. coli. The first circuit contained green fluorescent protein (GFP) under the control of a constitutive promoter. The second contained red fluorescent protein (RFP) under the control of a lux promoter.
The lux promoter was induced when N-Acyl Homoserine Lactone (AHL) was added to the growth media. AHL binds to LuxR and turns on the lux promoter. Therefore, the two circuits were not directly connected. When the cells were grown in the absence of AHL, they expressed GFP. When the cells were induced by AHL, they expressed both GFP and RFP. However, the level of expressed GFP dropped below that observed when AHL was absent. Moreover, the more AHL was added, the higher the steady state level of RFP was and the lower that of GFP became. Therefore, while the two circuits were supposed to be independent, the experimental results revealed that there was an indirect coupling between the two. This brings us back to the non-linearities that were observed in the Hajimorad paper. Moreover, this coupling is a result of the limited resources in the cell that are shared between the two circuits. When only GFP is being produced, it shares resources such as RNA polymerase (RNAP) and ribosomes only with the endogenous circuits. However, in the presence of an active second circuit, some of these resources are diverted away from the GFP circuit, resulting in decreased expression. An extensive theoretical analysis of the system revealed that the main bottleneck here was the ribosome limitation. Moreover, simulations suggested that changing certain system parameters could restore robustness to the system and decouple the two circuits. Hence, the expression of GFP would not be affected by that of RFP. When the authors plotted the mean fluorescence values of GFP versus RFP, they observed a straight line with a negative slope. This line was termed the isocost line. This term is extensively used in microeconomics to describe how to properly allocate a limited budget between two items to be purchased.
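The straight-line trade-off between GFP and RFP can be illustrated with a toy shared-resource model. The sketch below is only an assumption-laden illustration and is not the model used in the cited work: each circuit is assumed to obtain a share of a fixed resource pool in proportion to a dimensionless "demand", and the names `demand_gfp`, `demand_rfp`, `gfp_max` and `rfp_max` are hypothetical. Under these assumptions, sweeping the RFP demand (mimicking increasing AHL) traces out points that fall on a line with a constant negative slope.

```python
def shared_pool_expression(demand_gfp, demand_rfp, gfp_max=1.0, rfp_max=1.0):
    """Toy model: each circuit's output is its share of a fixed resource pool.

    The constant 1.0 in the denominator stands in for the demand of the
    host's own genes (an assumption made for illustration).
    """
    total = 1.0 + demand_gfp + demand_rfp
    gfp = gfp_max * demand_gfp / total
    rfp = rfp_max * demand_rfp / total
    return gfp, rfp


def isocost_points(demand_gfp, rfp_demands):
    """Sweep the RFP demand while keeping the GFP circuit unchanged."""
    return [shared_pool_expression(demand_gfp, d) for d in rfp_demands]


# Hypothetical demand values chosen only to show the shape of the curve.
points = isocost_points(2.0, [0.0, 0.5, 1.0, 2.0, 4.0])

# Slope between consecutive (RFP, GFP) points; a constant negative value
# means the points lie on a straight line with negative slope.
slopes = [
    (g2 - g1) / (r2 - r1)
    for (g1, r1), (g2, r2) in zip(points, points[1:])
]

if __name__ == "__main__":
    for gfp, rfp in points:
        print(f"RFP = {rfp:.3f}, GFP = {gfp:.3f}")
    print("slopes:", [round(s, 6) for s in slopes])
```

In this toy model GFP falls linearly as RFP rises. How the slope depends on promoter and RBS strength is a property of the detailed model in the cited work and is not captured by this sketch.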
Mathematical analysis of the system revealed that it can be made more robust to resource limitations by either increasing the binding affinity of the ribosome binding site (RBS) or increasing the promoter strength (by increasing its affinity to the RNAP). This result corroborates Hajimorad's hypothesis that these two parameters are important in decoupling the two circuits. These predictions were further validated via experimentation. Plotting the isocost lines for various promoter strengths and various RBS affinities showed that stronger promoters and stronger RBS affinities resulted in a flatter isocost line. This meant that GFP expression was less sensitive to RFP expression. Therefore, the effects of resource limitations can be mitigated by rational system design and choosing the right parts. In this work, we aim to further investigate whether it is also possible to mitigate this problem by altering the system architecture instead.

1.3 Small Regulatory RNAs in Bacteria

Regulation of gene expression is one of the most studied fields of biology. By modulating the levels of different proteins or cellular elements, organisms can properly respond to exogenous and endogenous events in a manner that ensures survival. Over the past century, a great deal of research has been done on transcription factor mediated regulation in bacteria. However, there has been a recent surge in interest in other modes of regulation such as RNA mediated regulation. This is a direct result of recent technological advances such as deep sequencing, which enabled scientists to sequence and annotate whole genomes of organisms and also to quantify and sequence the RNA present in cells under different conditions. RNA mediated regulation is accomplished by relatively short single stranded RNAs whose sizes range between 50 and 300 nucleotides. Small RNAs, or sRNA, in bacteria have been recognized for their various roles in the cells.
For instance, they help regulate plasmid copy number and have been shown to block ColE1 plasmid replication through the action of RNA I [29]. Moreover, they have also been found to control the transposition of insertion elements [30]. They also control the cellular responses to external changes in nutrient levels. For example, the E. coli response to a decline in iron levels is established via sRNA action [31]. The mechanisms of action of sRNA are very diverse, and sRNAs can be classified into different sets based on them. The three classes of sRNA that have been extensively studied are cis sRNA, trans sRNA, and CRISPR sRNA.

Cis sRNAs are encoded on the antisense strand of their target and bind in perfect complementarity with the target mRNA [32]. While the sRNA and the mRNA are located at the same DNA location, they are expressed as separate elements, with the mRNA being expressed in the sense direction and the sRNA in the antisense direction. The initial interaction between the cis sRNA and the mRNA occurs through the complementarity of a few base pairs and some structural elements. The sRNA first recognizes its target through rapid and strong interactions between the nucleotides that are exposed in the stem-loop of the sRNA, the target or both [33]. Moreover, the general role of cis sRNA is to modulate the levels of the cellular elements, such as plasmids or transposons, from which they are expressed.

In addition to cis sRNA, clustered regularly interspaced short palindromic repeats (CRISPR) RNAs are another class of sRNAs that have been recently discovered and are now receiving a lot of attention. These RNAs consist of several repeats that are separated by variable sequences that are termed spacers [34]. These RNAs provide resistance to bacteriophages and are regarded as a form of immunity to foreign DNA. The CRISPR system is reminiscent of the siRNA system that has been discovered in mammalian cells.
siRNA serves to regulate the expression of certain genes in mammalian cells, a process that is often referred to as gene silencing.

A third class of sRNA consists of trans acting RNAs. These RNAs are expressed at different loci than their targets and have their own promoter and terminator. They bind to their targets by imperfect base-pairing to a short segment on the target [33]. The majority of trans sRNA are single stranded and consist of three functional regions. The 5' initial region is called the seed region and is the site that is complementary to the target mRNA. It has also been shown that these seed regions are sufficient to guide target binding. The second region is the Hfq protein binding site. Hfq is the chaperone protein that assists the sRNA in binding to its target and protects it from degradation. Finally, the third region is a stem-loop at the 3' end that acts as a transcriptional terminator and also helps protect the sRNA from the action of endonucleases. The seed region binds in non-perfect complementarity to its target mRNA. This means that one sRNA molecule can bind different targets. Through this multiple target interaction, sRNA have been shown to regulate not just one gene but whole networks. For instance, low iron environments can trigger the expression of the sRNA RyhB, which in turn down-regulates the expression of iron-storage and iron-using proteins [35].

Figure 1-1: Trans Acting sRNA [1]. (Diagram labels: sRNA repressor gene, target gene, target mRNA.) The figure shows the target mRNA being transcribed at a rate of $\alpha_m$ and the sRNA at a rate of $\alpha_s$. The lumped parameter k describes the binding of sRNA to the mRNA as well as the subsequent degradation of both.

The chaperone protein Hfq is critical for the proper functioning of the trans-acting sRNA. Hfq usually binds to RNA sites with a short stretch of uridines and adenosines that are located before or after a stem-loop [36].
Moreover, Hfq facilitates the sRNA-mRNA interaction and enables the formation of the bound complex that prevents the ribosome from binding and inhibits translation. In addition, sRNAs have been shown to be more stable in the presence of Hfq, and it is hypothesized that Hfq shelters the sRNA from degradation. However, once bound to the target mRNA, Hfq recruits the RNA degradation machinery through interactions with RNase E, as shown in Figure 1-1 [32].

Regulation by trans-acting sRNA has been shown to produce novel expression profiles. The direct effect of sRNA binding to a target mRNA is to reduce expression from the mRNA by sequestering it and preventing ribosomes from binding to it. Moreover, the binding of the Hfq-sRNA complex to the target increases the rate of degradation of both molecules. The mRNA suppression is therefore non-catalytic in nature, since the target and the sRNA are degraded in a one-to-one ratio [37]. In other words, one sRNA molecule is consumed for every target mRNA that is degraded.

This mechanism results in a threshold-linear response. Below a threshold rate of target mRNA transcription, gene expression is repressed. Above this threshold, expression increases linearly. The theoretical work from the Hwa lab explores this mode of action. In their models of sRNA repression, they consider the case where the transcription rates of the mRNA and the sRNA differ. If the rate of sRNA expression is greater than the rate of mRNA expression, then any mRNA that is expressed will be degraded. As the mRNA expression rate is increased, however, a point is reached where both rates are equal, and that is the threshold. Beyond this level, the sRNA is not sufficient to repress all of the mRNA produced, and the leftover mRNA will be expressed. As the target mRNA expression rate is increased further, this expression level increases linearly with it.
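The threshold-linear behavior described above can be illustrated with a minimal sketch. It assumes the simplest titration model, dm/dt = alpha_m - beta_m*m - k*m*s and ds/dt = alpha_s - beta_s*s - k*m*s, which is our shorthand for the kind of model discussed here and not necessarily the exact formulation in the cited work. All names (`steady_state_mrna`, `alpha_m`, `alpha_s`, `beta_m`, `beta_s`) and the numerical values are illustrative assumptions.

```python
import math


def steady_state_mrna(alpha_m, alpha_s, beta_m=1.0, beta_s=1.0, k=1000.0):
    """Steady-state free target mRNA of the assumed titration model.

    Setting both derivatives to zero and eliminating s gives the quadratic
        k*beta_m*m^2 + b*m - alpha_m*beta_s = 0,
    with b = k*(alpha_s - alpha_m) + beta_m*beta_s. The positive root is returned.
    """
    b = k * (alpha_s - alpha_m) + beta_m * beta_s
    disc = b * b + 4.0 * k * beta_m * alpha_m * beta_s
    root = math.sqrt(disc)
    if b >= 0.0:
        # Algebraically identical form that avoids subtracting nearly equal numbers.
        return 2.0 * alpha_m * beta_s / (b + root)
    return (-b + root) / (2.0 * k * beta_m)


if __name__ == "__main__":
    alpha_s = 10.0  # hypothetical sRNA transcription rate, i.e. the threshold
    for alpha_m in (2.0, 5.0, 8.0, 12.0, 15.0, 20.0):
        print(alpha_m, round(steady_state_mrna(alpha_m, alpha_s), 4))
```

With strong pairing (large k), the free mRNA stays near zero while alpha_m is below alpha_s and then grows roughly as (alpha_m - alpha_s)/beta_m, which is the threshold-linear shape described in the text.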
Moreover, this threshold response permits a rapid response while filtering transient signals [1]. When the signal that induces sRNA expression, and hence repression of the target, disappears, target expression resumes at a time tm, after the sRNA pool has been degraded and mRNA has accumulated and been translated. This occurs on a much faster time scale than protein-based regulation. In addition, any changes in target expression rate that last less than tm are filtered out by the sRNA.

Synthetic biologists have been particularly interested in the sRNA system for several reasons. First, the fact that these sRNAs act in trans makes it easier to introduce them into cells via plasmids. For instance, if the goal is to knock down a gene in a metabolic engineering effort, the relevant sRNA can be cloned on a plasmid, introduced into the cell, and expressed in trans. This is in contrast with cis-acting sRNA, which would need to be encoded on the chromosome itself. Second, the kinetics of sRNA repression are much faster than those of repression by proteins and transcription factors. Finally, and arguably most importantly, the modular structure of the sRNA molecule makes it possible to change certain parts to suit the situation while keeping the rest intact.

There has been a lot of research into making sRNA a modular part that can be used in synthetic circuits. The Sang Yup Lee lab at KAIST published a protocol [38] for assembling sRNAs that target any sequence. This is achieved by changing the seed region of the sRNA to a sequence that is complementary to the translation initiation region (TIR) of the mRNA while keeping the Hfq-binding scaffold region intact. However, this approach has one drawback. As mentioned before, sRNA interactions are imperfect. The paper mentions that one needs to optimize the seed region in order to get the desired level of repression. This can be labor intensive and can lead to off-target inhibition.
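As a rough illustration of the seed-swapping idea, the sketch below computes the RNA sequence that is perfectly complementary to a stretch of target mRNA. It is only a starting point under stated assumptions: the function name `antisense_rna` and the example sequence are hypothetical, and the design rules of the published protocol (seed length, binding-energy tuning, off-target screening) are not reproduced here.

```python
# Watson-Crick complements for a DNA coding-strand sequence.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_rna(target_dna):
    """Return the RNA (5'->3') complementary to the mRNA of `target_dna`.

    `target_dna` is the coding-strand DNA of the targeted region, written
    5'->3'. The result is its reverse complement with T replaced by U.
    """
    target = target_dna.upper()
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(target))
    return reverse_complement.replace("T", "U")


if __name__ == "__main__":
    # Hypothetical translation initiation region, for illustration only.
    tir = "ATGGCTAGCAAAGGAGAAGAACTT"
    print(antisense_rna(tir))
```

In practice the candidate seed would still need to be checked against the rest of the transcriptome, since the text notes that imperfect pairing can cause off-target repression.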
1.4 Synthetic sRNA System

On the other hand, Amar Ghodasara from the Voigt lab at MIT developed a more robust system. His system is composed of two elements: the sRNA and the target sequence. The sRNA itself is made up of two modules: the target-binding sequence and the scaffold sequence. He first characterized the natural sRNA scaffolds with the strongest binding affinity to Hfq and thus the strongest inhibitory effects. He then designed short target sequences (15 bp) that are placed right before the target mRNA sequence and that bind a homologous sequence on the sRNA (the target-binding sequence). He also made sure that the target-binding sequences had minimal off-target effects. This setup can therefore be easily applied to any system. For instance, the 15 bp target sequence can be added to the plasmid upstream of the gene to be silenced. The sRNA with the homologous sequence can then be expressed in trans and will bind the mRNA through base-pair interactions and the Hfq protein through the scaffold.

1.5 Gene Knockdown and Integration in K-12 E. coli Cells

Integrating genes or operons into the genome of bacteria has many benefits. First, it allows for the stable incorporation and long-term maintenance of that operon, since plasmids are known to be lost from cells after some generations. Second, in synthetic biology, integrating circuits into the genome enables biologists to transform cells with multiple circuits, since transforming cells with too many plasmids can be problematic and can interfere with cell growth. The traditional means of incorporating DNA into the genome is to transform cells with linear DNA that has homology to a certain location in the genome and then allow either endogenous or exogenous recombinases to integrate the DNA. However, there are several issues with that.
The main issue is that intracellular exonucleases in E. coli ultimately degrade the linear DNA before it is integrated. This issue has been circumvented by using mutants that lack the exonuclease. The highly cited paper by Datsenko and Wanner [2] proposes a novel system for high-efficiency inactivation of chromosomal genes in K-12 E. coli cells. The system is termed the Red Disruption system. As shown in Figure Y, the system is present on a plasmid termed pKD20. Cells carrying this plasmid are transformed with a linear piece of DNA with homology on the 5' and 3' ends. The plasmid contains three genes, Gam, Bet and Exo, that together inhibit the exonuclease and promote recombination. This plasmid is also temperature sensitive and low-copy. It can be easily cured at 37C. Moreover, the three genes are inducible and are under the control of the pBAD promoter. Therefore, recombination only occurs in the presence of the activator arabinose. The authors have since updated their plasmid and now use the new version, pKD46, which yields a greater number of recombinants.

[Figure 1-2 panels: Step 1. PCR amplify FRT-flanked resistance gene. Step 2. Transform strain expressing λ Red recombinase. Step 3. Select antibiotic-resistant transformants. Step 4. Eliminate resistance cassette using a FLP expression plasmid.]

Figure 1-2: Gene knockout procedure [2]. This figure summarizes the gene knockout procedure described by Datsenko et al. [2]. Primers P1 and P2 are used to amplify the antibiotic resistance gene. The primers have overhangs that are homologous to the DNA sequence outside of the gene to be knocked out. After the cells are transformed with the linear DNA and recombination takes place, cells are selected for antibiotic resistance. Only those that have successfully integrated the resistance gene in place of the original gene will grow. Furthermore, the resistance gene can be removed by using an FLP expression plasmid that causes recombination between the FRT sites surrounding the resistance gene.

Figure 1-3: pKD20 plasmid [2]. The pKD20 plasmid contains the genes that encode the enzymes that prevent the degradation of the linear DNA as well as the ones that carry out the recombination.

The linear piece of DNA that is transformed into cells carrying the pKD plasmid contains the gene for kanamycin resistance flanked by FRT sequences. FRT sequences are recognition sites for the FLP recombinase. This cassette is in turn flanked at its 5' and 3' ends by sequences that are homologous to the ends of the gene that is to be knocked out. When this stretch of linear DNA is transformed into cells carrying the pKD plasmid and the cells are induced with arabinose, the recombinases allow the kan-FRT sequence to replace the gene of interest. Then, the pKD plasmid can be cured by increasing the temperature. This allows any remaining linear DNA to be degraded, since there are no longer proteins to block the exonuclease. The cells that have successfully knocked out the gene of interest can be selected by plating on kanamycin-containing plates. Finally, the kanamycin resistance gene can be removed by transforming with a plasmid encoding the FLP recombinase.

The paper describes a method of knocking out a gene by replacing it with a kanamycin resistance gene. However, the same protocol can be followed to integrate a gene into the genome. This can be done by inserting the gene of interest alongside the kanamycin resistance gene and within the homologous sequences. This ensures that the gene is inserted into the genome together with the kanamycin resistance gene.

1.6 DIAL strains

In their work, Jimenez et al. made use of DIAL strains [39]. DIAL stands for DIfferent ALleles. These strains were developed by the Anderson Lab at UC Berkeley.
What is particularly useful about these strains is the ability to control the copy number of plasmids to a high degree of precision. Varying gene or plasmid copy number is an arduous task that often relies on cloning the same gene or circuit onto various plasmids with different copy numbers. Moreover, the copy numbers of those plasmids were often highly variable. That approach was therefore both time-consuming and unreliable. The Anderson lab developed a set of strains that, when transformed with a ColE2 or R6K plasmid, yield a specified plasmid copy number. Of particular interest to us are the JTK160 strains that were developed in the lab and that control the copy number of ColE2 plasmids. There are 10 JTK160 strains, labelled JTK160A to J, with A giving the lowest copy number and J the highest.

Plasmid replication from the ColE2 origin of replication is controlled by the trans-acting protein RepA. RepA was placed under the control of a constitutive promoter and integrated into the genome of K-12 cells using the protocol of Datsenko and Wanner. The ribosome binding site (RBS) of the protein was varied to modify its expression level. Ten strains were then characterized that produced varying levels of RepA and that ultimately yielded ColE2 plasmid copy numbers ranging from 1 to 100.

Figure X shows the template of the construct used to integrate RepA into the genome. Here, 5'HA and 3'HA are the 5' and 3' sites homologous to the gene they replace in the genome. Pcon and Trans refer to the constitutive promoter and the trans-acting protein RepA whose expression it drives. FRT sites flank the kanamycin resistance gene; they are recognition sites for the FLP recombinase. The kanamycin resistance gene is inserted so that transformants that underwent recombination can be selected, and the FRT sites are necessary to remove the Kan gene later on.
1.7 Feedback and Its Properties

The term feedback refers to the connection between two (or more) systems where one system influences the other(s) and vice versa, so that the dynamics of the systems are strongly coupled [25]. Generally, the term is used in contexts where the output of a system feeds back to its input and influences system behavior in a way that ultimately modifies the output itself.

There are two main types of feedback: positive and negative. Positive feedback involves an upregulation or a modification in system behavior that causes an increase in the output. Positive feedback is sometimes employed by biological systems in order to achieve a very fast and high response. Negative feedback takes place when the output of the system feeds back to the input and causes changes in the system that ultimately lead to a decrease in that output. In other words, negative feedback exerts corrective action based on the difference between the desired and the measured output. This plays a significant role in the vast number of systems in which it is employed. The main effect of negative feedback in those systems is to stabilize the output at a certain defined level. If the output goes above that value, the increase in output causes a concomitant increase in the feedback, which ultimately brings the output level back to its original value. Similarly, if the output decreases below the defined value, the feedback effect decreases as well, and this leads to an increase in the output level.

Introducing negative feedback into a system has several advantages. First and foremost, it makes the system more robust to uncertainty and disturbances that can interfere with system behavior.
This is especially relevant in biological systems, where many circuits operate in environments that are not well-defined and are constantly changing. Moreover, feedback makes a system more modular. This has significant advantages, since a modular system can be easily connected to others without needing to characterize the system in the new context.

1.8 Downfalls of Direct Negative Feedback

Building on the work done by Jimenez et al. in the Del Vecchio lab and on what we know from control theory, we conjectured that adding a high-gain negative feedback loop to the system could solve the system failures that result from the perturbations induced by ribosome limitations. This would involve changing the system architecture to incorporate this negative loop. The simplest implementation would be a direct loop in which GFP regulates its own production. Since GFP is a reporter and not a transcription factor, this would mean fusing it to an inhibitor that binds to the promoter regulating it and decreases the transcription of GFP, thereby decreasing the amount of GFP produced. However, theoretical representations of this system and simulations revealed that it is not practically feasible.

1.9 Thesis Question

In this thesis, we aim to address two questions. The first is whether an sRNA-mediated negative feedback loop could overcome the problems of direct negative feedback and is practically feasible. We start from a simple theoretical representation of the system and model it to make sure that the results are generally in the right direction. Next, we explore different practical implementations of the system in question and compare the pros and cons of building each of them. After choosing the optimal system, we modify the model and create a mechanistic model and representation of the chosen system, using real parameters.
After theoretically verifying that the system could achieve the required results, we move on to building that system. The first step of that process is to choose the correct synthetic parts from the Registry of Standard Biological Parts and to design what the final system would look like. After that, we started working on test systems that we would use to test those parts and how they work together, before implementing the final complete system.

The second question is whether resources, such as ribosomes, are spatially distributed in a bacterial cell. This theory would suggest that there are local resource pools that are shared by plasmids but are different from those used by genomic genes. For that purpose, we tried to solve the problem of resource competition by putting one circuit on the genome and studying whether adding additional circuits on plasmids would affect the genomic circuit.

Chapter 2

System to be built

2.1 General System

The system we chose to implement is a modification of the system used by Jimenez et al. As was done in their paper, Green Fluorescent Protein (GFP) is constitutively produced while Red Fluorescent Protein (RFP) is produced following induction by a small-molecule inducer. The main difference here is the addition of an sRNA-mediated negative feedback loop. The mRNA coding for GFP is constantly produced by the constitutive promoter. As shown in Figure 2-1, the GFP produced in turn leads to the activation of a promoter that regulates the production of sRNA. The sRNA produced by this promoter is specific to the GFP mRNA; it binds to that mRNA and causes its inactivation and subsequent degradation. The net result is a decrease in the amount of GFP mRNA and a concomitant decrease in the level of the protein. This means that an increase in GFP levels will activate a cascade of events that will eventually lead to a decrease in that value.
Therefore, GFP is involved in a negative feedback loop that regulates its own production. The high gain here comes from the amount of sRNA that is present in the system. This amount can be easily regulated by changing the copy number of the plasmid on which the sRNA is encoded. In theory, this feedback loop should serve to maintain the level of GFP at a steady value despite the increase in the RFP level and the accompanying decrease in ribosome levels.

Figure 2-1: General system. This figure is a cartoon representation of the general system to be implemented. Similar to what was done in Jimenez et al., there is a GFP circuit under the control of a constitutive promoter (right) and an RFP circuit under the control of an inducible promoter (left). Each circuit produces mRNA of the respective gene, and these mRNAs compete for the ribosome pool in order to be translated into functional proteins. The main difference here is the addition of the sRNA feedback circuit to the GFP circuit. Here, the translated protein regulates its own production: it activates the promoter that drives sRNA production. Consequently, the sRNA binds to the mRNA encoding GFP and causes its subsequent degradation, thus decreasing the amount of available mRNA and the amount of translated protein.

Before RFP production starts and in the absence of its inducer, GFP is at a steady-state value that results from equal rates of its production and its removal by either natural degradation or the action of the sRNA feedback loop. When the inducer is added to the system and RFP production is initiated, the ribosome pool available to GFP decreases, since many ribosomes are now being used to produce RFP. As a result, the level of GFP decreases.
This will lead to a decrease in the rate of GFP removal via the sRNA-mediated feedback loop, since less GFP is available to induce sRNA production and less sRNA will be produced. As a result, the net rate of GFP formation will be positive, because the rate of production is greater than the rate of removal. More GFP will form until its level returns to the initial steady-state level. The net result is that the GFP level is maintained at a defined level and, in the long run, is independent of the RFP circuit. The feedback loop therefore successfully decouples the two circuits.

2.2 Iterations to arrive at the final design

In search of a physical implementation of this system, we considered several designs:

2.2.1 Design 1: GFP-TF fusion

Description

Figure 2-2: GFP-Transcription factor fusion. This is a specific implementation of the general system. Here, GFP is fused to the transcription factor LuxR, and the fused protein both fluoresces and is able to activate the sRNA promoter plux.

Figure 2-2 demonstrates one possible implementation of this sRNA-mediated feedback system. Here, a transcriptional activator (luxR) is fused to the reporter GFP. sRNA expression is under the regulation of the plux promoter. Therefore, in the presence of AHL, the luxR-GFP fusion should initiate the transcription of the sRNA. The sRNA produced is complementary to part of the luxR-GFP mRNA and will induce its own degradation as well as that of the mRNA. GFP therefore now regulates its own production via a negative feedback loop. Meanwhile, RFP production is regulated by the pBAD promoter. RFP is thus produced in the presence of arabinose, which inhibits the inhibitor AraC.

Model and Simulations

To verify that this system could potentially work, we implemented a simple model that describes it and we simulated it.
The model equations, based on the paper by Levine and Hwa [1], are the following:

\[ \frac{ds}{dt} = \alpha G \frac{P^n}{P^n + K} - \delta s - kms \tag{2.1} \]

\[ \frac{dm}{dt} = Gu - \gamma m - kms \tag{2.2} \]

\[ \frac{dP}{dt} = Rm - \lambda P \tag{2.3} \]

Where:
s is the concentration of sRNA
P is the concentration of luxR-GFP
m is the concentration of mRNA
K is the binding constant of luxR-GFP to the promoter plux
k is the decay rate of the sRNA/mRNA complex
u is the constant rate of mRNA production
R is a parameter that is related to ribosome availability
\gamma is the decay rate of the mRNA
\delta is the decay rate of the sRNA
\lambda is the decay rate of the protein
\alpha is the difference in promoter strength between the sRNA promoter and the mRNA promoter

For the simulation, we allowed the system to reach steady state at a certain level of R. Then, at some time after that, we decreased that value and allowed the system to reestablish steady state. This decrease in R mirrors the decrease in available ribosomes that would accompany the induction of the RFP circuit. After simulating the system, we get the following result:

[Figure 2-3 plot: protein P versus time (0 to 100) for three values of the feedback gain G.]

Figure 2-3: GFP-Transcription factor fusion simulation results. This graph shows the protein P (GFP) level following a decrease in ribosome availability at time 20. With a small feedback gain, there is a large decrease in the protein's level (blue). However, as the gain is increased, this change in the protein level is significantly reduced (red).

As Figure 2-3 shows, the system with high feedback gain G was able to retain its initial steady-state value when R decreased. However, the system with low feedback gain G showed a significant decrease in the steady-state level of the protein. Therefore, the high-gain negative feedback loop can mitigate the effects of ribosome limitations.
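The comparison between low and high gain can be reproduced qualitatively with a short steady-state calculation. The sketch below assumes the model takes the form ds/dt = αG·Pⁿ/(Pⁿ + K) − δs − kms, dm/dt = Gu − γm − kms, dP/dt = Rm − λP, and solves for the steady state by bisection instead of integrating in time. The parameter values and the function names (`hill`, `steady_state_protein`, `relative_drop`) are illustrative assumptions, not the values used to generate Figure 2-3.

```python
def hill(x, K, n):
    """Hill activation term x^n / (x^n + K)."""
    xn = x ** n
    return xn / (xn + K)


def steady_state_protein(G, R, alpha=2.0, u=1.0, gamma=1.0, delta=1.0,
                         k=100.0, lam=1.0, K=1.0, n=1):
    """Steady-state protein level P of the assumed three-variable model."""

    def residual(m):
        # dP/dt = 0 gives P; ds/dt = 0 gives s; what remains is dm/dt.
        P = R * m / lam
        s = alpha * G * hill(P, K, n) / (delta + k * m)
        return G * u - gamma * m - k * m * s

    # residual(0) = G*u > 0, residual(G*u/gamma) <= 0, and the residual
    # decreases monotonically in m, so bisection finds the unique root.
    lo, hi = 0.0, G * u / gamma
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return R * 0.5 * (lo + hi) / lam


def relative_drop(G, R_before=1.0, R_after=0.5, **params):
    """Fractional loss of steady-state P when R falls from R_before to R_after."""
    before = steady_state_protein(G, R_before, **params)
    after = steady_state_protein(G, R_after, **params)
    return (before - after) / before


if __name__ == "__main__":
    for G in (1.0, 10.0, 100.0, 1000.0):
        print(G, round(relative_drop(G), 4))
```

With these assumed values, halving R costs roughly a third of the protein at G = 1 but only about one percent at G = 1000, which matches the qualitative trend reported for the simulation. Note that α must exceed u for the sRNA to be able to titrate the mRNA at high gain.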
However, detectable GFP fluorescence requires high levels of the protein, and if the reporter is directly fused to the activator, these high levels might saturate the activated promoter and the feedback loop would be broken.

2.2.2 Design 2: Bicistronic Production of GFP and Transcription Factor

Description

Figure 2-4: Bicistronic Production of GFP and Transcription factor. This figure shows another implementation of the general system. Here, GFP and LuxR are under the control of the same promoter, and there is only one terminator, placed after the LuxR gene. Each gene has its own ribosome binding site (RBS). Therefore, both genes are transcribed on the same mRNA but are translated separately, resulting in two separate proteins.

Figure 2-4 shows the second design iteration. In this design, RFP expression is regulated by the pBAD or plac promoter. This promoter is inhibited by the AraC or LacI protein, respectively, and is induced by arabinose or IPTG, which bind the inhibitors and prevent their action. Similar to the design without sRNA regulation, GFP is constitutively expressed from the pTet promoter. However, luxR is also produced here. LuxR and GFP are produced on the same mRNA transcript, but GFP has a separate ribosome binding site, permitting separate translation of GFP and luxR. In the presence of AHL, luxR induces the expression of the sRNA, which, in the presence of the Hfq protein, causes the simultaneous degradation of itself along with the mRNA strand. This adds an extra regulation step. The autoregulation of GFP in this case can be modulated by the concentration of arabinose in the growth medium.

Model and Simulations

As for the other design, to verify that this system could potentially work, we implemented a simple model that describes it and we simulated it.
The model equations are the following:

\[ \frac{ds}{dt} = \alpha G \frac{A^n}{A^n + K} - \delta s - kms \tag{2.4} \]

\[ \frac{dm}{dt} = Gu - \gamma m - kms \tag{2.5} \]

\[ \frac{dP}{dt} = Rm - \lambda P \tag{2.6} \]

\[ \frac{dA}{dt} = Rm - \lambda A \tag{2.7} \]

Where:
A is the concentration of the transcriptional activator

The simulation showed the same results as the case with the fusion protein. Therefore, theoretically, having separate proteins could work. As shown in Figure 2-5, for high G, there is little variation in the GFP level as the ribosome level is changed.

Cooperativity n

The models were simulated assuming that there is no cooperativity of binding for the transcription activator, i.e. n=1. However, this could present problems if there is cooperativity. Therefore, we simulated the system for the case where n=2. As shown in Figure 2-6, we got a similar qualitative behavior as before: at high feedback gain, changes in R have little effect on the steady-state value of GFP.

[Figure 2-5 plot: protein P versus time (0 to 100) for G=10, G=100 and G=1000.]

Figure 2-5: Bicistronic Production Simulation Results. This graph shows the protein P (GFP) level following a decrease in ribosome availability at time 20. With a small feedback gain, there is a large decrease in the protein's level (blue). However, as the gain is increased, this change in the protein level is significantly reduced (red).

[Figure 2-6 plot: protein P versus time (0 to 100) for n=1, n=2 and n=3.]

Figure 2-6: The Effects of Changing the Cooperativity. This graph shows the protein P (GFP) level following a decrease in ribosome availability at time 20. All three curves have a high feedback gain. The graph shows the GFP level at 3 different values of the parameter n (1, 2, 3). All values of n show the desired behavior, namely a stable level of GFP despite the change in ribosome availability.
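A short worked limit helps explain why the result is insensitive to both R and n. The derivation below is a sketch that assumes equations (2.4) to (2.7) have the form ds/dt = αG·Aⁿ/(Aⁿ + K) − δs − kms, dm/dt = Gu − γm − kms, dP/dt = Rm − λP and dA/dt = Rm − λA; the starred steady-state symbols are our notation and do not appear in the original model.

```latex
% Steady state of (2.4): solve for s
s^* = \frac{\alpha G}{\delta + k m^*}\,\frac{(A^*)^n}{(A^*)^n + K}

% Substitute into the steady state of (2.5) and divide by G
u = \frac{\gamma m^*}{G}
    + \alpha\,\frac{(A^*)^n}{(A^*)^n + K}\,\frac{k m^*}{k m^* + \delta}

% High-gain limit (G \to \infty) with strong pairing (k m^* \gg \delta)
\frac{(A^*)^n}{(A^*)^n + K} \approx \frac{u}{\alpha}
\quad\Longrightarrow\quad
(A^*)^n \approx \frac{K u}{\alpha - u}, \qquad \alpha > u

% (2.6) and (2.7) have identical right-hand sides, so P^* = A^*
P^* = A^* = \frac{R m^*}{\lambda}
```

In this limit R drops out of the expression for A* and P*: a lower R is compensated by a higher mRNA level m* = λA*/R. The Hill coefficient n changes the value of the set point but not its independence from R, which is consistent with the behavior reported for n = 1, 2 and 3. The residual dependence on R enters only through the two neglected terms, γm*/G and δ/(km* + δ).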
2.2.3 Design 3: T7 Split Polymerase

Description

This design was based on a paper by the Voigt lab [40]. This paper deals with the common problem of resource allocation in synthetic biology, mainly with respect to RNAP use. For circuits with high expression, RNAP use by the circuit can interfere with other cellular functions, especially those related to growth and survival. Therefore, the authors of the paper designed an orthogonal resource allocator that caps the maximum T7 RNAP production and then allocates this pool of orthogonal RNAPs between different circuits. This was done by splitting the T7 RNAP into two functional parts: a core and a DNA-binding unit (termed the sigma subunit, because it functions in a similar fashion to the natural bacterial sigma subunit). They then mutated the DNA-binding segment of the sigma subunit to direct it to different orthogonal promoters. This resulted in four variants of the sigma subunit, all of which bind to the same core unit but to different promoters. Therefore, by controlling the expression of the core unit, they can cap the maximum T7 RNAP level, and the expressed core proteins can then be allocated between the different sigma subunits that are expressed.

While our design does not involve partitioning the RNAP resource between different circuits, this split T7 RNAP design can be applied in the following way: the core protein can be expressed from an inducible pBAD promoter. The sigma subunit can be coexpressed with the GFP, such that it becomes an 'activator' of sRNA production, which is under the control of a promoter specific to this orthogonal T7 RNAP. (Note that this RNAP is orthogonal to the wild-type T7 RNAP that we are also using.) Therefore, when the sigma subunit is expressed, it binds to the core protein and activates the expression of sRNA.

As shown in the figure, there are several new modifications to the design. The RFP circuit has not changed.
RFP is induced by adding AHL to the medium, which binds to luxR that is constitutively expressed in the cell. LuxR-AHL binds to the plux promoter and induces expression from it.

Figure 2-7: Design 3. This figure shows another implementation of the general system. Here, the T7* RNAP core is produced under the control of an inducible promoter. GFP and T7* sigma are produced on the same mRNA but translated separately. The T7* sigma then binds to the T7* RNAP core to form a functional polymerase that can recognize the pCGG promoter and transcribe the sRNA from it. Thus, the sigma subunit acts as a transcription factor here.

The GFP construct has been changed. The promoter here is a T7 promoter. When transformed into a strain containing the T7 RNAP, which can be induced by IPTG, the promoter is turned on. Also, the CGG sigma subunit from the Voigt paper [40] is coexpressed on the same mRNA as GFP. Therefore, the two are coregulated.

Moreover, we added a new piece here. The T7 core protein from the Voigt paper is expressed under the control of the pBAD promoter. This promoter can be induced via the addition of arabinose to the media. The sRNA is now under the control of the modified pCGG promoter. Transcription from this promoter is initiated when the sigma subunit, bound to the core, binds to the promoter. Therefore, the sigma subunit acts as an activator of the pCGG promoter in this design.

Design 3 Model and Simulations

The model and model parameters are based on the Hwa [1] and Voigt [40] papers.
\[ \frac{ds}{dt} = \alpha \frac{r^n}{r^n + K_d} - \delta s - kms \tag{2.8} \]

\[ \frac{dm}{dt} = Gu - \delta m - kms \tag{2.9} \]

\[ \frac{dP}{dt} = Rm - \gamma P \tag{2.10} \]

\[ \frac{d\sigma}{dt} = Rm - \gamma \sigma + k_d r - k_a \sigma c \tag{2.11} \]

\[ \frac{dc}{dt} = Gv - \gamma c + k_d r - k_a \sigma c \tag{2.12} \]

\[ \frac{dr}{dt} = -\gamma r - k_d r + k_a \sigma c \tag{2.13} \]

Where:
s is the concentration of sRNA
P is the concentration of GFP
\sigma is the concentration of the sigma subunit
c is the concentration of core CGG protein
r is the concentration of CGG RNAP
m is the concentration of mRNA
K_d is the binding constant of CGG RNAP to the promoter pCGG
k is the decay rate of the sRNA/mRNA complex
k_a is the association rate constant of the sigma subunit with the core protein
k_d is the dissociation rate constant of the sigma subunit with the core protein
u is the constant rate of mRNA production
v is the constant rate of core unit production
n is the Hill coefficient of the binding of the CGG RNAP
R is a parameter that is related to ribosome availability
\delta is the decay rate of the mRNA and sRNA
\gamma is the decay rate of the proteins
\alpha is the difference in promoter strength between the sRNA promoter and the mRNA promoter

Design 3 Model Results

After simulating the model and varying the parameters, we did not achieve the result that we had been looking for. Moreover, when we take the steady-state value of r and substitute it back into the equation for the rate of change of s, we see that the G term appears in both the numerator and the denominator. Therefore, in the high-gain situation, the dynamics of s saturate and the effects of feedback are not present. We therefore decided to abandon this design, since it does not satisfy our main objective.

2.3 Final Design Choice

What we concluded from the previous analysis was the following: for GFP detection, we require high levels of GFP, which would also mean high concentrations of activator if the two were fused together. This might be a problem, since it can saturate the sRNA promoter and break the feedback loop. Also, as shown in the theoretical analysis, using a split activator would not work either.
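Therefore, we will go with the activator and reporter being expressed on the same mRNA but translated separately, since that overcomes the fusion problem: the RBS of the activator can be made weaker.

2.4 Final Design Mechanistic Model

2.4.1 Reactions

RFP Production

Figure 2-8: RFP production. Inducer I induces expression from the promoter (shown in blue) and causes the transcription of the RFP gene into mRNA m1. Then, m1 is translated into a protein P1.

Reactions:
\[ I \xrightarrow{u_1} m_1, \qquad m_1 \xrightarrow{\delta} \emptyset, \qquad m_1 + y \underset{k_1^-}{\overset{k_1^+}{\rightleftharpoons}} d_1, \qquad d_1 \xrightarrow{\pi_1} m_1 + y + P_1, \qquad P_1 \xrightarrow{\lambda} \emptyset \]

GFP Production

Figure 2-9: GFP production. The m2 mRNA is transcribed from the promoter upstream of the two genes and contains the sequences for the two genes. m2 is translated into two distinct proteins, P2 (GFP) and A (transcription factor). A binds to the sRNA promoter and causes sRNA expression (s). s then binds to m2 and causes its degradation.

Reactions:
\[ \emptyset \xrightarrow{G u_2} m_2, \qquad m_2 + y \underset{k_2^-}{\overset{k_2^+}{\rightleftharpoons}} d_2, \qquad d_2 \xrightarrow{\pi_2} m_2 + y + P_2 \]
\[ m_2 + y \underset{k_3^-}{\overset{k_3^+}{\rightleftharpoons}} d_3, \qquad d_3 \xrightarrow{\pi_2} m_2 + y + A, \qquad P_2 \xrightarrow{\lambda} \emptyset, \qquad A \xrightarrow{\lambda} \emptyset \]
\[ nA \rightarrow s, \qquad s \xrightarrow{\gamma} \emptyset, \qquad s + m_2 \xrightarrow{k} \emptyset \]

Conservation law for Ribosomes
\[ Y = Y_T - y_s = y + d_1 + d_2 + d_3 \]

Differential Equations

The species present in this model are: s, m1, m2, A, P1, P2, d1, d2, d3, y.

\[ \frac{ds}{dt} = \alpha G \frac{A^n}{A^n + K_A} - \gamma s - k m_2 s \tag{2.14} \]
\[ \frac{dm_1}{dt} = u_1 \frac{I^m}{I^m + K_I} - \delta m_1 - k_1^+ m_1 y + k_1^- d_1 + \pi_1 d_1 \tag{2.15} \]
\[ \frac{dP_1}{dt} = \pi_1 d_1 - \lambda P_1 \tag{2.16} \]
\[ \frac{dd_1}{dt} = k_1^+ m_1 y - k_1^- d_1 - \pi_1 d_1 \tag{2.17} \]
\[ \frac{dm_2}{dt} = G u_2 - \delta m_2 - k m_2 s - k_2^+ m_2 y + k_2^- d_2 + \pi_2 d_2 - k_3^+ m_2 y + k_3^- d_3 + \pi_2 d_3 \tag{2.18} \]
\[ \frac{dP_2}{dt} = \pi_2 d_2 - \lambda P_2 \tag{2.19} \]
\[ \frac{dA}{dt} = \pi_2 d_3 - \lambda A \tag{2.20} \]
\[ \frac{dd_2}{dt} = k_2^+ m_2 y - k_2^- d_2 - \pi_2 d_2 \tag{2.21} \]
\[ \frac{dd_3}{dt} = k_3^+ m_2 y - k_3^- d_3 - \pi_2 d_3 \tag{2.22} \]

Simplifying assumptions

Since ribosome binding to mRNA occurs at a much faster rate than protein translation, we can assume that the binding and unbinding rate constants k_i^+ and k_i^- are much larger than the other rate constants, such as π_i and δ, and that d1, d2 and d3 are at steady state.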
Therefore, this gives us the following:
\[ d_1 = \frac{m_1 y}{K_1} \tag{2.23} \]
\[ d_2 = \frac{m_2 y}{K_2} \tag{2.24} \]
\[ d_3 = \frac{m_2 y}{K_3} \tag{2.25} \]

Substituting these equations into the conservation law, we get:
\[ y = \frac{Y}{1 + \frac{m_1}{K_1} + \frac{m_2}{K_2}} \tag{2.26} \]
since K_3 is very large, so the m_2/K_3 term is neglected.

New simplified differential equations
\[ \frac{ds}{dt} = \alpha G \frac{A^n}{A^n + K_A} - \gamma s - k m_2 s \tag{2.27} \]
\[ \frac{dm_1}{dt} = u_1 \frac{I^m}{I^m + K_I} - \delta m_1 \tag{2.28} \]
\[ \frac{dP_1}{dt} = \pi_1 \frac{m_1}{K_1} \cdot \frac{Y}{1 + \frac{m_1}{K_1} + \frac{m_2}{K_2}} - \lambda P_1 \tag{2.29} \]
\[ \frac{dm_2}{dt} = G u_2 - \delta m_2 - k m_2 s \tag{2.30} \]
\[ \frac{dP_2}{dt} = \pi_2 \frac{m_2}{K_2} \cdot \frac{Y}{1 + \frac{m_1}{K_1} + \frac{m_2}{K_2}} - \lambda P_2 \tag{2.31} \]
\[ \frac{dA}{dt} = \pi_2 \frac{m_2}{K_3} \cdot \frac{Y}{1 + \frac{m_1}{K_1} + \frac{m_2}{K_2}} - \lambda A \tag{2.32} \]

2.4.2 Parameters

Parameters were based on the physics-based model in Jimenez et al. [28] and BioNumbers [41].

Parameter    Value
K_A          100 nM
γ            7 hr^-1
n            2
k            100 nM^-1 hr^-1
m            2
K_I          100 nM
δ            10 hr^-1
π_1          30 hr^-1
K_1          100 nM
K_2          100 nM
λ            1 hr^-1
u_1          300 nM hr^-1
u_2          v nM hr^-1 (this is variable to keep the GFP at the same steady state)
π_2          30 hr^-1
Y            1300 nM
K_3          5000 nM
α            100

Table 2.1: Mechanistic Model Parameters

2.4.3 Model Results

[Figure 2-10 plots: six panels (s, m1, P1, m2, P2, A) versus time in hours; legend G = 1; annotation: % decrease in P2 is 19.3525.]

Figure 2-10: Low G. This figure shows the simulation results of the full mechanistic model for low gain. Top row: Left: sRNA (s) increases quickly and reaches its steady state value. It remains unchanged even after RFP induction. Middle: m1 (RFP mRNA) is zero before induction and increases to its steady state value after induction at 6 h. Right: P1 increases after m1 is produced. Bottom row: Left: GFP mRNA (m2) establishes its steady state early on and remains unchanged. Middle: P2 (GFP) had already reached steady state before RFP induction. After RFP is induced, the GFP level decreases and establishes a new steady state that is 19 percent lower than before. Right: The transcription factor A behaves similarly to P2.
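The steady states of the reduced model (2.27)-(2.32) can be explored without integrating the equations. The following is a minimal sketch, not the code used to produce the figures: it takes the parameter values from Table 2.1, uses the Hill term exactly as written in (2.27), assumes a saturating inducer so that m1 settles at u_1/δ, and re-tunes u_2 for each gain so that GFP starts from the same pre-induction level. The target level `P2_TARGET`, the gain values and all function names are assumptions introduced here, so the printed percentages are illustrative and are not expected to match the values annotated in the figures.

```python
# Steady states of the reduced model (2.27)-(2.32). Parameter values follow
# Table 2.1; P2_TARGET and the gains passed in are assumptions of this sketch.
K_A, N_HILL = 100.0, 2                # activator binding constant (nM), Hill coefficient
GAMMA, DELTA, LAM = 7.0, 10.0, 1.0    # sRNA, mRNA and protein decay rates (1/hr)
K_DEG = 100.0                         # sRNA-mediated mRNA degradation (1/(nM hr))
PI_2 = 30.0                           # GFP / activator translation rate (1/hr)
K_1, K_2, K_3 = 100.0, 100.0, 5000.0  # ribosome binding constants (nM)
U_1, Y_TOTAL, ALPHA = 300.0, 1300.0, 100.0
P2_TARGET = 500.0                     # assumed pre-induction GFP level (nM)


def bisect(func, lo, hi, iters=200):
    """Bisection on [lo, hi] for a function that changes sign once in between."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def free_ribosomes(m1, m2):
    return Y_TOTAL / (1.0 + m1 / K_1 + m2 / K_2)                # equation (2.26)


def p2_level(m1, m2):
    return PI_2 * (m2 / K_2) * free_ribosomes(m1, m2) / LAM     # (2.31) at steady state


def steady_m2(m1, gain, u2):
    """m2 at which (2.30) vanishes, with s and A set to their steady states."""
    def residual(m2):
        a = PI_2 * (m2 / K_3) * free_ribosomes(m1, m2) / LAM    # (2.32) at steady state
        hill = a**N_HILL / (a**N_HILL + K_A)
        s = ALPHA * gain * hill / (GAMMA + K_DEG * m2)          # (2.27) at steady state
        return gain * u2 - DELTA * m2 - K_DEG * m2 * s
    return bisect(residual, 0.0, gain * u2 / DELTA)


def tune_u2(gain, target=P2_TARGET):
    """Choose u2 so that GFP sits at `target` before RFP induction (m1 = 0)."""
    return bisect(lambda u2: p2_level(0.0, steady_m2(0.0, gain, u2)) - target, 0.0, 1e6)


def percent_decrease(gain, target=P2_TARGET):
    u2 = tune_u2(gain, target)
    m1_induced = U_1 / DELTA          # (2.28) at steady state, saturating inducer
    before = p2_level(0.0, steady_m2(0.0, gain, u2))
    after = p2_level(m1_induced, steady_m2(m1_induced, gain, u2))
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for gain in (1.0, 100.0, 1000.0):
        print(f"G = {gain:g}: GFP decrease after RFP induction = {percent_decrease(gain):.2f} %")
```

Bisection is used instead of an ODE solver because the steady-state residual of (2.30) is monotonically decreasing in m2 once s and A are eliminated, so the root is unique and the sketch needs only the standard library.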
[Figure 2-11 plots: six panels (s, m1, P1, m2, P2, A) versus time in hours; legend G = 100; annotation: % decrease in P2 is 13.7774.]

Figure 2-11: Medium G. This figure shows the simulation results of the full mechanistic model for medium gain. Top row: Left: sRNA (s) increases quickly and reaches its steady state value. It decreases slightly after RFP induction. Middle: m1 (RFP mRNA) is zero before induction and increases to its steady state value after induction at 6 h. Right: P1 increases after m1 is produced. Bottom row: Left: GFP mRNA (m2) establishes its steady state early on and remains unchanged. There is a slight overshoot at the beginning. Middle: P2 (GFP) had already reached steady state before RFP induction. After RFP is induced, the GFP level decreases and establishes a new steady state that is 13 percent lower than before. Right: The transcription factor A behaves similarly to P2.

[Figure 2-12 plots: six panels (s, m1, P1, m2, P2, A) versus time in hours; legend G = 1000; annotation: % decrease in P2 is 7.206.]

Figure 2-12: High G. This figure shows the simulation results of the full mechanistic model for high gain. Top row: Left: sRNA (s) increases and reaches its steady state value before RFP induction. It then decreases and establishes a new steady state after RFP induction. Middle: m1 (RFP mRNA) is zero before induction and increases to its steady state value after induction at 6 h. Right: P1 increases after m1 is produced. Bottom row: Left: GFP mRNA (m2) establishes its steady state early on and remains unchanged. There is an overshoot at the beginning.
Middle: P2 (GFP) had already reached steady state before RFP induction. After RFP is induced, the GFP level decreases and establishes a new steady state that is 7 percent lower than before. Right: The transcription factor A behaves similarly to P2.

Results

As shown in Figures 2-10, 2-11 and 2-12, there is a decrease in the GFP level following induction of RFP production. However, this decrease in GFP can be made smaller (from 20 percent to 7 percent) as the gain of the feedback loop is made larger. In the physical implementation, the gain of the sRNA can be controlled by the concentration of AHL that is added. Higher concentrations cause higher expression of sRNA and hence an increase in the gain. Furthermore, as the sRNA panels show, when the gain is small, the concentration of sRNA is low and the sRNA cannot respond to the changes in ribosome availability that result from RFP induction. At high gain, however, the sRNA level decreases when RFP is induced. This reduces the sRNA-induced degradation of m2 and compensates for the decrease in P2 caused by the reduced ribosome availability.

2.5 Conclusions

In this section, we propose a general solution to the problem of ribosome limitations. This solution involves the integration of a high-gain feedback loop via an sRNA intermediate. From this starting point, we explored three different physical implementations. We modeled and simulated each of those systems in order to get a better idea of whether they would work. Based on those simulations, we were able to decide on one design that could work. Next, we made a full mechanistic model that describes the system and explored how it might solve the problem at hand. Once that was done, the system was ready for physical implementation.

Chapter 3
Initial Testing and Test Constructs

Before building the final system constructs, we decided to build intermediate constructs that serve two main purposes.
The first is that they would be an intermediate assembly step that would make it easier to assemble the final construct, in contrast to building the final system in one go. Second, those intermediate constructs could serve as test constructs that we can use to study and characterize the system elements. This is a crucial step to make sure that all the elements work as they should and to provide the opportunity to modify those which do not fit within the model parameters.

3.1 Methods

All constructs were prepared using Gibson Assembly. Once the plasmids were made and sequence-verified, they were transformed into MG1655PRO cells (obtained from the Collins lab at MIT) or NEB 5alpha. The colonies were streaked on LB-agar plates containing the relevant antibiotics and then individual colonies were inoculated into M9 media with either glycerol or glucose. The cultures were grown for 8-10 hours until they reached exponential phase. Then, they were diluted and induced with the relevant inducers. Then, they were inoculated in a 24-well plate in M9 media and allowed to grow for 18 hours in a Synergy MX (Biotek, Winooski, VT) plate reader. The plate reader measured both the OD and the fluorescence. In order to analyze the results, background values were subtracted from the OD and fluorescence measurements. Each graph presented in this section is the mean of the results of 2-3 colonies, taken only in the time frame where the cells are in exponential phase. The test constructs are the following:

3.2 GL construct

3.2.1 Description

Figure 3-1: GL plasmid

The construct consists of the following:

1. Ptet constitutive promoter. This promoter is regulated by the tetR protein. However, in a normal strain, the endogenous tetR level will not be high enough to inhibit the promoter; therefore, it can be considered as constitutive.
Moreover, this provides a second degree of regulation since the plasmid can be transformed into a strain that expresses high levels of tetR (for example MG1655PRO, where tetR under the action of a strong promoter is integrated into the genome). Under those circumstances, the amount of promoter activity can be regulated by the amount of atc added. Atc binds to tetR, causing it to unbind from the promoter (cite). Therefore, atc acts as an activator of the promoter.

2. sRNA cognate sequence: The sRNA cognate sequence is a sequence of 15 bp that is complementary to the sRNA barcode. This sequence allows the sRNA to bind to the mRNA. Consequently, the complex binds Hfq and is degraded.

3. RBS: ribosome binding site. This is the region on the mRNA where the ribosome binds to initiate translation of the sequence. The first RBS corresponds to the GFP and allows the translation of GFP from the mRNA.

4. GFP: green fluorescent protein sequence. This is the sequence that expresses the GFP.

5. RBS2: this is the second ribosome binding site, which corresponds to the luxR protein. Since there is no terminator sequence between the two genes, the RNA polymerase will transcribe the whole region as one mRNA molecule. However, each gene will be expressed separately since each has its own RBS and a stop codon that terminates translation at that sequence. Moreover, this RBS is much weaker than the GFP RBS. This was a necessary requirement from the model since the LuxR levels must not increase so much as to saturate the plux promoter. If that happens, the feedback loop would be broken.

6. luxR gene: the luxR gene sequence encodes the LuxR protein.

7. Term: the terminator sequence ends transcription and causes the RNA polymerase to fall off the plasmid.

This construct includes the GFP gene as well as the luxR gene. Moreover, it also includes the sRNA cognate sequence and the constitutive promoter ptet as well as the final transcription terminator.
The sequence was built using two gBlocks templates from IDT and then assembled using Gibson Assembly. It was assembled first on pACYC and then on pZE21 [42]. This construct is the main test construct that was transformed with each of the other ones in order to do the testing.

3.2.2 Characterizing the construct

When I first built this construct, I realized that GFP expression from this construct wasn't as high as from other constructs that I had built before. To characterize what might be happening, I considered three hypotheses:

1. The 5' sequence preceding GFP (especially the sRNA sequence) interfered with its expression.
2. The 3' sequence following GFP (the luxR gene and its proximity to GFP) interfered with its expression.
3. The concomitant expression of luxR influenced GFP expression by taking resources away from it.

In order to test those hypotheses, I built two additional constructs with slight modifications to the GL construct:

GN construct

This construct is similar to the GL construct but lacks the LuxR gene as well as the LuxR RBS. Therefore, the terminator immediately follows the GFP gene. This construct tests the first hypothesis. If removing the LuxR gene causes no changes in GFP production, then it is probably the 5' sequence that is causing the trouble.

GS construct

This construct is similar to the GL construct but has a 30 bp spacer sequence between the GFP gene and the LuxR RBS. The spacer was included to mimic intercistronic sequences found in operons that express polycistronic genes. This construct was used to test the second hypothesis and to check whether there are proximity issues. If GFP expression is increased following this modification, this means that proximity effects are causing an issue.

Results

The constructs were all built on the pACYC medium copy plasmid and transformed into NEB 5alpha cells. As figure 3-2 shows, the construct without the LuxR gene (GN) had higher fluorescence than the other constructs.
However, the construct with the spacer (GS) had a very similar fluorescence level to the original GL construct. Therefore, these results indicate that it is not the genetic context that is affecting the GFP expression level. It is most probably the concomitant expression of LuxR. These results are favorable since GFP expression is still relatively high for a low copy plasmid and they also indicate that luxR is being expressed.

[Figure 3-2 plot: Fluorescence/OD vs. Time (minutes) for GL, GS and GN.]

Figure 3-2: Fluorescence Results of GN, GL, GS. These graphs show fluorescence measurements divided by the OD in exponential phase for the three constructs. GS and GL have very similar fluorescence profiles whereas GN has a much higher fluorescence level.

3.2.3 Testing the construct

The final construct would be built on a high copy plasmid. Therefore, we transferred the GL construct onto a high copy pZE plasmid and transformed the plasmid into MG1655PRO cells (cite). Then, we checked whether tetR inhibition worked and whether induction by atc worked. To do so, we grew cells in a 24-well plate without inducer and with 30 nM or 100 nM of atc. The results are shown in Figure 3-3. The fluorescence level increases significantly with the addition of atc. Furthermore, the fluorescence level increases as the atc level increases. Therefore, GFP expression is well-repressed in the MG1655PRO cells and can be restored by the addition of atc.

[Figure 3-3 plot: pZE GL in MG1655PRO, GFP/OD vs. Time (h) for no atc, 30 nM atc and 100 nM atc.]

Figure 3-3: Fluorescence Results of GL in pZE with different levels of atc. GL was tested at 3 atc induction levels. With no atc, there is minimal fluorescence. When atc is added, the fluorescence is high and it is higher at 100 nM than at 30 nM.

3.3 RV2 construct

Figure 3-4: RV2 plasmid

This construct includes the RFP gene under the action of the plux promoter.
It was built by two rounds of PCR followed by Gibson Assembly. The first round of PCR added the plux promoter to the RFP gene. The second round added the homology sites to the pACYC plasmid. Finally, Gibson Assembly was used to incorporate the plux-RFP into the plasmid. The main goal of this construct was to test the interaction between plux and luxR in this novel context using a measurable fluorescent reporter (RFP). In the final construct, plux produces sRNA, which cannot be measured directly. Therefore, we transformed R-test with GL into MG1655PRO cells that contain constitutively expressed tetR protein. Then, we induced GFP-luxR by adding atc. Next, we added AHL, which binds luxR to initiate transcription from the plux promoter, and the result can be read from the RFP fluorescence.

Results

After transforming the two plasmids into MG1655PRO, we tested the cells under 4 different conditions and measured GFP and RFP fluorescence:

1. no inducers: We do not expect any fluorescence here.
2. 100 nM atc: We expect only GFP fluorescence here.
3. 1000 nM AHL: We expect no fluorescence here.
4. 100 nM atc and 1000 nM AHL: We expect both GFP and RFP fluorescence.

Figure 3-5 shows the GFP fluorescence. As expected, we only observed GFP fluorescence in the presence of atc.

[Figure 3-5 plot: GL+RV2 GFP/OD vs. Time (h) for no inducer, atc, AHL and atc+AHL.]

Figure 3-5: GFP fluorescence of GL+RV2 plasmids. GL and RV2 plasmids are cotransformed into MG1655PRO cells and are tested with different inducers. GFP fluorescence is high only in the presence of atc.

Figure 3-6 shows the RFP fluorescence.

[Figure 3-6 plot: GL+RV2 RFP/OD vs. Time (h) for no inducer, atc, AHL and atc+AHL.]

Figure 3-6: RFP fluorescence of GL+RV2 plasmids. GL and RV2 plasmids are cotransformed into MG1655PRO cells and are tested with different inducers. RFP fluorescence is highest in the presence of both atc and AHL.
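The expectations for the four induction conditions listed above amount to simple Boolean logic: GFP follows atc, and RFP requires both atc (to produce LuxR) and AHL (to activate it). The sketch below spells this out for an idealized, leak-free system in a tetR-expressing strain; the function name `expected_reporters` is introduced here for illustration only.

```python
from itertools import product


def expected_reporters(atc: bool, ahl: bool) -> dict:
    """Idealized (leak-free) reporter expectations for the GL + RV2 test."""
    luxr_present = atc                 # GFP and LuxR share the ptet transcript
    return {
        "GFP": atc,                    # ptet is derepressed only when atc is added
        "RFP": luxr_present and ahl,   # plux needs the LuxR-AHL complex
    }


if __name__ == "__main__":
    for atc, ahl in product([False, True], repeat=2):
        print(f"atc={atc!s:5} AHL={ahl!s:5} -> {expected_reporters(atc, ahl)}")
```

Comparing measured fluorescence against this table makes departures from the ideal case, such as leaky expression, easy to spot.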
While we should expect fluorescence only in the presence of both inducers, the data shows that although the level is highest when the 2 inducers are present, the presence of a single inducer also caused a slight increase in the RFP level. The most probable explanation for this is that there is some leaky low expression. Furthermore, it seemed that the RFP level was rising near the end of the experiment for the well that contained both inducers. Therefore, we diluted a small volume of that culture in fresh media with both inducers and allowed the cells to grow for more time in exponential phase. Figure 3-7 shows the RFP fluorescence data for that culture before and after dilution. The data shows that the RFP level continues to rise and then stabilizes at a high level. In summary, this data supports the hypothesis that functional LuxR is produced with GFP when atc is added and that the addition of AHL activates the plux promoter. The reason why we had to grow the cells for a longer period of time to observe a significant increase in RFP production is that for RFP to be produced, there are multiple events that need to take place:

1. atc binds tetR and frees the ptet promoter.
2. GFP and LuxR are expressed from the activated ptet promoter.
3. LuxR binds AHL and the bound complex activates the plux promoter.
4. RFP is produced. It also takes time for it to fold and become functional.

[Figure 3-7 plot: GL+RV2 extended exponential phase, RFP vs. Time (h), before and after dilution.]

Figure 3-7: RFP fluorescence of GL+RV2 plasmids in extended exponential growth. GL+RV2 culture is diluted and allowed to grow for a longer period of time with the inducers. The level of RFP continues to increase.

3.4 S-test construct

Figure 3-8: S-test plasmid

This construct includes the sRNA-scaffold under the action of the plac promoter. It was built by assembling the plac promoter and sRNA as two pieces and inserting them into the pACYC plasmid using Gibson Assembly.
This will be used to test the sRNA inhibition of GFP mRNA. This construct was transformed with the GL construct into MG1655PRO cells. Then, we induced GFP expression by adding atc. Next, we measured the change in GFP fluorescence at various levels of the inducer IPTG. IPTG here would induce the expression of the sRNA. Therefore, ultimately, this would test the sRNA inhibition.

Results

When I grew those cells, the results were not very clear, as the cells were growing very fast and reached stationary phase in a very short time period. However, given the results of the next section, there was no need to test this again.

3.5 SV2 construct

Figure 3-9: SV2 plasmid

This construct includes the sRNA sequence under the action of the plux promoter. Similarly to R-test, we added the promoter in a first round of PCR and then the homology sites in a second round. Finally, we used Gibson Assembly to assemble the final construct. The main goal of this construct was to test the interaction between the sRNA and the GFP construct. We transformed this construct with the GL construct and then induced the GFP using atc. Then, we added AHL and tested the steady state level of GFP in the presence of the negative feedback loop.

Results

MG1655PRO cells were cotransformed with the GL and SV2 plasmids. Different levels of AHL were added to induce sRNA expression from the plux promoter. As figure 3-10 shows, the addition of AHL results in a decrease in the level of GFP expressed. Furthermore, the more AHL is added, the lower the expression level of GFP is, until it saturates at 100 nM of AHL. This supports the hypothesis that the luxR produced along with GFP induces the expression of sRNA, which then causes a decrease in GFP expression. This means that the negative feedback loop is working. Moreover, the level of GFP was decreased by more than half even though the sRNA was on a low copy plasmid.
This means that the inhibition is strong and might suggest that the sRNA does not need to be on a high copy plasmid.

[Figure 3-10 plot: GL+SV2 GFP/OD vs. time for no AHL and AHL at 2, 4, 10, 100 and 1000.]

Figure 3-10: Effect of AHL induction of the sRNA on GFP fluorescence. GL was cotransformed with SV2 and all cultures were induced with atc. Then, they were induced with different levels of AHL to activate the feedback loop. As the graphs show, the higher the AHL concentration, the lower the steady state GFP level.

3.6 Conclusions

In this chapter, we explored, designed and tested initial intermediate constructs. Through those constructs, we were able to verify that every system element is functioning the way that it is supposed to. Additionally, we were able to build the feedback loop and optimize it such that its gain is tunable. Next, we would begin the process of building the final system.

Chapter 4
Constructs to be built

The final system will be built on two plasmids. Initially, we had planned to put the sRNA on the high copy plasmid to increase the gain. However, given the test results and the fact that the sRNA works very well on pACYC, we decided to keep it there.

4.1 sRNA plasmid

The sRNA cassette consists of the following:

1. Plux promoter: the plux promoter is off by default. It is activated by luxR bound to AHL (cite).
2. sRNA barcode: the sRNA barcode is a sequence of 15 base pairs that binds to its complementary sequence on the target mRNA strand.
3. sRNA scaffold: the sRNA scaffold acts both as a transcriptional terminator and as a binding sequence for the Hfq protein.

Moreover, this construct was built on a medium copy plasmid, pACYC184, which has the p15A origin of replication and the chloramphenicol resistance gene. This construct is the same as the SV2 construct.
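Because the sRNA barcode has to base-pair with the 15 bp cognate sequence on the target mRNA, the pairing can be checked at the sequence-design stage. The following is a minimal sketch of such a check, written at the DNA level; the barcode shown is a made-up placeholder, not the sequence used in this work, and the function names are introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_cognate(barcode: str, target_site: str) -> bool:
    """True if the target site is the exact reverse complement of the barcode."""
    return (len(barcode) == len(target_site)
            and reverse_complement(barcode) == target_site.upper())


if __name__ == "__main__":
    barcode = "ATGCCGTTAGCATCG"           # hypothetical 15-nt barcode
    site = reverse_complement(barcode)    # cognate site that would pair with it
    print(barcode, site, is_cognate(barcode, site))
```

An exact-match test is the simplest criterion; it does not model partial pairing or secondary structure.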
4.2 GFP-RFP plasmid

This plasmid is designed to be as similar as possible to the main plasmid used in Jimenez et al.'s experiments. Building on previous constructs, we combined the GL construct with an inducible RFP expression cassette. We used the BBa-R04450 construct from the BioBricks directory. This construct contains the RFP gene under the control of a plac promoter. The final design is shown in figure 4-1. It was built by integrating the plac-RFP-terminator piece from R04450 into the GL construct and assembling using Gibson Assembly.

Figure 4-1: GFP-RFP plasmid

4.3 Testing and Results

4.3.1 Effects of RFP induction on GFP

In order to test this construct, we would transform MG1655PRO cells with it. MG1655PRO cells contain both the LacI and TetR repressors constitutively expressed from the genome. Therefore, initially both genes will be repressed. We can turn on GFP expression by adding atc to the media. Then, we can measure the steady state fluorescence level. Next, we can induce RFP by adding IPTG and observe the effect that this has on the GFP level. We would expect the GFP level to decrease.

The effect of adding the negative feedback loop

In order to add the feedback loop, we can co-transform the cells with both the GFP-RFP plasmid and the sRNA plasmid. We would add atc to express GFP. Then, we can test the cells under different conditions:

1. only atc added: This would turn on the GFP circuit.
2. atc and AHL added: This would turn on the feedback loop.
3. atc and IPTG added: This would turn on the RFP.
4. atc, IPTG and AHL added: This would turn on the feedback loop as well as the RFP circuit.
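One way to read out the four conditions above is to compare the relative drop in steady state GFP caused by RFP induction without the feedback loop (condition 1 versus 3) and with it (condition 2 versus 4). The sketch below shows that comparison; the dictionary keys, function names and fluorescence values are made-up placeholders for illustration, not measurements.

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative drop in steady state GFP/OD, in percent."""
    return 100.0 * (before - after) / before


def feedback_summary(gfp: dict) -> dict:
    """Compare the GFP drop on RFP induction with and without AHL."""
    return {
        "open_loop": percent_decrease(gfp["atc"], gfp["atc+IPTG"]),
        "closed_loop": percent_decrease(gfp["atc+AHL"], gfp["atc+IPTG+AHL"]),
    }


if __name__ == "__main__":
    # Placeholder steady state GFP/OD values, chosen only to exercise the code.
    example = {"atc": 1000.0, "atc+IPTG": 800.0,
               "atc+AHL": 400.0, "atc+IPTG+AHL": 372.0}
    print(feedback_summary(example))
```

The drop is normalized to each pre-induction level because turning on the feedback loop also lowers the GFP baseline, so absolute differences between the two cases are not directly comparable.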
Chapter 5
Resource Localization in Bacteria

5.1 Problem Statement and Hypothesis

Another theory that we tested was whether heterologous synthetic circuits share the same pool of resources with the genomic genes, or whether the synthetic circuits compete amongst each other for a different pool of resources that is spatially separated from those being used by the chromosome. For that purpose, we decided to integrate a constitutively expressed GFP into the genome of E. coli. GFP here was under the action of the ptet promoter. On the other hand, RFP would be constitutively produced from a plasmid along with gapA protein, which is induced via the action of a small molecule inducer. gapA is an abundant protein that is present in high quantities in E. coli and was used as a control protein in Jimenez et al. Once GFP is integrated into the genome, we would transform the cells with the RFP-GAPA plasmid. Then, we would induce gapA and record the changes in fluorescence of GFP and RFP. We would expect RFP fluorescence to decrease since gapA is now competing for the resources that RFP needs. What we would be really interested in is whether GFP levels would change. If GFP levels decrease, that would mean that heterologous circuits that are present on plasmids compete with the genomic genes for the same pool of resources. However, if GFP levels remain unchanged, this would support our initial hypothesis that external circuits and the genomic ones rely on separate pools of cellular resources.

5.2 Chromosomal Integration of GFP

General scheme

In order to ensure consistency with the results, we decided to integrate GFP into the chromosome of the DIAL J strain. This was the same strain that was used in Jimenez et al.'s experiments. The DIAL strains were made by integrating a repA expression cassette into the genome. This cassette also included the kanamycin resistance gene.
Therefore, the most straightforward way to integrate GFP into the genome was to rebuild the DIAL strain with GFP included in that cassette. This would preclude the need to find and optimize a new integration site within the genome.

DNA piece

In order to build the new DIAL strains with GFP integrated in the genome, we had to first prepare the linear piece of DNA that would be integrated into the genome. This linear piece would be cotransformed into the bacterial cells along with pKD46 following the protocol of Datsenko and Wanner. The linear piece of DNA is shown in figure 5-1. This is a modified version of the linear DNA piece that was used to create the DIAL strain. The main difference is the presence of the GFP gene under the control of a constitutive ptet promoter (ptet is inhibited by tetR, but the bacterial strains used did not contain the tetR gene). The GFP gene was located before the 3' homologous sequence. We added terminators on both sides of the gene to prevent read-through. Next, we decided to build this approximately 4000 base pair DNA segment on a plasmid using standard cloning techniques. This has several advantages. First, the DNA piece comes from several different origins and cloning would be the easiest method to stitch them together. Second, since this is a relatively long piece, it would be difficult to amplify it using traditional PCR techniques. In contrast, if it were present on a plasmid, we would be able to grow large quantities of the plasmid in cells and then digest the piece out of the plasmid using restriction enzymes. The DNA piece came from 2 different origins. The repA gene, the kanamycin resistance gene as well as the homologous segments were all PCR'ed directly from the genome of the J DIAL strain. On the other hand, the GFP gene as well as the terminators were assembled as a gBlock from IDTDNA.

Figure 5-1: Linear DNA piece
The overall assembly of the final plasmid consisted of 5 pieces as well as the pACYC vector (low copy, chloramphenicol resistance). The pieces were assembled together using Gibson Assembly and then plated on plates containing both kanamycin and chloramphenicol. This would ensure that only the cells containing the correct plasmid with both resistance genes were the ones growing. After many rounds of optimization, the cloning resulted in many colonies. We selected a few and sequence-verified the final product. Next, we retransformed the sequence-verified plasmid into bacterial cells and grew them overnight. Using the Zymo DNA prep kit, we isolated a large concentration of the plasmid. Next, we used the restriction enzymes BstAPI and ScaI to isolate the linear DNA piece that would be inserted into the genome. BstAPI is most efficient at 60 °C. Therefore, we incubated the digestion reaction at 37 °C for 5 hours first and then at 60 °C for 5 more hours. We set up two reaction tubes to isolate a larger amount of DNA. After the digestion, we ran the reactions on a 1 percent agarose gel to visualize the band. As seen in figure 5-2, the digestion reactions resulted in a clear 4 kb band, which is most likely the linear piece of DNA that we sought.

Figure 5-2: Gel electrophoresis results of digestion reaction. The plasmid containing the linear DNA piece was digested in two reactions and the digestion reactions were run on a gel. The ladder is a 1 kb ladder. The picture shows the 4 kb linear piece that we expected.

Genomic Integration and Verification

Once the linear DNA piece was ready, we prepared electrocompetent MC1061 cells which contained the pKD46 plasmid. These cells are the precursors of the DIAL strains and contained no antibiotic resistance genes. After the cells were ready, we electroporated them with the linear DNA piece and grew them at 37 °C for 1 hour before plating them on kanamycin containing plates.
The 37 °C incubation step ensures that the pKD46 plasmid is cured since it is temperature sensitive. This also means that linear DNA pieces are no longer stable. Therefore, the colonies that would grow on the kanamycin plates were the ones that had successfully integrated the piece (which contains the kanamycin resistance gene) into the genome. The electroporation experiment resulted in numerous colonies growing on the plate. In order to further verify that the colonies growing had successfully integrated the DNA piece into their genome and were not false positives, we first selected 10 colonies. We suspended a swab of each colony in water and then plated 10 uL of each suspension onto three different plates:

1. Chloramphenicol containing plates: If the colony grew on this plate, this would mean that it somehow got the original plasmid that was used to build the DNA piece. This would mean it is a false positive.
2. Ampicillin containing plates: pKD46 is ampicillin resistant. If the colonies grew on this plate, this would mean that they still have the pKD46 plasmid and could be false positives.
3. Kanamycin containing plates: All colonies should grow on this plate. If a colony did not grow on this plate, this would mean that the linear DNA piece was not degraded yet in the original colony.

The results are visually illustrated in figure 5-3. As the figure shows, no colonies grew on the ampicillin containing plates, which means that all the colonies had lost the pKD46 plasmid. Furthermore, all the colonies grew on the kanamycin containing plates. On the other hand, 4 of the colonies grew on the chloramphenicol containing plates. This means that those were the false positives and the ones that didn't grow are more likely to have correctly integrated the DNA piece.
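The three-plate screen described above is a small decision table, sketched below. The function name and the result labels are introduced here for illustration; the example input mirrors the reported outcome (10 colonies, all growing on kanamycin, none on ampicillin, 4 on chloramphenicol), although which individual colonies grew on chloramphenicol is an arbitrary assumption.

```python
def classify_colony(kan: bool, amp: bool, cm: bool) -> str:
    """Interpret growth on kanamycin, ampicillin and chloramphenicol plates."""
    if not kan:
        return "no stable kanamycin resistance"      # insert was not inherited
    if cm:
        return "false positive (source plasmid)"     # pACYC-based plasmid carried over
    if amp:
        return "possible false positive (pKD46)"     # helper plasmid not cured
    return "candidate integrant"


if __name__ == "__main__":
    # (kan, amp, cm) growth for 10 colonies; the first 4 are assumed to be
    # the ones that grew on chloramphenicol.
    colonies = [(True, False, True)] * 4 + [(True, False, False)] * 6
    calls = [classify_colony(*growth) for growth in colonies]
    print(calls.count("candidate integrant"), "candidate integrants out of", len(calls))
```

Colonies labelled as candidates still need the PCR and sequencing checks, since plate growth alone does not confirm the insertion locus.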
Figure 5-3: Electroporation colonies plated on different antibiotic-containing plates (panels: kanamycin, chloramphenicol, ampicillin). The colonies resulting from the electroporation were diluted in water and then a few uL were plated on plates containing different antibiotics. The correct ones would only grow on kanamycin-containing plates.

Finally, we used PCR to verify that the DNA piece had been integrated in the correct genomic locus. We designed two sets of primers. The first set consisted of two primers: one that annealed in the genome upstream of the 5' end of the locus and another that annealed on the inserted repA gene. Therefore, we would only obtain a product if the piece was integrated at that location. As a control, we also used this set of primers to amplify the genome of the J DIAL strain, which contains the same 5' end and should yield the same PCR product. The second set also consisted of two primers: one that anneals in the genome downstream of the 3' end of the insert and another that anneals on the GFP gene. Again, we would only expect a PCR product if the piece was correctly inserted in that locus. Finally, the size of the band is another indicator of whether a result is a false positive or a true result. The primers were designed to yield 800 bp bands. Figure 5-4 shows the result of the PCR after the reactions were run on a gel. Lane A contains the control reaction and lane B contains the 5' end reaction. We would expect the two bands to be the same size, and this is what we saw. Furthermore, we obtained a band of the correct size for the 3' end reaction. Finally, we purified the bands and sequenced them. The sequencing results were the final confirmation that the DNA piece was correctly inserted and that GFP was now in the genome.

Figure 5-4: Gel electrophoresis results of the PCR verification reactions. The genome of the cells was amplified by PCR with different primers to confirm integration. The PCR reactions were run on a gel. The ladder is a 1 kb ladder.
A: Control reaction with original DIAL strains. B: 5' end PCR reaction of the cells with GFP integrated into the genome. C: 3' end PCR reaction of the cells with GFP integrated into the genome.

5.3 gapA-RFP

Now that GFP was correctly inserted into the genome, we next sought to build the plasmid that would be transformed into those cells to test the resource separation hypothesis. Jimenez et al. had already built a construct with gapA under the control of plux and GFP constitutively expressed. We modified this by replacing GFP with RFP, as shown in Figure 5-5. We assembled the construct using Gibson assembly, where the gapA-GFP construct was split into 4 pieces and then stitched together with RFP. After the plasmid was prepared and sequence-verified, we transformed it into the DIAL cells with GFP in the genome.

Figure 5-5: gapA-RFP plasmid. The original construct is shown on the left and the modified version on the right.

5.3.1 Testing and Results

Methods

In testing those constructs, we followed the same protocol as Jimenez et al., which is the following (taken from Jimenez et al.):

Pre-starting cultures coming from isolated colonies in LB plates were grown in 24-well plates using 1 ml of M9 minimal medium supplemented with 0.4 percent glucose, 0.2 percent casamino acids, 1 mM thiamine, ampicillin (100 ug/ml) and kanamycin (50 ug/ml). Cells were incubated for 7-10 h at 30 °C and 100 rpm in an orbital shaker. When they reached the mid-log phase, they were diluted into 1 ml of the same M9 fresh media and incubated under the same conditions. 3-4 hours after dilution, during exponential growth, the cultures were induced with AHL (Cayman Chemical, Ann Arbor, MI) at a final concentration of 1, 2, 4, 10 and 1000 nM, and cells were grown for 8 additional hours until they reached the steady state of protein production, still in the exponential phase of growth.
For single-cell analysis, 5-10 ul aliquots were taken from each well every 1 h. The volume of the culture was kept constant by replenishing with the same volume of fresh medium. Right after removal, the aliquots were diluted in 100 ul of water and analyzed in a BD Accuri C6 flow cytometer (BD Biosciences, San Jose, CA). The instrument is equipped with blue (488 nm) and yellow-green lasers (552 nm) for GFP and RFP, respectively. Emission was detected using a 525/50 filter for GFP and a 610/20 for RFP. Flow rates were always kept below 1000 events/sec and 30,000 to 100,000 events were analyzed in each read. To track the behavior of the whole population present in each well, the same plate was monitored every hour for absorbance (600 nm) in a Synergy MX (Biotek, Winooski, VT) plate reader. Results were graphed in GraphPad Prism (GraphPad Software Inc.).

Results

The cells were cultured as described in the Methods section. They were grown overnight from single colonies on a plate, diluted in the morning, and induced when in exponential phase. We used flow cytometry to measure the fluorescence of the cells. The cells were induced with different levels of AHL. Figure 5-6 shows the results at steady state. Each bar represents a different induction level. From left to right, the concentration of AHL is 0 nM, 2 nM, 4 nM, 10 nM, 100 nM, and 1000 nM. As expected, as gapA production is induced by AHL, RFP expression decreases. Moreover, the more gapA is expressed (i.e. the higher the concentration of AHL), the lower the level of RFP. However, the level of GFP remained unchanged. This supports the hypothesis that cellular resources could be spatially separated. In other words, the plasmid genes appear to compete for a different pool of resources than the one available to the genomic genes.
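The qualitative claim above (RFP falls as AHL rises while GFP stays flat) can be phrased as a simple check on steady-state fluorescence values ordered by increasing AHL concentration. The following Python sketch is only an illustration: the function names and the 10 percent tolerance are assumptions introduced here, not criteria used in this thesis, and no measured values are included.

```python
from typing import Sequence


def is_decreasing(values: Sequence[float]) -> bool:
    """True if values never increase and the last is below the first."""
    never_rises = all(b <= a for a, b in zip(values, values[1:]))
    return never_rises and values[-1] < values[0]


def is_flat(values: Sequence[float], rel_tol: float = 0.1) -> bool:
    """True if every value lies within rel_tol (assumed 10%) of the mean."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= rel_tol * abs(mean) for v in values)


def supports_separation(rfp: Sequence[float], gfp: Sequence[float],
                        rel_tol: float = 0.1) -> bool:
    """Plasmid RFP drops with induction while genomic GFP stays flat."""
    return is_decreasing(rfp) and is_flat(gfp, rel_tol)
```

Both sequences are expected in order of increasing AHL concentration. A statistical test across replicates would be a more rigorous choice than a fixed tolerance; the threshold here only makes the logic of the argument explicit.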
Figure 5-6: GFP and RFP fluorescence of gapA-RFP plasmid with chromosomal GFP. Cells with chromosomal GFP were transformed with gapA-RFP and tested using different AHL concentrations. At steady state, GFP fluorescence remains unchanged while RFP decreases for higher concentrations of AHL.

5.4 TetR construct

When measuring the GFP fluorescence of the DIAL cells where GFP was integrated into the genome, we noticed that the fluorescence level was very low. While it was definitely above the threshold, we wanted to make sure that the signal was not just noise. Since the GFP gene was under the control of a ptet promoter, we decided to build a construct that expressed the TetR repressor. The construct contained the TetR gene under the control of a constitutive promoter, on a ColE2 minimal origin plasmid that was ampicillin and kanamycin resistant (this is the same plasmid used for the MBP1.0 and gapA-RFP constructs). The TetR gene was obtained from the iGEM Registry and the promoter was obtained from the list of constitutive promoters on the BioBricks website. This construct was built using Gibson assembly to assemble the TetR gene with a terminator. The promoter was attached as an overhang on the primer used to amplify TetR. Once the construct was built and sequence-verified, we transformed it into the J DIAL cells that had GFP on the genome.

5.4.1 Testing and Results

We tested the cells under different conditions. We expected that the addition of TetR would cause a decrease in the GFP fluorescence to the basal level, and that the subsequent addition of atc would cause an increase in that level. Since the TetR protein in our setup was in large excess over the ptet promoter, we had to add large concentrations of atc to see an effect. The conditions under which we tested the cells are the following:

1. no atc
2. 500nM atc
3. 1000nM atc
4. 150nM atc added every hour
5. 300nM atc added every hour
6.
600nM atc added every hour

The results are shown in Figure 5-7. With no atc, the level of GFP is the same as the background level and does not change. Moreover, atc causes an increase in the level of GFP. The increase is most evident when 600nM of atc is added every hour. This is probably due to the high concentration of TetR and very low concentration of the ptet promoter (one copy per cell). In summary, these results show that the fluorescence we were measuring previously results from GFP expression from the ptet promoter and is not just noise. The results also show that the range of fluorescence is still high enough to detect changes in that level.

Figure 5-7: Results of atc inducing tetR-inhibited chromosomally integrated GFP (plasmid pJY002; legend: no aTc, 500nM aTc, 1000nM aTc, 150nM/hr, 300nM/hr, 600nM/hr). Cells with chromosomal GFP were transformed with the plasmid expressing tetR and induced with different levels of atc. GFP fluorescence increases with the addition of atc and that increase is highest when atc is added at a rate of 600nM/hr.

5.5 Conclusions

In this chapter, we explored the theory of resource segregation in cells. We integrated a GFP circuit on the genome of bacteria and transformed them with a plasmid that contains two circuits: one that is constitutive and one that is inducible. When the inducible circuit on the plasmid is induced, we see a significant decrease in the level of the other circuit on that plasmid. In contrast, the level of the genomic GFP circuit remains unchanged. This supports our initial hypothesis.

Chapter 6

Conclusions and Future Directions

In this thesis, we addressed two problems that pertain to resource allocation in synthetic biological systems. First, we offered a solution to the problem of ribosome limitations by adding a high-gain negative feedback loop to one of the circuits.
We started with theoretical modeling of the system at hand and, based on the simulation results, we started a physical implementation of the system. So far, the feedback loop has been built and characterized. The full system still needs to be implemented and tested according to the protocol followed by Jimenez et al. Ideally, we would expect the GFP to decrease when the RFP circuit is induced, but that decrease should be diminished when the feedback loop is turned on. What is important about the system designed here is the several degrees of control that one has over many of its parameters. This is crucial for testing and characterizing the effects of feedback. For instance, the feedback strength can be tuned by varying the concentration of AHL added to the system. As we show in this thesis, there is a gradient of feedback strengths that correlates very well with the concentration of AHL added. Furthermore, the GFP steady state level can be tuned by the amount of atc added. As expected, the steady state of GFP decreases when the feedback loop is added. This can be easily corrected by adding more atc to the media and decreasing the atc concentration for the system without feedback so that the steady states are comparable.

Moreover, we also studied the theory of resource localization in bacteria. Our results suggest that the resources in bacteria are spatially separated and that plasmids use a different pool than the one available to genomic genes. These results shed light on some crucial problems that synthetic biology faces. The cell is a closed system with limited capacities. Therefore, synthetic biologists should be aware of these issues when designing their circuits.

More broadly, this thesis is yet another example of how efforts in synthetic biology are revolutionizing biological design to make it similar to other engineering disciplines. In engineering, one starts with a hypothesis.
Then, they apply the current knowledge to probe and design the system they would like to build. Finally, they choose the parts that will go into their system from a set of well-established parts. Similarly, we started out with a hypothesis that was based on established theories in other engineering disciplines (feedback in control theory), and applied that to a biological system. Then, we modeled the system and, based on the results of the model, we were able to define what the ideal system parameters would be. Finally, we were able to build a system that has those parameters by simply choosing characterized parts from a registry. Hopefully, in the upcoming years, this will become easier as more parts are characterized and solutions to hurdles such as resource limitations become established.

Bibliography

[1] Erel Levine and Terence Hwa. Small RNAs establish gene expression thresholds. Current Opinion in Microbiology, 11(6):574-579, December 2008.
[2] Kirill A Datsenko and Barry L Wanner. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proceedings of the National Academy of Sciences, 97(12):6640-6645, 2000.
[3] Jesse Stricker, Scott Cookson, Matthew R Bennett, William H Mather, Lev S Tsimring, and Jeff Hasty. A fast, robust and tunable synthetic gene oscillator. Nature, 456(7221):516-U39, November 2008.
[4] A Levskaya, A A Chevalier, J J Tabor, Z B Simpson, L A Lavery, M Levy, E A Davidson, A Scouras, A D Ellington, E M Marcotte, and C A Voigt. Engineering Escherichia coli to see light - These smart bacteria 'photograph' a light pattern as a high-definition chemical image. Nature, 438(7067):441-442, November 2005.
[5] J C Anderson, E J Clarke, A P Arkin, and C A Voigt. Environmentally controlled invasion of cancer cells by engineered bacteria. Journal of Molecular Biology, 355(4):619-627, January 2006.
[6] Christina D Smolke. Building outside of the box: iGEM and the BioBricks Foundation.
Nature Biotechnology, 27(12):1099-1102, December 2009.
[7] Jay D Keasling. Synthetic biology for synthetic chemistry. ACS Chemical Biology, 3(1):64-76, January 2008.
[8] H Kobayashi, M Kaern, M Araki, K Chung, T S Gardner, C R Cantor, and J J Collins. Programmable cells: Interfacing natural and engineered gene networks. Proceedings of the National Academy of Sciences of the United States of America, 101(22):8414-8419, June 2004.
[9] Piro Siuti, John Yazbek, and Timothy K Lu. Synthetic circuits integrating logic and memory in living cells. Nature Biotechnology, 31(5):448+, May 2013.
[10] T S Gardner, C R Cantor, and J J Collins. Construction of a genetic toggle switch in Escherichia coli. Nature, 403(6767):339-342, January 2000.
[11] M B Elowitz and S Leibler. A synthetic oscillatory network of transcriptional regulators. Nature, 403(6767):335-338, January 2000.
[12] Dan Peer, Jeffrey M Karp, SeungPyo Hong, Omid C Farokhzad, Rimona Margalit, and Robert Langer. Nanocarriers as an emerging platform for cancer therapy. Nature Nanotechnology, 2(12):751-760, December 2007.
[13] Miles Miller, Marc Hafner, Eduardo Sontag, Noah Davidsohn, Sairam Subramanian, Priscilla E M Purnick, Douglas Lauffenburger, and Ron Weiss. Modular Design of Artificial Tissue Homeostasis: Robust Control through Synthetic Cellular Heterogeneity. PLOS Computational Biology, 8(7), July 2012.
[14] Edward B Steager, Denise Wong, Deepak Mishra, Ron Weiss, and Vijay Kumar. Sensors for micro bio robots via synthetic biology. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages 3783-3788. IEEE, 2014.
[15] David F Savage, Jeffrey Way, and Pamela A Silver. Defossiling fuel: how synthetic biology can transform biofuel production. ACS Chemical Biology, 3(1):13-16, 2008.
[16] Kathryn E R Davis, Shayne J Joseph, and Peter H Janssen. Effects of growth medium, inoculum size, and incubation time on culturability and isolation of soil bacteria.
Applied and Environmental Microbiology, 71(2):826-834, 2005.
[17] Kaarina Sivonen. Effects of light, temperature, nitrate, orthophosphate, and bacteria on growth of and hepatotoxin production by Oscillatoria agardhii strains. Applied and Environmental Microbiology, 56(9):2658-2666, 1990.
[18] Melissa B Miller and Bonnie L Bassler. Quorum sensing in bacteria. Annual Reviews in Microbiology, 55(1):165-199, 2001.
[19] Stefano Cardinale and Adam Paul Arkin. Contextualizing context for synthetic biology - identifying causes of failure of synthetic biological systems. Biotechnology Journal, 7(7, SI):856-866, July 2012.
[20] Chunbo Lou, Brynne Stanton, Ying-Ja Chen, Brian Munsky, and Christopher A Voigt. Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nature Biotechnology, 30(11):1137-1142, 2012.
[21] Robert Sidney Cox, Michael G Surette, and Michael B Elowitz. Programming gene expression with combinatorial promoters. Molecular Systems Biology, 3(1), 2007.
[22] Marco Boorsma, Lars Nieba, Daniel Koller, Martin F Bachmann, James E Bailey, and Wolfgang A Renner. A temperature-regulated replicon-based DNA expression system. Nature Biotechnology, 18(4):429-432, 2000.
[23] Domitilla Del Vecchio, Alexander J Ninfa, and Eduardo D Sontag. Modular cell biology: retroactivity and insulation. Molecular Systems Biology, 4(1), 2008.
[24] Shridhar Jayanthi and Domitilla Del Vecchio. Tuning genetic clocks employing DNA binding sites. PLoS ONE, 7(7):e41019, 2012.
[25] Domitilla Del Vecchio and Richard M Murray. Biomolecular Feedback Systems. Princeton University Press, 2014.
[26] Yili Qian and Domitilla Del Vecchio. Effective interaction graphs arising from resource limitations in gene networks.
[27] Meghdad Hajimorad, Paul R Gray, and Jay D Keasling. A framework and model system to investigate linear system behavior in Escherichia coli. Journal of Biological Engineering, 5(1), 2011.
[28] Jose Jimenez, Andras Gyorgy, John Yazbek, Hsin-Ho Huang, Hattie Chung, Ron Weiss, and Domitilla Del Vecchio. Isocost lines describe the cellular economy of genetic circuits. Biophysical Journal, (In review), 2015.
[29] Peter Stougaard, Soren Molin, and Kurt Nordström. RNAs involved in copy-number control and incompatibility of plasmid R1. Proceedings of the National Academy of Sciences, 78(10):6008-6012, 1981.
[30] Robert W Simons and Nancy Kleckner. Translational control of IS10 transposition. Cell, 34(2):683-691, 1983.
[31] Paula J Wilderman, Nathaniel A Sowa, David J FitzGerald, Peter C FitzGerald, Susan Gottesman, Urs A Ochsner, and Michael L Vasil. Identification of tandem duplicate regulatory small RNAs in Pseudomonas aeruginosa involved in iron homeostasis. Proceedings of the National Academy of Sciences of the United States of America, 101(26):9792-9797, 2004.
[32] Gisela Storz, Jörg Vogel, and Karen M Wassarman. Regulation by small RNAs in bacteria: expanding frontiers. Molecular Cell, 43(6):880-891, 2011.
[33] Karen Montzka Wassarman, Aixia Zhang, and Gisela Storz. Small RNAs in Escherichia coli. Trends in Microbiology, 7(1):37-45, 1999.
[34] Philippe Horvath and Rodolphe Barrangou. CRISPR/Cas, the Immune System of Bacteria and Archaea. Science, 327(5962):167-170, January 2010.
[35] Eric Masse, Carin K Vanderpool, and Susan Gottesman. Effect of RyhB small RNA on global iron use in Escherichia coli. Journal of Bacteriology, 187(20):6962-6971, 2005.
[36] Richard G Brennan and Todd M Link. Hfq structure, function and ligand binding. Current Opinion in Microbiology, 10(2):125-133, April 2007.
[37] Erel Levine, Zhongge Zhang, Thomas Kuhlman, and Terence Hwa. Quantitative characteristics of gene regulation by small RNA. PLOS Biology, 5(9):1998-2010, September 2007.
[38] Seung Min Yoo, Dokyun Na, and Sang Yup Lee. Design and use of synthetic regulatory small RNAs to control gene expression in Escherichia coli. Nature Protocols, 8(9):1694-1707, 2013.
[39] Joshua T Kittleson, Sherine Cheung, and J Christopher Anderson. Rapid optimization of gene dosage in E. coli using DIAL strains. J Biol Eng, 5(10), 2011.
[40] Thomas H Segall-Shapiro, Adam J Meyer, Andrew D Ellington, Eduardo D Sontag, and Christopher A Voigt. A "resource allocator" for transcription based on a highly fragmented T7 RNA polymerase. Molecular Systems Biology, 10(7), 2014.
[41] Ron Milo, Paul Jorgensen, Uri Moran, Griffin Weber, and Michael Springer. BioNumbers - the database of key numbers in molecular and cell biology. Nucleic Acids Research, 38(suppl 1):D750-D753, 2010.
[42] Rolf Lutz and Hermann Bujard. Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Research, 25(6):1203-1210, 1997.