CDB 2012 H'out Lecture 1 copy - Martinez-Arias Lab

Gene expression and cell decisions

Alfonso Martinez Arias
Department of Genetics, University of Cambridge
Email: ama11@hermes.cam.ac.uk

Lecture 1 The lac Operon: essential features of a genetic regulatory circuit for an inductive system
Lecture 2 Lambda: a genetic switch, the circuitry of a decision and the programme it regulates
Lecture 3 Gene expression in eukaryotes: S. cerevisiae Gal4 as a guide for universal principles of transcriptional regulation

References

The topics that we shall cover in these lectures are classic topics of molecular biology, and basically any textbook on Genetics will tell you a version of each of them. In addition, today you can find much on the web (often the issue is more a matter of knowing how to search for what you want to know than of having a fixed source for it), so you should roam and look. Notwithstanding this, there are two books which you will find helpful and which will give you a broad perspective on the subjects:

Ptashne, M. (2004) A genetic switch. 3rd edition. Blackwell. (in particular chapters 1-4).
Ptashne, M. and Gann, A. (2002) Genes and signals. Cold Spring Harbor Press (particularly chapters 1 and 2).

Your libraries will have them. Here you also have a small number of additional references which will help you broaden the material of the lectures. They are helpful, but by no means compulsory, reading:

• Dodd, B., Shearwin, K. and Egan, JB. (2005) Revisited gene regulation in bacteriophage lambda. Curr Op in Genetics and Development 15, 145-152 (a bit dense, but once you have the hang of λ it will be interesting).
• Driever, W., Ma, J., Nusslein Volhard, C. and Ptashne, M. (1989) Rescue of bicoid mutant Drosophila embryos by Bicoid fusion proteins containing heterologous activating sequences. Nature 342, 149-152 (this is an important set of experiments, though the paper has more information than you need. Still, it is the reference).
• Hahn, S. and Young, E. (2011) Transcriptional regulation in S. cerevisiae: transcription factor regulation and function, mechanisms of initiation and roles of activators and coactivators. In Yeastbook. Genetics 189, 705-736 (some of its sections are useful, but it is more a resource than a central reference; it is clear though).
• Murray, N. and Gann, A. What has phage lambda ever done for us? Curr Biol. 17, R305-R312.
• Oehler, S. and Muller-Hill, B. (2010) High local concentration, a fundamental strategy of life. J. Mol Biol 395, 242-253 (an interesting read about some fundamental principles of chemistry applied to the Lac operon, which will give you some references for the lectures. The good thing is that this is about principles).
• Ptashne, M. (2011) Principles of a switch. Nature Chem. Biol. 7, 484-487 (like all Ptashne’s, a variation on his favourite themes but also, as all Ptashne’s, VERY CLEAR).
• Ptashne, M. (2005) Regulation of transcription. From Lambda to eukaryotes. Trends in Biochem. Sci. 30, 275-278 (as above).
• Small, S., Blair, A. and Levine, M. (1992) Regulation of even-skipped stripe 2 in the Drosophila embryo. EMBO J 11, 4047-4057 (this is a classic and worth a read).
• Traven, A., Jelicic, B. and Sopta, M. (2006) Yeast Gal4: a transcriptional paradigm revisited. EMBO Reports 7, 496-499 (a useful account of Gal4 activity; a bit more than you need, but the first half is certainly useful).

Lecture 1 The lac Operon: essential features of a genetic regulatory circuit for an inductive system

One of the interesting issues in Biology is the diversity of organisms and, within each organism, the diversity of the cells that make it up. As we know, the differences between cells are due to their constituents (a skin cell is different from a muscle cell because of the way each is built which, in some sense, is fit for purpose), but the question becomes: how are these differences achieved? The issue can be addressed through a small number of related questions:

1. What is the molecular basis for these differences?
2. How can we find out about it?
3. How many ways are there to be different?

If you did not know anything about the mechanisms that underlie the working of living systems, but knew about their constituent elements, and had to take a guess about where the molecular basis for those differences lies (we are reductionistic), you would point the finger at the DNA.
After all, if in these days of massive DNA sequencing you looked at the sequences of different organisms, you would find that they are different, whereas organisms that look similar have similar DNA and, in the case of twins, identical DNA. However, this is not the explanation, because within an organism there are many different kinds of cells and the DNA in these cells is the same. DNA is the template for RNA, and if we use another very useful technique that is popular these days and which allows us to look at gene expression (profiling mRNA populations by microarray analysis or related techniques), we find that different cells express different bits of DNA or, which amounts to the same thing, that different cell types are associated with the expression of different parts of the genome. Thus, we could infer that the differences between cells arise because different cells express different genes, and that this has something to do with the differences we observe between organisms.

This conclusion is underpinned by some interesting experiments which, importantly, preceded these technologies. In the 1960s the question of how cells become different during development hovered over Biology. When the structure of DNA was found, a sensible explanation would have been that, as we have mentioned, different cells have different DNA, that somehow DNA would be lost in a cell specific manner. This is not as far fetched as it sounds, but it needed testing (Biology is not just a collection of facts but of questions that can be answered experimentally), and John Gurdon did the experiment. Maybe differentiated cells had lost DNA they did not need. If this were the case, the nucleus from a differentiated cell could not support the development of a whole organism. The test: take the nucleus of a differentiated cell, place it in an enucleated oocyte and see what happens when you activate development. These experiments were done in frogs and they succeeded with different types of cells, thus making the point that the DNA of an organism is maintained intact during development and that, therefore, the differences between cells must be due to differential expression of the DNA.

Conclusion: understanding how different bits of DNA come to be expressed in different cells is central to understanding how cells become different and therefore how organisms are put together.

How do we find out: Genetics and Biochemistry

Whatever way cells use to choose which bit of DNA to transcribe, it is likely to involve a mechanism that is reliable and reproducible. This mechanism and its variations are the core of these lectures. But before we get into the real subject matter, one issue of method. You have probably seen all this before, but there is no harm in bringing it up again, perhaps from a different perspective: understanding biological processes always requires a combination of Genetics and Biochemistry, and if one can work with both at the same time, so much the better.

Genetics is much more than a narrative about genes. Genetics is a language that allows us to formulate questions in a precise manner, and it has a set of rules that allows us to explore those questions and get an answer. In many ways Genetics is to Biology what Maths is to Physics.
Unfortunately Genetics alone is not enough, and Biochemistry, the chemical analysis of the processes revealed by Genetics, is what brings the mechanistic element into the picture. Genetics gives you function, Biochemistry gives you mechanism. The combination is powerful.

Let us see this with an abstract example. Suppose we need to figure out a situation in which a cell divides and one of the daughters changes its state from A to B, i.e. we go from A to A and B. A way to analyze this process is to look for mutants in which it goes awry: for example, mutants in which A does not divide (this does not teach us very much), or in which A gives rise to two A cells or to two B cells (this is more informative). Once we have these mutations, we can organize them into complementation groups, which will tell us whether they map to one gene or to more than one, and analyze whether they are dominant or recessive. Then, if they fall into more than one group, we can use epistasis analysis to explore the way these groups are organized (remember your basic genetics courses).

In terms of the processes that you have been seeing, imagine that you have a protein that binds DNA and, somehow, does something to transcription/gene expression. Genetics allows you to unravel “the function of the gene”. If you mutate the protein or the DNA, the outcome is telling: No expression in the mutant means that Protein + DNA promotes gene expression.
Expression in the mutant means that Protein + DNA promotes repression of gene expression.

Now, in order to understand Genetics, let us build a glossary of terms, beginning with the classes of mutant (✸) in a diploid organism and how to interpret them:

If: ✸/✸ < +/+ (meaning that the function of ✸ is less than that of the wild type (+)) this is a loss of function mutant.
If: ✸/✸ > +/+ this is a gain of function mutant.
If: ✸/+ ≠ +/+ this is a dominant mutation.
If: ✸/✸ < ✸/+ = +/+ this is a recessive mutation.

Now, if a mutation acts only on the chromosome on which it resides, it is said to act ‘in cis’, whereas if it can act on a different chromosome, it is said to act ‘in trans’. There are also mutations which only exhibit a phenotype at a particular temperature; these are said to be temperature sensitive mutations and are an example of a conditional mutation (we shall see examples of how these are used in the lecture on lambda). Genetics is the best way to begin to tease apart biological systems, but then one needs the Biochemistry to interpret the mechanisms suggested by the Genetics.

Now let us get into the main topic: the molecular mechanisms that allow cells to decide which genes to express and when. It should not be a surprise to you that these decisions are not restricted to eukaryotes and that bacteria and viruses also have to wrestle with them. It is very interesting and exciting to think about how a cell decides to be leg or arm, or nose or ear, but perhaps by learning about simpler decisions which, in the abstract, are similar, we can learn something. Also, it is important to be reductionistic, and the notion of a decision can be generalized. Furthermore, analyzing these processes in prokaryotes provides insights which, as we shall see, guide our understanding of the processes in eukaryotes.

The lac Operon

In the 1940s, as he was fighting in the French Resistance, J. Monod was investigating the biology of E. coli growth, trying to explain the simple observation that when presented with two sources of carbon, Glucose and Lactose, E. coli would metabolize Glucose first and then, after a lag, move on to use Lactose.
Genetics is abstract and has symbols. Here I just want us to focus on the essentials, and I want you to realize, because this will serve you for the rest of your lives as biologists, that understanding how to use genetic reasoning will save you a lot of time and give you a lot of insight when trying to understand a process.

J. Monod and his colleague F. Jacob began to isolate mutations that would affect the growth characteristics of E. coli in these carbon sources. From the analysis of these mutations they developed a model which Biochemistry proved to be right and which serves as a basis for the way we think about gene regulation. You will see, as we develop it, echoes from Steve Jackson’s lectures, but you will also see that we begin to build the foundations for the operation of a regulatory system.

E. coli can use glucose or lactose as a source of carbon and energy. As lactose is a disaccharide and needs to be hydrolyzed, E. coli prefers to use glucose. The enzyme that deals with Lactose is β-galactosidase. So, how does E. coli make the choice? What regulates the use of lactose? How is it kept off while there is glucose around? How is it turned on?

Jacob and Monod focused on mutations that would allow or forbid the cells to grow on lactose, the second carbon source. Lactose requires an enzyme, β-galactosidase, which is encoded in the lac Operon (a set of genes in a chain that is required to deal with lactose), and using simple assays for β-galactosidase one can look for mutants that either cannot make β-galactosidase when they should or that make β-galactosidase when they should not. These mutants provide insights into the molecular mechanisms that allow E. coli to make this choice.

Lac genetics

Genetic analysis of these mutants (complementation, dominance, cis/trans) led Monod and Jacob to an interesting set of hypotheses:

(i) The ability to grow (or not) on lactose is associated with changes in gene expression (this is important because people, at the time, could only think in terms of proteins and considered that, when changing from glucose to lactose, the cell simply changed the proteins it used rather than the proteins it made).
(ii) Most mutations identify genes encoding enzymes for lactose metabolism (β-galactosidase (Z), its permease (Y) and a transacetylase (A)).
(iii) Some mutations identify genes encoding proteins that regulate whether and when cells make β-galactosidase (I).
(iv) Some mutations can only be interpreted as regulating or controlling the ability of a cell to make β-galactosidase (O, P).

Let us look at these mutations in turn.

Mutations in Z, Y and A showed some characteristics which led to the notion that the genes for Z, Y and A are organized in a line, an Operon. They are transcribed as part of one and the same mRNA and translated in series. Operons are very common in prokaryotes and less so in eukaryotes. All these mutations (in Z, Y and A) are recessive and act in cis, i.e. in a diploid (which in E. coli can be made, ingeniously, with F factors, see D. Summers’ lectures, or nowadays with plasmids) Z+/Z- makes β-galactosidase. Although some mutations in Z also affect Y and A (and some mutations in Y affect A), in general mutations in any of the three genes (complementation groups) only affect that gene; they identify standard genes.

Mutations in I are a bit more interesting because, although they do not map within ZYA, they affect their expression: ZYA are intact and yet a mutation in I leads to either expression of the three or expression of none. The i mutations that lead to expression of β-galactosidase are recessive and trans acting, i.e. i+/i- is wild type and has normal regulation. The i mutations that repress expression of ZYA are dominant and trans acting.

A third class of mutations is the most interesting one, and it is a tribute to Jacob and Monod that they figured out how these mutations work and what they tell us about genes before they knew about their molecular basis. Whereas I, Z, Y and A code for proteins, a set of mutations called O affect the activity of I and the expression of ZYA, but there is no protein associated with them. Furthermore, these mutations act in cis. As it turns out, O identifies a region in the DNA to which I binds and which determines whether ZYA are expressed. There are two main kinds of O mutants:

• Oc (c for constitutive): these mutants lead to the expression of ZYA, even in the absence of lactose. The mutations are dominant and cis acting, i.e. oc z+/ o+ z- leads to constitutive expression of β-galactosidase whereas oc z-/ o+ z+ leads to regulated expression.
• Os (s for superrepressor): in these mutants, there is no expression of ZYA. Using the same tests, they are cis acting and dominant.
P turns out to be the site of binding and initial activity of RNA polymerase. Notice the relative location of P and O, i.e. a protein bound at O can block the activity of RNA polymerase.

The Molecular Biology (biochemistry)

Biochemistry allows an interpretation of the genetic experiments. Z, Y and A are the structural genes, which encode enzymes. I encodes a protein that acts as a repressor of the expression of ZYA. It binds to O, which is a site in the DNA that overlaps the promoter (P), the RNA polymerase binding site; when I is bound, it does not allow the polymerase to initiate the expression of ZYA. The function of I is to block the progress of RNA polymerase and thus inhibit gene expression. An Oc mutant results in constitutive ZYA expression because the operator cannot bind the repressor. In Os the repressor binds so tightly that the operon is never induced. In the dominant I mutants that prevent expression, the repressor binds and can never be released.

The interactions between the proteins and the DNA rely, as you know, on precise structural details, and the binding of the repressor I to O depends on whether or not allolactose, a derivative of lactose, is present. In the absence of allolactose, I binds and represses expression, but when there is lactose around, allolactose is formed, binds I and induces a structural change that precludes its binding to DNA and thus allows the polymerase to transcribe the ZYA (lac) operon.

The choice: glucose or lactose?

OK, we understand how the system functions when there is lactose, and we can explain how the cell avoids wasting energy transcribing β-galactosidase when there is no lactose around. But how does it make the choice when it is presented with a mixture of glucose and lactose? How does it ‘know’ to use glucose first?
Here, again, Genetics comes to the rescue because, in addition to the mutants in I, O, Z, Y and A, there is another class of mutations: if we look at the mutations that alter the expression of β-galactosidase, we find another set of mutants which affect expression even in the presence of Lactose and which do not map to any of the genes we know about. They map to a gene encoding CAP (Catabolite Activator Protein). If this protein is mutated, expression of ZYA is low in the presence of lactose, even if I is mutated. CAP binds cAMP, the amounts of which are regulated by glucose metabolism, and understanding its function provides an explanation for the switch between glucose and lactose when both are present.

CAP binds near P and helps RNA polymerase to transcribe ZYA, but it does so only in the presence of cAMP. As cAMP levels are lowered by glucose, the activity of CAP goes down, and this explains why expression of lacZYA is inefficient or absent in the presence of glucose. When glucose is low, cAMP rises and binds to CAP, enhancing its ability to form dimers, bind DNA and increase the rate of transcription of ZYA.
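The regulatory logic described so far can be summarised as a small truth table. The following is a minimal sketch, not a quantitative model: it assumes that each component either works or does not, it ignores the leakiness of repression, and the names `lac_expression`, `i_ok`, `o_ok` and `cap_ok` are hypothetical labels chosen for this illustration. The value "low" stands for the inefficient expression seen when CAP is not active.

```python
from itertools import product


def lac_expression(glucose, lactose, i_ok=True, o_ok=True, cap_ok=True):
    """Qualitative expression of ZYA: 'off', 'low' or 'high'.

    i_ok   -- the repressor (I) is able to bind the operator
    o_ok   -- the operator (O) can be bound by the repressor (False for Oc)
    cap_ok -- CAP is functional
    """
    # Allolactose, formed when lactose is present, releases the repressor.
    repressor_bound = i_ok and o_ok and not lactose
    # Glucose lowers cAMP, and CAP only helps the polymerase when cAMP is bound.
    cap_active = cap_ok and not glucose
    if repressor_bound:
        return "off"
    return "high" if cap_active else "low"


if __name__ == "__main__":
    # Wild type in the four combinations of carbon sources.
    for glucose, lactose in product([True, False], repeat=2):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_expression(glucose, lactose)}")
```

Setting `i_ok=False` or `o_ok=False` reproduces the constitutive mutants (expression without lactose), and `cap_ok=False` reproduces the low expression of CAP mutants even when lactose is present.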
Now we can understand what happens at the molecular level in the different conditions that set the problem in the first place (try for yourself to write down the molecular state of the system when there is glucose only, lactose only or both). Write a table with the different genes and their activities in the different carbon sources; you can also play with mutants.

Some (important) molecular details

The study of the Lac Operon provides some insights into fundamental aspects of the molecular basis of transcriptional regulation and the way it is used by cells to make decisions about their physiology or their fate.

Specificity of transcriptional regulation: the recognition helix of the repressor (there is just one such α-helix per repressor monomer) can interact with about 5 bp of DNA, but this is not enough recognition to make the sequence unique, and therefore to allow a unique code for regulation, nor to make the binding strong in molecular terms (do not forget that we are looking at a chemical system in which the interaction between a protein and DNA is dynamic). The E. coli genome is about 5 × 10⁶ bp. If one wants a unique sequence (so that unequivocal specificity is achieved) one needs at least 11 bp: 4¹¹, the number of different 11 bp sequences, is about 4.2 × 10⁶, which is comparable to the size of the genome. So, rather than changing the structure of the protein in an implausible manner (given the structure of the DNA, an interaction surface covering more than 5 bp might be complicated), what evolution seems to have achieved is a duplication of the site, i.e. the repressor binds as a dimer. As we shall see, this might be a general principle: palindromic sequences or direct repeats.

Cooperativity: when trying to establish a state, one wants to create conditionality, i.e. for the Lac Operon to operate optimally you do not want it to use too many repressor molecules, because then it might not be possible to counteract their effect, nor too few, because then they could never repress.
The Lac Operon is a biochemical system, which means that the repressor is binding and unbinding all the time and that it still works, i.e. it represses. Thus, given the Kd of the interaction between the repressor and the DNA, the repression will be determined, in principle, by the number of molecules of I. It so happens that, on average, there are five (5!) molecules of the repressor per cell, which is about 2 dimers. It is here that the effect of the dimer becomes important: when one molecule binds, it can unbind easily, but when another molecule binds at the nearby site, the two hold each other in place for longer. This is called cooperativity, and it highlights the significance of the dimer and the palindromic site: they serve to increase the stability of the interaction between the repressor and the DNA. The advantages of cooperative binding and ‘regulated recruitment’ are obvious: they allow fewer molecules to be used for a regulatory event and therefore contribute to the specificity of the event and the sensitivity of the system.

Interestingly, in addition to the original Operator (O, also called O1), there are two additional Operators, O2 and O3, with lower affinity for the repressor, located on either side of O1. The repressor binds to these sites too and, as they are far from the original O, they are held in place through DNA looping and tetramers of I. These sequences play a role in the repression because they increase cooperativity and, with the looping, fix the DNA and the repression. The repressor tends to form tetramers in vitro. Thus, if there are 5 molecules of the repressor per cell and the tetramer is the working unit of the lac operon, there is about one functional molecule of repressor per cell. There are consequences of this.

Regulated recruitment: another feature of the interactions between DNA and the proteins that regulate transcription, a by-product of cooperativity, is what we can call “regulated recruitment”, which is exemplified by the effects of CAP. The interaction of CAP with RNA polymerase enhances the affinity of the polymerase for the DNA and increases its function, i.e. it is cooperative, but between two proteins, and in some manner CAP promotes the recruitment of RNA polymerase to the site of transcription initiation. The interaction is mediated by a specific sequence in CAP.

Two small matters at the end: the molecular basis of physiological memory

And how does it all start? Lactose needs the Permease to get into the cell, and the permease is encoded by the Y gene which, in the absence of Lactose, is, like the rest of the Operon, repressed by I. How does the Lac Operon then get derepressed in the first place? The idea is to remember chemistry and that there is about one molecule of functional repressor per cell: as the repressor binds and unbinds, with just one molecule per cell, when it is off the DNA nothing can take its place. This leads to a small leakage of gene expression, which allows a few molecules of the permease to be made and thereby brings in some Lactose, which will become important when the glucose runs out and the CAP dimer boosts the expression of ZYA. The system is poised, or primed, for activity. This is important because in the presence of both glucose and lactose there is lactose (!), and therefore there is every chance for the Lac Operon to leak, and it probably does.
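Two of the back-of-the-envelope arguments above (how long a site must be to be unique, and why one repressor molecule per cell leaves room for a leak) can be checked with a few lines of Python. This is only a sketch under stated assumptions: a genome of random sequence, an E. coli cell volume of about 1 femtolitre, simple binding to a single site, and a purely illustrative Kd. The function names are hypothetical.

```python
AVOGADRO = 6.022e23      # molecules per mole
CELL_VOLUME_L = 1e-15    # assumed E. coli cell volume, about 1 femtolitre


def expected_chance_sites(genome_bp, site_bp):
    """Expected number of chance occurrences of one given site in a random genome."""
    return genome_bp / 4 ** site_bp


def molar_concentration(molecules, volume_l=CELL_VOLUME_L):
    """Concentration (mol/L) of a given number of molecules in one cell."""
    return molecules / (AVOGADRO * volume_l)


def operator_free_fraction(molecules, kd_molar):
    """Fraction of time the operator is unoccupied, for simple one-site binding."""
    conc = molar_concentration(molecules)
    return kd_molar / (kd_molar + conc)


if __name__ == "__main__":
    genome = 5e6  # approximate size of the E. coli genome in bp
    for n in (5, 11):
        print(n, "bp:", 4 ** n, "possible sequences,",
              round(expected_chance_sites(genome, n), 1), "expected chance sites")

    kd = 1e-11  # illustrative value only, not a measured constant
    for molecules in (1, 5):
        print(molecules, "molecule(s) per cell: operator free",
              f"{operator_free_fraction(molecules, kd):.4f}", "of the time")
```

Whatever the exact Kd, the free fraction is never zero in this picture, which is the chemical origin of the small leakage of expression described above.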
Memory, feedbacks and circuits

One interesting observation about the Lac operon arises when intermediate levels of inducer are applied to the system. Say we grow cells at an intermediate concentration of lactose that results in some cells being fully induced and some not being induced at all, depending on the state of the repressor at the time of the exposure. Interestingly, when the induced cells are selected and placed at this intermediate concentration of lactose again, all remain induced. In contrast, the uninduced ones remain free to activate or not. The reason for this lies in the Permease, encoded by the Y gene of the Operon, which once expressed at a certain level creates a positive feedback loop that maintains the system active: the permease will bring in lactose, which will inhibit the repressor, which will lead to more permease, and this will bring in more lactose...

Positive feedback loops are very important in the maintenance of the activity of genetic circuits and we shall see more of this in the next lecture. They create small memories of the input, and they not only allow for the economic maintenance of the system but also provide useful insights into regulatory logic that you will use later.
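The memory created by the permease loop can be illustrated with a toy simulation. This is a minimal sketch and not a model fitted to the lac system: the equation, the Hill exponent of 2, the leak of 0.01 and the dimensionless lactose levels are all assumptions chosen to make the behaviour visible, and `simulate_permease` is a hypothetical name.

```python
def simulate_permease(lactose, y0, leak=0.01, dt=0.01, steps=5000):
    """Integrate dy/dt = leak + x^2 / (1 + x^2) - y, with x = lactose * y.

    y is the (dimensionless) permease level; a simple Euler scheme is used.
    """
    y = y0
    for _ in range(steps):
        x = lactose * y                        # internal inducer scales with permease
        production = leak + x * x / (1.0 + x * x)
        y += dt * (production - y)             # production minus dilution
    return y


if __name__ == "__main__":
    for lactose in (0.5, 3.0, 20.0):
        from_off = simulate_permease(lactose, y0=0.0)   # previously uninduced cell
        from_on = simulate_permease(lactose, y0=1.0)    # previously induced cell
        print(f"lactose={lactose:5.1f}  from off: {from_off:.3f}  from on: {from_on:.3f}")
```

At the intermediate level the final state depends on the history of the cell: a cell that starts induced stays induced and a cell that starts uninduced stays off, whereas at low or high lactose both histories converge on the same state. In this deterministic sketch the uninduced cell never switches on at the intermediate level; the occasional switching described above would require the random binding and unbinding of the repressor, which is not modelled here.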
Summary

Thus we can see how to establish a small genetic circuit that will utilize carbon resources effectively and on demand. The system is very simple from the regulatory point of view: it is all about establishing the positive feedback loop involving the permease. We have also learnt three molecular principles that we shall see in action soon: cooperativity, regulated recruitment and memory. The three are likely to be the consequence of natural selection and they are all geared to the maintenance of the organism, sometimes against the organism’s will. The system works on demand and the feedback loop supports the economy and effectiveness of the regulatory system.
Lecture 2 Lambda: a genetic switch, the circuitry of a decision and the programme it regulates

Cells make choices about what they want to do and what they want to be, and they use molecular mechanisms that underpin those choices. In the last lecture we saw how E. coli makes a choice, a useful choice, to metabolize Glucose before Lactose as a carbon source when presented with both. From an engineering point of view, it is a question of IF this, then NOT that; what an electrical engineer would call a “NOT gate”. Today we are going to see a different decision and thereby a different mechanism: the decision of a bacteriophage, lambda (λ), whether or not to kill an E. coli cell. The underlying molecular process driving this decision is similar to the one E. coli uses to utilize Lactose and, together, the two systems provide a basis for thinking about the more complex molecular devices that mediate more complicated decisions in eukaryotes.

Bacteriophages are bacterial parasites and λ is one that has been studied in detail. When it infects a bacterium, λ has a choice: either to kill the cell, and in the process make many copies of itself, or to become a passenger, by integrating its DNA in the bacterial chromosome... and kill the cell later. We know this because one can look for variants of the phage. Infecting a lawn of E. coli at a low density (such that individual phages are separated on the plate), one observes well defined individual plaques, and there are two different kinds: nice, round clear plaques (O) or turbid plaques (ø). The clear plaques reflect a chain reaction in which the bacteriophage sequentially kills bacteria as it grows, i.e. one phage generates many, and thus, because of the lysis that is observed, all bacteria are killed and the plaques are called ‘lytic’.
The turbid plaques reflect the observation that sometimes λ kills and sometimes it does not, and that when it does not, it grows with (within, actually) the bacterium, giving rise to ‘lysogens’. Lysogens have an interesting property: bacteria with a lysogenic phenotype are immune to further infection, i.e. if one tries to infect a lysogen with the product of a lytic plaque, one cannot do this and the resulting plaque remains opaque. Conversely, lytic plaques never give rise to lysogens when infecting other bacteria. However, if one takes a lysogen and treats it with UV light, one induces lysis. As in the case of Lac, a combination of genetics and biochemistry provides an explanation for these observations and a model of how the system functions.

Thus, we have a series of observations to explain and explore:

• What is the molecular basis of lysis and lysogeny?
• How does a phage decide what to do?
• What is immunity?
• How does UV act to induce lysis?

Looking for mutants (remember that this is what Genetics is about) that affect the ability of λ to grow lytically or lysogenically, and applying the rules that we laid down in Lecture 1, we can find out something about the mechanism that mediates the decision of the phage. Looking for mutants that produce plaques, we find four complementation groups associated with these plaques: cI, cro, cII and cIII, but these mutants exhibit different behaviours when we subject them to genetic tests. In all cases one can isolate clear (lytic) plaques, but sometimes it is possible to isolate lysogens from cII and cIII mutants, which are stable, immune and inducible by UV light; this is never the case for cI, which only produces clear plaques. On the other hand, none of the three mutants cI, cII and cIII can grow on a lysogenic culture, i.e. they cannot lyse a lysogen, which, therefore, is immune to these mutants. However, occasionally one can isolate a mutant which does this; these are called λvir and represent an interesting and useful exception.

Mutant   lof      ts allele, growth on     lysogen
                  30°C        40°C
cI       clear    turbid      clear        no
cII      clear    turbid      turbid       yes
cIII     clear    turbid      turbid       yes
cro      turbid                            yes
λvir     clear                             no
Given what we learnt about the lac Operon, an interpretation of the mutants and their relationships suggests the following picture for the activities of λ phage:

• cI is a repressor of lysis and therefore, if it is mutated, λ will always lyse. It is called the Repressor (because its function, normally, is to repress; notice that many genes are named on the basis of their loss of function phenotypes and not on the basis of what the proteins they code for do).
• cII and cIII seem to work to establish lysogeny, but they are not required for its maintenance and therefore, when mutated, they can occasionally produce turbid plaques. A good example of this is provided by the different behaviour of temperature sensitive (ts) mutations in cI, cII and cIII. It is possible to isolate lysogens for the three mutants at the permissive temperature, 30°C. However, when they are raised to the non permissive temperature, 40°C, cIts lyses but cIIts and cIIIts continue to grow as lysogens. As normal mutants in cII and cIII lyse, this observation suggests that their products are required for the initiation or establishment, but not the maintenance, of lysogeny. On the other hand, cI is required for both. We shall see below what the role of cII and cIII is.
• Cro represses lysogeny, so when mutated it yields lysogens. Corollary: there must be a regulatory relationship between cro and cI.
• λvir mutants are not easy to interpret, but the fact that they are rare suggests that they might map to some sort of control region which is involved in repression and which, in the mutant, is unable to bind the repressor. These mutations act in cis.
• Immunity must be due to the existence of high levels of the repressor protein in the host, which will suppress the lytic programme in other phages. Upon infection, the DNA of the new phage must be taken over by the Repressor of the resident phage, and this blocks lysis.
• UV must target the Repressor and break it down.
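The maintenance and immunity parts of this picture can be written down as simple logic. The sketch below is only a restatement of the bullet points above under an all-or-nothing assumption (the Repressor is either active or not); it says nothing about how lysogeny is established, and the function and argument names are hypothetical.

```python
def lysogen_fate(ci_active=True, uv=False):
    """Fate of an established lysogen.

    ci_active -- the Repressor (cI) is functional (False for a cIts lysogen at 40 C)
    uv        -- the lysogen has been treated with UV light
    """
    # UV is taken to destroy the Repressor; without Repressor, lysis follows.
    repressor_present = ci_active and not uv
    return "lysogeny" if repressor_present else "lysis"


def superinfection_outcome(resident_repressor=True, incoming_is_vir=False):
    """Outcome when a second phage infects a cell that may carry a resident phage."""
    # The resident Repressor shuts down the incoming phage unless that phage
    # carries a vir mutation, i.e. a control region the Repressor cannot bind.
    blocked = resident_repressor and not incoming_is_vir
    return "immune" if blocked else "lysis"


if __name__ == "__main__":
    print("lysogen, 30 C:", lysogen_fate())
    print("cIts lysogen, 40 C:", lysogen_fate(ci_active=False))
    print("lysogen + UV:", lysogen_fate(uv=True))
    print("lysogen infected by wild type:", superinfection_outcome())
    print("lysogen infected by vir:", superinfection_outcome(incoming_is_vir=True))
```

cII and cIII do not appear in these functions because, as argued above, they are needed to establish lysogeny but not to maintain it.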
Now we have a picture and therefore we need a mechanism; once we have the genetics, we need biochemistry (or cell biology, as the case may be) to interpret the phenotypes, and the same is true here. If you are interested in the details look up Ptashne’s book (in particular chapters 1-4, with chapter 4 deserving special attention; the rest of the book is dense, but there are useful things if you are curious).
There are two fundamental molecular components to the system:
1. A very high level control region that determines the sequence of events.
2. A programme, a genetic programme, that follows from the activity of (1).
We are again in one of those situations of “NOT gates”. The control region works in such a manner that: if A, then lysis; if B (or NOT A), then lysogeny. Lysogeny is a status quo that is established and maintained by a molecular event. Lysis requires, and is associated with, a genetic programme that leads to phage building.
IMPORTANT NOTE: This is the first time that we have encountered the notion of a programme and therefore it is important to pause and, at least, mark this event. A genetic programme is a sequence of transcriptional and translational events that changes the state of a cell and can generate something new, a new state, and lead to new programmes. Programmes are very important in development. In the case of λ, the lytic programme results in the building of new phage and the lysis of a bacterium. Let us see how this works and how we can make sense of the genetics.
The molecular elements: control
As we have already stated, the system has a control region, in the DNA, and also a programme, which is clearly mediated by proteins. The decision making (to lyse or not to lyse) depends on the interactions between the DNA and the proteins. The operative system is a control region, an Operator (the OR region that is mutated in λvir), and four proteins, cro, cI, cII and cIII, which emerge as complementation groups in the genetic screens. Genetic analysis indicates that the cro and cI (Repressor) proteins play different, but related, roles in the state of the cells, but it is the biochemistry, and then the combination of genetics and biochemistry, that tells us how it works, because their roles are associated with their different structures, which matter for the mechanism and the decision. The cro protein is made up of a simple globular domain that binds DNA. On the other hand, cI is made up of two globules linked by a hinge (this will suffice for all practical purposes); one of the globules binds DNA and the other can bind to another copy of itself. This, and the existence of duplicated binding sites on the DNA (see the end of last lecture), leads to dimers as the devices that mediate function (both cro and cI function as dimers).
Molecular analysis reveals that cro and cI bind to a region of DNA, the operator OR (a very similar situation to the one we discussed in the last lecture about Lac), which is subdivided into three linked sections, OR1, OR2 and OR3, which overlap with two promoters: PRM and PR (see Fig). If PRM is active, more cI is made (notice the “more”; we shall come back to this later, i.e. how is the expression of cI started?) and the phage integrates in the bacterial chromosome. If PR is active, the phage will lyse; one of the elements for this is the synthesis of cro. The game is that cI and cro battle for binding to OR, as this will determine which promoter will be active and which programme will be executed (insertion and lysogeny, or lysis). The battle is a molecular competition based on affinities and driven by concentrations (see end of last lecture). As the DNA is constant, it is all down to the relative concentrations of cro and cI and to the fact that OR1, OR2 and OR3 have different affinities for both cro and cI.
1. cI bound at OR2 has two consequences: it prevents binding of the RNA polymerase to PR and thereby inhibits the synthesis of cro. At the same time, it helps RNA polymerase bind to PRM and promote the expression of cI (in reality, more cI; we shall see shortly how the battle starts). So, cI at OR2 → cI (more) expression.
2. cI can also bind OR3 but, when it does, it has different effects: it cannot activate PRM and it cannot inhibit PR, which then triggers gene expression. So, cI at OR3 → cro expression.
3. cI may bind at OR1, but here it cannot activate PRM and it blocks PR. So, cI at OR1 → no expression of cro nor of cI.
4. Cro binding blocks the binding of cI and thereby determines its occupancy of OR.
It turns out that the affinities of cI and cro are the mirror image of each other. In the case of cI: OR1 > OR2 > OR3, whereas in the case of cro: OR3 > OR2 > OR1. Under normal conditions, as indicated by Ptashne, if one takes a snapshot of OR in a lysogen one will find, on average, that 90% of the cells have cI at OR1 and OR2, and that in only about 10% of the cells are the three operators occupied. This indicates that the determinants of the outcome are, as in any chemical system, the affinities of the proteins, with one added twist: the interactions between the repressor dimers. The relative affinities and concentrations of cro and cI provide the key for the state of a cell (see Fig; from M. Ptashne 2011, Nat Chem Biol).
Altogether, these observations provide a basis to understand the cI and cro mutants. They also provide a basis to understand λvir, which must be a cis mutation (in the operator), and we can now explain the odd observation that λvir mutants cannot enter lysogeny: being mutations in the operator, they probably do not allow the repressor to sit on it, or they favour the binding of cro. They are cis but, because of the nature of phage growth, they can be understood as cis dominant (with a weird twist). Activation of PR leads to the expression of several lytic genes which will help the phage reproduce and then kill the cell.
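The competition at OR described above can be made concrete with a small equilibrium sketch. It is only an illustration: the dissociation constants are invented numbers chosen to respect the mirror-image affinity orders given in the text, each site is treated independently, and the interactions between repressor dimers are deliberately left out.

```python
# Toy equilibrium model of cI and cro competing for the three sites of OR.
# All numbers are hypothetical (arbitrary concentration units); smaller Kd = tighter binding.
KD_CI = {"OR1": 1.0, "OR2": 3.0, "OR3": 30.0}    # cI affinity order: OR1 > OR2 > OR3
KD_CRO = {"OR1": 30.0, "OR2": 3.0, "OR3": 1.0}   # cro affinity order: OR3 > OR2 > OR1


def site_occupancy(ci, cro, site):
    """Probability that one site is empty, bound by cI, or bound by cro."""
    w_ci = ci / KD_CI[site]       # statistical weight of the cI-bound state
    w_cro = cro / KD_CRO[site]    # statistical weight of the cro-bound state
    total = 1.0 + w_ci + w_cro    # 1.0 is the weight of the empty site
    return {"empty": 1.0 / total, "cI": w_ci / total, "cro": w_cro / total}


def operator_snapshot(ci, cro):
    """Occupancy of OR1, OR2 and OR3 for given cI and cro concentrations."""
    return {site: site_occupancy(ci, cro, site) for site in ("OR1", "OR2", "OR3")}


for label, ci, cro in (("lysogen-like", 10.0, 0.1), ("lytic-like", 0.1, 10.0)):
    print(label)
    for site, probs in operator_snapshot(ci, cro).items():
        print(f"  {site}: cI={probs['cI']:.2f} cro={probs['cro']:.2f} empty={probs['empty']:.2f}")
```

With the DNA held constant, changing only the two concentrations is enough to move the operator from a mostly cI-occupied state to a mostly cro-occupied one, which is the point made in the text.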
Factors affecting the decisions: the establishment
Now we know how the key control region works, but how does λ ever get to the point of making a decision? How does the system set itself up for a decision which we know is ‘random’ or, if you wish to be more precise, ‘stochastic’, at least at first sight? A combination of classical genetics (mutant analysis and epistasis) and molecular biology (looking at DNA, RNA and proteins) suggests the existence of three phases in the process: Very early, Early and Late. We can infer that there are different phases from the mutants (remember, for example, the different behaviour of the ts mutants in cI, cII and cIII). The decision, the game at the Operator, is made some time between the Early and the Late phases, and understanding this process will give us an insight into WHAT, if anything, mediates the decision, and perhaps provide a molecular basis for decision making. Let us have a coarse grained look at the different phases.
Very early: When the phage injects its DNA, polymerase binds to two promoters, PR and PL, and this leads to the transcription of cro (from PR) and of an interesting protein called N (made from PL) whose job is to enable the functioning of RNA polymerase at DNA sequences that are energetically difficult (remember that this is a product of chemistry and evolution). This extends transcription from both PR and PL.
Early: During this phase, N allows the extension of transcription from sites close to PR and PL to a number of genes that set the stage for the decision making process. From PL it leads to the expression of genes that will set lysis in motion. From PR it leads to the expression of genes involved in integration (int) and of two old friends of ours, cII and cIII, which, remember, were identified as lysogenic mutants. cII is an activator of transcription which is degraded by proteases, and cIII protects cII from degradation. cII leads to the expression of cI from a promoter called PRE (for promoter repressor establishment). Then a competition ensues between cro and cI in which the goal is to see whether lysogeny (a process that is cI dependent) can be established.
Late: This is the decider and it is here that the pathways bifurcate. The outcome of this phase depends on whether the process is lytic or lysogenic. If cII manages to make enough cI for it to establish its position at OR1 and OR2, cI will establish expression from PRM and the result will be a lysogen. Otherwise, cro will always outcompete cI and lysis will set in.
[Figure: Infection, Proteases, cII, PRE, cI; outcomes Lysis and Lysogeny]
In summary: If Cro and Q are ON, lysis will ensue. If cII, cIII and then cI are ON, then lysogeny will ensue.
The decision: what decides? A bistable state. Bistability.
Now we know all the elements of the system and what it takes for the coin (and it is the toss of a coin) to go one way or another. How, then, is the decision made? How many levels are there to the decision? How are they regulated and enacted? As we have seen, the key element is the relative ratio of cI and cro. As initially cro is made, the decision hinges on the ability of the phage to pump out enough cI to set up stable transcription of the repressor, and this depends on whether cII can get to make enough cI from PRE. The stability of cII depends on proteases from the cell and this can reflect the state of a cell: growth in rich medium will activate proteases and lead to lysis, because the levels of cII will be low and it will never be able to establish the levels of cI needed for lysogeny. On the other hand, starvation will lead to low levels of proteases, which will result in higher levels of cII, which will lead to enough cI to get the positive feedback loop going (and it is a positive feedback loop) and to establish lysogeny. A system like this, with two alternative states which repress each other but reinforce themselves, is called a bistable system. Such systems always result in one or the other state; there is no possibility of long term coexistence of both.
[Figure: cro and cI; Lysis and Lysogeny]
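The bistability described above can be illustrated with a very small simulation. This is a sketch under loud assumptions, not a model of the real phage: cro and cI simply repress each other's synthesis, the contribution of cII is reduced to a transient extra synthesis of cI from PRE (`pre_drive`, which would be small when host proteases are active and large when they are not), and every number is invented.

```python
# Toy mutual-repression switch between cI and cro (all parameters hypothetical).
BETA = 4.0           # maximal synthesis rate of each protein
PULSE_LENGTH = 5.0   # how long cII drives cI synthesis from PRE after infection


def run_switch(pre_drive, t_end=50.0, dt=0.01):
    """Integrate the switch with forward Euler and return the final (cI, cro)."""
    ci, cro = 0.0, 1.0                      # cro is made first; cI starts at zero
    for step in range(int(t_end / dt)):
        t = step * dt
        from_pre = pre_drive if t < PULSE_LENGTH else 0.0
        d_ci = from_pre + BETA / (1.0 + cro ** 2) - ci   # cI synthesis is repressed by cro
        d_cro = BETA / (1.0 + ci ** 2) - cro             # cro synthesis is repressed by cI
        ci += d_ci * dt
        cro += d_cro * dt
    return ci, cro


def fate(pre_drive):
    """Name the state the switch settles into for a given cII-driven pulse."""
    ci, cro = run_switch(pre_drive)
    return "lysogeny" if ci > cro else "lysis"


for drive in (0.1, 10.0):
    ci, cro = run_switch(drive)
    print(f"pre_drive={drive}: cI={ci:.2f}, cro={cro:.2f} -> {fate(drive)}")
```

In this sketch a weak pulse leaves cro in control and a strong pulse hands control to cI; once the pulse is over the system stays in whichever state it reached, and intermediate mixtures are not stable.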
Summary: a comparison with the lac operon
Although the Lac operon and the Lysis-Lysogeny switch each have their own special issues, they also have much in common and it may be worth reflecting on these similarities, particularly in the context of the general principles discussed at the end of lecture 1.
• In both instances there is an Operator, a cis acting control region which acts as a landing pad for DNA binding proteins that control the activity of RNA polymerase.
• In both cases there is cooperativity: in the case of lac, of the repressor (I) and CAP, and in the case of λ, of the repressor (cI) and cro. In both instances there are dimers which help tether the proteins to the DNA and give them a higher chance of doing their job.
• In both instances there is regulated recruitment of the RNA polymerase to the sites of action: cI and CAP.
• In both cases, the chief regulatory proteins (cI and I) can form tetramers which lead to looping of the DNA and to stabilization through these larger, energetically favourable structures.
• In both instances positive feedback loops ensure the stability of the states (the permease in the case of lac and the synthesis of cI in the case of λ).
• Both systems have evolved a way to sense the environment. The main difference is that in one case (λ) a programme is activated and in the other (Lac) it is the synthesis of an enzyme.
• From a regulatory perspective, λ is two autoregulatory systems inhibiting each other. This type of regulation is very common in biological systems. It is robust and probably mediates choices between one state or another.
Lecture 3
Gene expression in eukaryotes: S. cerevisiae Gal4 as a guide for universal principles of transcriptional regulation
The principles that we can extract from the analysis of the regulation of fate decisions in Lac and λ can be extrapolated to eukaryotes. However, in these systems there are additional layers of regulation that introduce novel elements into the mechanism. For example, there is a nucleus, which creates a distinct compartment in which transcription is separated from translation and thus creates an opportunity to regulate transcription through the entry or exit of proteins from the nucleus. Most significantly, the DNA is organized into nucleosomes around histones, giving rise to chromatin, which hampers the access and progress of RNA polymerase and creates a ‘need’ for regulation of the structure of the chromatin in association with gene expression (see S. Jackson’s lectures). People call this level of regulation ‘epigenetics’. In addition, the basal transcriptional machinery is much more complicated and there is a large number of proteins associated with RNA polymerase (see again S. Jackson’s lectures); this introduces many layers of potential regulation in the assembly of the machinery for, and the initiation of, transcription. Every one of these levels (nuclear access of specific factors, chromatin structure and assembly of the basal transcriptional machinery) is used to regulate gene expression. Yeast provides a good system in which to begin to see how molecular interactions mediate responses and create circuitry and, within yeast, the biochemical system utilizing Galactose has provided many insights into how the basic principles of prokaryotes are put to work in eukaryotes.
Regulatory logic of the Gal system of S. cerevisiae
As is customary by now, we start with a survey of the genes required for the growth of yeast on galactose, i.e.
we look for mutations (loss of function, gain of function or temperature sensitive) that affect the ability of cells to do this. Then, using epistasis and, by now, a little bit of molecular biology, one can work out a genetic circuit that functions in this process. This type of analysis reveals that the system can be divided into two core components: a collection of ‘structural genes’, which encode enzymes and molecules dedicated to galactose metabolism (centrally GAL1, GAL10, GAL7, GAL2), and regulatory genes, whose function is to regulate the expression of the structural genes (GAL4, GAL80, GAL3 and GAL11).
[Figure: structural genes Gal2, Gal1, Gal10, Gal7]
Although this is the core, modern methods have identified a large number of genes that are co-ordinately regulated by the regulatory box, all of them associated in one way or another with the metabolism of Galactose. As we have discussed, eukaryotes offer an ample number of possibilities for gene regulation, and this system exploits them all. The usual genetic analysis reveals a complex relationship between the regulatory genes highlighted in the figure (Gal4, Gal80, Gal3, Gal11), whose biochemical meaning unravels in the molecular analysis. A common theme of all the genes under the control of the regulatory tool box is a regulatory region present in all genes involved in galactose growth; this sequence is cis acting and is called UASGAL (Upstream Activating Sequence related to Galactose); it is similar to the CAP
binding site in E. coli Lac or the cI/cro Operator in λ, and there are about 300 of these sequences in the genome. This means that there are many genes under coordinated regulation. So, you can already see the scale of the complexity of eukaryotic gene regulation. The main transcriptional regulator here is Gal4 (encoded by the GAL4 gene), which binds UASGAL and is, if you wish, the equivalent of CAP or cI in E. coli and λ. Following Ptashne, let us focus on the GAL1 gene which, for all practical purposes, illustrates how the regulation works, i.e. what is true for GAL1 will be true for other genes that have a UASGAL and which are under common regulation.
A technical interlude: Reporter assays.
A technical and important issue when dealing with eukaryotic transcription is the notion of a reporter. In order to facilitate the readout of a potentially significant transcriptional interaction, one can use a bacterial gene, e.g. lacZ, for which there are easy assays, or, more recently, Green or Red Fluorescent Protein, which glow. If one places genes encoding these proteins downstream of putative regulatory regions, one can use the proteins as ‘reporters’ for the transcriptional event. This bypasses any function that the gene under regulation would have and makes the experiments easy. We shall see much of this in this lecture.
A look at the regulatory region of GAL1 (defined through DNA bashing and reporter analysis) identifies an element, located at about 275 bp from the GAL1 transcription start site, that is necessary and sufficient and which contains four binding sites for Gal4 (UASGAL). Downstream from it there is a repressor binding site for Mig1 which, much as in the case of Lac, ensures that the Galactose regulatory network is only used when there is Galactose around and not when there are other sugars, e.g. glucose.
[Figure: GAL1 regulatory region, with the UAS (Gal4 sites) at about 275 bp from the start site and the Mig1 site]
We can already see here some of the features that distinguish eukaryotes from prokaryotes:
• There are multiple UASGAL regulatory sites.
• There are larger distances from the regulatory sequences to the promoter: the UASGAL is located over 200 bp from the initiation of transcription and contains multiple sites of 17 bp each (a length which allows them to be unique), whereas in E. coli the regulatory sites tend to be around 40 bp from the transcription start site.
• The repressor, Mig1, does not work like the bacterial repressors, by occluding the binding of RNA polymerase, but by recruiting other proteins that affect the chromatin structure and the interactions of other regulatory proteins involved in transcriptional activation.
We shall not say too much about Mig1; however, it is important to bear in mind that 1) it plays a role in the regulation of GAL1 activity and 2) it does not act like bacterial repressors. Mig1 is a way of ensuring that the Galactose genes are not expressed when glucose is present; a bit like the lac repressor but, as we have already said, it works not by occluding the DNA but by recruiting a complex of proteins that mediate repression. Mig1 also exploits the compartmentalization of the cell: in the absence of glucose it is in the cytoplasm and phosphorylated. This changes if there is glucose, as then it enters the nucleus, binds its site and recruits the repressive complexes. These features are typical and general of eukaryotic regulatory regions. One feature in common with Lac is that the system is under negative regulation, i.e.
the expression of GAL1 is repressed until there is Galactose in the medium. The reason for this is that, although Gal4 is bound to the DNA (as a dimer), it cannot promote transcription, not only because Mig1 is working (see above) but, most significantly, because the product of the GAL80 gene is bound to a specific domain on Gal4 and does not allow the relevant interactions that promote transcriptional activation. When there is Galactose around, Gal80 is taken off Gal4 by Gal3, which is a cytoplasmic protein that senses Galactose; Mig1 is also shuttled out of the nucleus, and Gal4 is allowed to interact with the transcriptional machinery. Thus the system exploits the nuclear compartmentalization of eukaryotic cells for the purposes of regulation. Once Gal4 is free from Gal80 it can recruit and interact with members of the Mediator complex, which will bring RNA polymerase to the site. The product of the GAL11 gene is an element of the Mediator complex that interacts with Gal4 and provides a bridge for the recruitment and activation of RNA polymerase. Thus one can see how the genetic relationships are, again, illuminated by the molecular biology and the biochemistry.
[Figure: Gal80, Gal4, Gal11, Mediator, Mig1 and RNA pol II at the GAL1 gene]
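The regulatory logic of GAL1 described above can be written down as a Boolean sketch. It is a deliberate simplification: every protein is either present or absent, Gal4 is assumed to sit on UASGAL at all times, and the `missing` argument (a hypothetical way of representing loss of function mutants) lists the gene products that have been removed.

```python
def gal1_expressed(galactose, glucose, missing=frozenset()):
    """Boolean sketch of GAL1 regulation; `missing` names deleted gene products."""
    gal4_bound = "Gal4" not in missing                    # Gal4 dimer on UASGAL
    gal3_active = galactose and "Gal3" not in missing     # Gal3 senses galactose
    gal80_masks_gal4 = "Gal80" not in missing and not gal3_active
    mig1_represses = glucose and "Mig1" not in missing    # Mig1 is nuclear only with glucose
    mediator_bridge = "Gal11" not in missing              # Gal11 links Gal4 to Mediator
    return (gal4_bound and mediator_bridge
            and not gal80_masks_gal4 and not mig1_represses)


# Truth table for the wild type and for two regulatory mutants.
for missing in (frozenset(), frozenset({"Gal80"}), frozenset({"Gal3"})):
    name = "wild type" if not missing else "no " + ", ".join(sorted(missing))
    for galactose in (False, True):
        for glucose in (False, True):
            state = "ON" if gal1_expressed(galactose, glucose, missing) else "off"
            print(f"{name:10s} galactose={galactose!s:5s} glucose={glucose!s:5s} GAL1 {state}")
```

In this sketch the wild type expresses GAL1 only with galactose and without glucose, removing Gal80 makes expression independent of galactose, and removing Gal3 makes the gene uninducible, which is the kind of relationship that the genetic analysis picks up.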
Now, what is interesting is to look into the way Gal4 promotes transcription, as this has served as a reference for the way we think about the regulation of transcription in other eukaryotic systems (always bear in mind S. Jackson’s lectures).
Gal4
Analysis of the Gal4 protein reveals the existence of three structural domains, each with a specific function, that act in a modular manner: a DNA binding domain, a domain involved in dimerization and an “activation domain” which contains a Gal80 binding site. Modularity means that each of these domains will work on its own and, more significantly, together with other proteins that complement the domain function. Experiments testing this by measuring the activity of different fragments of the protein reveal that the DNA binding domain and the activation domain are independent and work in heterologous systems. In a series of experiments, it was shown that GAL1 can be activated by the Gal4 activation domain working from a different DNA binding domain, provided there is a matching regulatory region in the DNA; thus, one could take the DNA binding domain from cI or from I, attach it
to the dimerization and activation domain of Gal4 and this will lead to transcriptional activation in yeast. This is a most important result because it explains much of the logic of transcriptional activation in eukaryotes and, particularly, in development. This experiment states that one can have large numbers of combinations between DNA binding domains and activation domains.
The activating domain is an interesting beast. Intrigued by this function, Ptashne and his colleagues decided to do an experiment in which they took a Gal4 DNA binding and dimerization domain and attached to it a large number of possible polypeptides. Then they tested their activity in a reporter assay. They found that the activation domain of Gal4 could be substituted by a number of peptides, most of which had an acidic nature. As long as they could reach the transcriptional machinery, their structure did not matter and they worked. This observation allows one further experiment, the reverse of the heterologous DNA binding domain, which shows that the DNA binding domain provides the anchor for the activating domain and that the number of links does not matter. One can tie an activation domain to Gal80 and, under these circumstances, one can get Gal4 dependent transcriptional activation even in the presence of glucose because the heterologous activation domain works. This experiment makes the point that the role of Gal80 is to impede the interaction of Gal4 with the basal transcriptional machinery.
The jigsaw: DNA binding domains, interaction domains, activation domains. The machinery is general: Gal4 works in flies!
What the work with Gal4 illustrates is a general principle of eukaryotic transcription factors: they are modular, with three elements (DNA binding domain, interaction domain and activation or, in the case of repressors, repression domains) that work independently of each other but that, when they come together, provide a unique code for a particular gene. There is a small and finite number of these elements and they have names associated with their structures (e.g. Zinc fingers (ZF), basic HelixLoopHelix (bHLH), Homeodomains (HD), Leucine Zippers (LZ)), and transcription factors emerge from combinations of these different domains, which creates a very large repertoire.
[Figure: modular transcription factor with DNA binding and transactivation domains. From Ptashne and Gann, Genes and signals]
The activation or repression domains are interesting and important because they interact with the basal transcriptional machinery (S. Jackson’s lectures) and this is what determines the output of the interactions between the different transcription factors. As we have seen, there is a certain generic nature to the activation or repression domains (though there are some rules) and they can work for any DNA binding domain. What is perhaps most interesting is that some of these activators and repressors work in different species. It is in this manner, through combinatorials, that the 400 or so transcription factors in a eukaryotic cell can drive the expression of over 30,000 genes in a space and time controlled manner. We have mentioned that we can tether the Gal4 activation domain to a different DNA binding domain and get it to drive transcription. Probably not surprising: DNA is DNA. But what
about the transcriptional machinery? Is it organism dependent? To test this, we take the Gal4 protein from S. cerevisiae and express it in Drosophila. If we provide a reporter for Gal4 by introducing in the chromosome a UASGAL upstream of, for example, β-galactosidase (lacZ), we can get the yeast Gal4 to activate expression of lacZ in Drosophila using the transcriptional machinery of Drosophila. If Gal4 is expressed from some promoter which drives tissue or cell specific expression, Gal4 will elicit the expression of lacZ in the pattern of Gal4. This shows that the yeast Gal4 protein can interact with (talk to!) the basal transcriptional machinery of the fruit fly and therefore suggests that there is a universal molecular language for the process of transcription. Similar experiments can be done with mammalian cells and, in general, they work.
Yeast meets Drosophila: Bicoid and Gal4, affinities
We did say at the beginning that the discussions on prokaryotes and eukaryotes were to try to learn something about the way cells use transcription to make decisions during development. You will be hearing a lot about this later in the course, but here I would like to introduce you to an interesting system which has taught us a lot about the relationship between gene regulation and development: the segmentation of the Drosophila embryo. An important molecule and trigger of this process is called Bicoid, a transcription factor with a DNA binding domain of the homeodomain (HD) kind (it has an HD in its N terminus and an Activation Domain in its C terminus (347-414); we know this through the kind of experiments that allowed us to dissect Gal4). This protein is expressed at the anterior pole of the Drosophila embryo and diffuses posteriorly to generate an exponential gradient which determines, in a concentration dependent manner, where the different parts and the different segments lie.
If Bicoid is missing, the larva that hatches from the embryo lacks the anterior segments and has the tail part disorganized and sometimes an extra posterior end in place of the missing anterior. We do know that Bicoid is a transcriptional activator because there is a target gene called hunchback which is activated at a certain Bicoid concentration and requires Bicoid. In fact there are Bicoid binding sites upstream of the transcription start of hunchback (hb) which can be shown to be necessary for expression; some are high affinity binding sites and some are low, and it is the combination of the two that determines where a gene is expressed in the AP axis of the embryo. Extraction and multimerization of these sites in front of a reporter confirms that they behave as high and low affinity sites and, for the same gradient of Bicoid, they generate nested patterns of expression. We can also take the Bicoid binding sites and place several of them upstream of a GAL1:lacZ gene reporter in yeast, provide Bicoid in trans (in yeast) and observe how it activates the reporter. What we learn is that Bicoid can interact with the basal transcriptional machinery of yeast and work with it to promote transcription. This is remarkable but perhaps not surprising if, as we have seen, the Gal4 protein from yeast can work in Drosophila and interact with its transcriptional machinery.
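How one gradient can produce nested patterns from high and low affinity sites can be seen with a short calculation. The sketch below assumes a purely exponential gradient and simple one-site binding; the length constant, the concentrations and the two dissociation constants are invented for illustration, and positions are fractions of egg length measured from the anterior pole.

```python
import math

C0 = 1.0                # Bicoid concentration at the anterior pole (arbitrary units)
LENGTH_CONSTANT = 0.2   # decay length of the gradient, as a fraction of egg length


def bicoid(x):
    """Bicoid concentration at position x (0 = anterior pole)."""
    return C0 * math.exp(-x / LENGTH_CONSTANT)


def occupancy(x, kd):
    """Fraction of time a site with dissociation constant kd is bound at position x."""
    c = bicoid(x)
    return c / (c + kd)


def expression_boundary(kd, threshold=0.5):
    """Most posterior position at which occupancy still reaches the threshold."""
    c_needed = kd * threshold / (1.0 - threshold)   # concentration giving that occupancy
    return max(0.0, LENGTH_CONSTANT * math.log(C0 / c_needed))


KD_HIGH_AFFINITY = 0.05   # hypothetical tight site
KD_LOW_AFFINITY = 0.5     # hypothetical weak site
print("high affinity reporter on up to x =", round(expression_boundary(KD_HIGH_AFFINITY), 3))
print("low affinity reporter on up to x =", round(expression_boundary(KD_LOW_AFFINITY), 3))
```

Under these assumptions the low affinity reporter is switched on only near the anterior pole, while the high affinity reporter extends further back, so the two domains are nested within the same gradient.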
[Figure: Bicoid (bcd) with its homeodomain (HD) and activation domain (AD); hb binding sites upstream of a GAL1:lacZ reporter; Gal4 AD]
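The modular view of transcription factors used in this lecture can be sketched as a small data structure. This is only a bookkeeping illustration: the class names, the `fuse` helper and the domain labels are made up for the example, and the host organism does not appear at all, which is one way of expressing that the basal machinery is treated as universal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Factor:
    name: str
    dna_binding_domain: str
    activation_domain: Optional[str]   # None means the activation domain was deleted


@dataclass(frozen=True)
class Reporter:
    name: str
    site_for: str   # the DNA binding domain recognised by the upstream sites


def activates(factor, reporter):
    """A factor activates a reporter if it can land on it and carries an activation domain."""
    return (factor.dna_binding_domain == reporter.site_for
            and factor.activation_domain is not None)


def fuse(name, dbd_donor, ad_donor):
    """Hybrid protein: DNA binding domain of one factor, activation domain of another."""
    return Factor(name, dbd_donor.dna_binding_domain, ad_donor.activation_domain)


gal4 = Factor("Gal4", "Gal4 DBD", "Gal4 AD")
bicoid = Factor("Bicoid", "Bcd HD", "Bcd AD")
bicoid_no_ad = Factor("Bicoid without AD", "Bcd HD", None)
hybrid = fuse("Bcd HD:Gal4 AD", bicoid, gal4)

uas_lacz = Reporter("UASGAL-lacZ", "Gal4 DBD")
hb_sites_lacz = Reporter("hb sites-lacZ", "Bcd HD")

for factor in (gal4, bicoid, bicoid_no_ad, hybrid):
    for reporter in (uas_lacz, hb_sites_lacz):
        print(f"{factor.name:18s} on {reporter.name:14s}: {activates(factor, reporter)}")
```

The table it prints mirrors the logic of the domain-swap experiments: what matters is that some DNA binding domain brings some activation domain to the regulatory region.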
Functional deletion analysis of the Bicoid protein reveals that there is an activation domain in the C terminus of the protein and that without it Bicoid cannot function: it does not activate transcription in yeast and the fly does not make a head. Interestingly, we can substitute this activation domain with a similar one from another protein, e.g. Gal4, and it will work. These experiments indicate that Bicoid works by activating transcription of downstream genes. A hybrid protein containing Bcd (DNA binding domain):Gal4 (activation domain) not only activates transcription efficiently in yeast, but is also capable of rescuing a bcd mutant embryo. These experiments reveal the universality of the eukaryotic basal transcriptional machinery and the modularity of the system. They also emphasize that this modularity can play important roles in diversification.
Organizing different cells in space through recruitment and affinities. An example from Drosophila, the eve 2 enhancers
One of the significant problems in developmental biology is how a relatively small repertoire of transcription factors can generate a much larger number of cell types. We have seen that part of the solution lies in the combinations that arise from the modularity of transcription factors. However, two concepts that we have seen earlier, ‘cooperativity’ and ‘regulated recruitment’, also play an important role in the regulatory process: combinatorials, cooperativity and regulated recruitment work together to specify when and where a gene is expressed in an organism. One can see how effective these are by looking at another example from the early Drosophila embryo: the activation of stripes of expression of the gene even skipped (the name comes from the fact that the mutant lacks half of the segments in Drosophila and these happen to be the even numbered segments).
In the Drosophila blastoderm, the even skipped (eve) gene is expressed in seven non overlapping stripes, and molecular genetic analysis showed that each stripe has its own control panel, a collection of cis acting regulatory elements. The expression of the second stripe, eve2, has become a reference for our understanding of the spatial control of gene expression in eukaryotes. In the early Drosophila embryo the gradient of Bicoid (see above) not only leads to the expression of hb but also trickles down to other genes like hb (Kruppel (Kr), Giant (Gt), knirps (kni)) that code for transcription factors, generating a set of overlapping domains that span the length of the embryo. Interactions between Bicoid and these factors converge on a third tier of regulatory elements, the regulatory regions of the even skipped gene. One of the stripes, number 2, has been studied in great detail and its regulation provides general features. The stripe emerges because eve is subject to a combination of positive and negative regulators, each with its own control element (see Fig and slides).
[Figure: regulation of eve stripe 2 by Bcd, Hb, Gt and Kr]
The eve2 control region is about 480 bp long and contains high affinity binding sites for Bcd, Hb, Kr and Gt. There are multiple sites for each, which says that there must be cooperativity. Activation depends on cooperative interactions between Bcd and Hb (loss of function of Bcd and Hb, or of their binding sites in reporter assays, abolishes expression of eve 2), and the restriction of the domain relies on repressive interactions mediated by Kr and Gt (loss of function of Kr and Gt, or of their binding sites, results in an expanded stripe 2). The complexity of the regulatory region highlights the amount of information that the cell needs to create this simple pattern, but it also shows, at work, many of the features we have discussed in these (and S. Jackson’s) lectures.
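The logic of stripe 2 can be sketched by treating each regulator as simply present or absent along the anterior-posterior axis. The domain boundaries below are invented round numbers, not measurements, and `missing` is a hypothetical way of representing loss of function; the only thing the sketch is meant to show is how two activators and two repressors carve out a stripe.

```python
def eve2_stripe(missing=frozenset()):
    """Positions (% egg length, 0 = anterior) where the eve2 element is ON."""
    stripe = []
    for x in range(101):
        bcd = x < 60 and "Bcd" not in missing       # anterior activator (hypothetical limit)
        hb = x < 50 and "Hb" not in missing         # anterior activator (hypothetical limit)
        gt = x < 33 and "Gt" not in missing         # anterior repressor (hypothetical limit)
        kr = 42 < x < 60 and "Kr" not in missing    # central repressor (hypothetical limits)
        if bcd and hb and not gt and not kr:
            stripe.append(x)
    return stripe


def describe(missing=frozenset()):
    stripe = eve2_stripe(missing)
    label = "wild type" if not missing else "no " + ", ".join(sorted(missing))
    if not stripe:
        return f"{label}: no expression"
    return f"{label}: ON from {stripe[0]}% to {stripe[-1]}%"


for case in (frozenset(), frozenset({"Kr"}), frozenset({"Gt"}), frozenset({"Bcd"})):
    print(describe(case))
```

In the sketch, removing either repressor lets the stripe spread in one direction and removing an activator abolishes it, which is the pattern of results described for the mutants and for the binding site deletions.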
The interactions between the different regulatory elements and the eve2 regulatory region can be construed as a regulatory network in which Bicoid activates the expression of the activator (Hb) and collaborates with it. The core of such a network is known as a feedforward loop and has interesting dynamical and regulatory properties.
Summary: General principles for differential transcription?
From the perspective of the regulation of gene expression, the main difference between E. coli, λ, yeast and Drosophila is the amount of information that the DNA has to process in each case. As complexity increases, so does the complexity of the regulatory regions and of the proteins that read them. However, there are some basic strategies and principles used in all cases. The first and most important one is that weak interactions (protein:DNA interactions are generally weak) become strong through cooperative binding. Cooperativity can be used for regulated recruitment, which thus provides a foundation for combinations of transcription factors at a regulatory site. This is largely allowed by the modular nature of their functional domains. Finally, repressors and activators work together to ensure that genes are expressed when and where they are needed. It is interesting how, from a simple set of mechanisms, about 400-500 transcription factors can interact to regulate between 20,000 and 50,000 genes and generate over 1020 different cell types during development. One final and important comment is that the interactions between transcription factors and DNA serve to generate circuits that, like the networks they represent, provide useful information for gene regulation. It is the interactions between these networks that inform the life of E. coli, yeast and the development and life of higher eukaryotes.
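The statement that weak interactions become strong through cooperative binding can be checked with a two-site calculation. The sketch assumes two identical sites and a single cooperativity factor, here called `omega`, that multiplies the weight of the doubly bound state; the concentration, the dissociation constant and the value of `omega` are illustrative numbers only.

```python
def both_sites_bound(conc, kd, omega):
    """Probability that two identical sites are occupied at the same time.

    omega = 1 means the two proteins ignore each other; omega > 1 means that
    contact between them stabilises the doubly bound state (cooperativity).
    """
    q = conc / kd                   # weight of one bound site relative to an empty one
    weight_both = omega * q * q     # weight of the doubly bound state
    return weight_both / (1.0 + 2.0 * q + weight_both)


CONC, KD = 0.1, 1.0   # protein present at a tenth of its Kd: binding to a single site is weak
independent = both_sites_bound(CONC, KD, omega=1.0)
cooperative = both_sites_bound(CONC, KD, omega=100.0)
print(f"independent sites: {independent:.4f}")
print(f"cooperative sites: {cooperative:.4f}")
print(f"gain from cooperativity: {cooperative / independent:.0f}-fold")
```

With these numbers a pair of sites that would almost never be filled together becomes occupied a large fraction of the time, without any change in the affinity of each protein for its own site.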