Advanced Molecular Biology (LS642100, 2003)

Text: Lewin, B. 2000. Genes VII. Oxford, New York.
Wednesday 10-11 am, Friday 10-12 am
T. Y. Lin

Part I

Date         Topic                         Chapter     Part
Feb 19, 21   Prokaryotic transcription     Chap 9-11   Part I-1
Feb 26       Chromosomes                   Chap 18     Part I-2
Mar 5        Nucleosomes                   Chap 19     Part I-2
Mar 7        Initiation of transcription   Chap 20     Part I-3
Mar 12, 14   Regulation of transcription   Chap 21     Part I-4

Part I-1-1 Mechanism of Transcription in Prokaryotes

The DNA-dependent RNA polymerases have the following properties:
1. template dependent, requiring double-stranded DNA
2. do not require a primer; synthesis begins with a nucleoside triphosphate
3. require the four nucleoside triphosphates (ATP, GTP, CTP, and UTP)
4. copy (read) the template DNA strand in the 3' to 5' direction
5. synthesize the RNA in the 5' to 3' direction

In the prokaryotic world only one enzyme is necessary to carry out all of the transcription of RNA from DNA, catalyzing the synthesis of messenger RNA, ribosomal RNA, and transfer RNA. The E. coli RNA polymerase has multiple subunits.

Sigma factor is a specificity factor. The holoenzyme, because of sigma, is able to initiate RNA synthesis at specific locations in DNA called promoters. The holoenzyme binds tightly to promoters in DNA.

The Prokaryotic Promoters

When RNA polymerase holoenzyme interacts with DNA, it "searches" for (that is, it diffuses until it finds) a promoter. The enzyme binds to the promoter to form first a loose association (the closed promoter complex) and then a much tighter association (the open promoter complex) in which the helix has melted in preparation for copying. The –10 sequence is also known as the Pribnow box, in honor of its discoverer.

The structure of an actual promoter region from an E. coli gene that encodes a ribosomal RNA, rrnB: The core region in this promoter contains –10 and –35 sequences. In addition, this promoter has an UP element that makes it an even stronger promoter (one that binds RNA polymerase more tightly).
The elements shown in yellow are not part of the promoter because they don't interact directly with RNA polymerase. The Fis sites are enhancer sequences.

The Initiation of Transcription and the Role of Sigma

The event of holoenzyme binding and formation of the open promoter complex is the first step in transcription: initiation. The initiation of transcription depends uniquely on the presence of the sigma subunit in the holoenzyme. In fact, the entire enzyme complex makes very specific contacts with the promoter region of DNA. Several experiments lead to this conclusion.

Figure 6.27 summarizes a number of lines of evidence that indicate contact points between a promoter and RNA polymerase. Look at the legend to this figure in the book. The symbols and colors indicate what kinds of evidence were used. I'll try to interpret this for you here.

The bases that are circled (in yellow) are positions that, when polymerase is bound, are protected from modification by a methylating agent (dimethyl sulfate, DMS). The action of DMS is shown in Figure 6.18a. This can be interpreted as meaning that the polymerase is in close contact with these regions of the DNA.

In contrast, bases with a caret (^) above or below them indicate positions that are actually exposed to methylation by the binding of RNA polymerase (again, see Figure 6.18 for the reaction). This means that these bases are in regions of the DNA that are melted (opened) by the binding.

You can also interfere with polymerase binding by modifying the DNA. Methylation of certain bases (the ones with the small blue dots above or below them) or ethylation of certain phosphates in the backbone (the red arrowheads) can do this. Again, this is an indication that these positions are critical to RNA polymerase binding.

Finally, this figure indicates the area of the promoter that is unwound by the binding of polymerase (notice the open regions at the right, between the Pribnow box and the +1 position).
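
The idea that the holoenzyme recognizes a –35 sequence and a –10 sequence (the Pribnow box) can be illustrated with a small search routine. This is only a rough sketch: the consensus hexamers TTGACA and TATAAT and the 16 to 19 bp spacing are standard textbook values for E. coli promoters and are not given in these notes, and the function names and the mismatch threshold are hypothetical choices made for the illustration. Real promoter recognition is far less all-or-nothing than this.

```python
# Textbook consensus hexamers for E. coli promoters (assumed, not from the notes).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"  # the Pribnow box


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=1, spacers=range(16, 20)):
    """Return (start of -35 box, start of -10 box, spacer length) for each hit.

    Positions are 0-based indices into seq. A hit requires both hexamers to
    be within max_mismatch of consensus and separated by an allowed spacer.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: a perfect -35 box, 17 bp spacer, perfect -10 box.
    demo = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGGG"
    print(find_promoters(demo))
```

A promoter that matches the consensus more closely would, in this simplified picture, correspond to a stronger promoter; elements such as the UP element are not modeled here.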

Structure of Sigma Factor

Sigma factor itself is a protein that varies in different bacterial species from a 70 kDa protein (E. coli) to around 30 kDa in B. subtilis. The book goes into considerable detail in Chapter 6 about the fine structure of this protein. I only want to emphasize the role of the protein in initiation and its binding contacts with the promoter.

Figure 6.31 shows the contact points between sigma and a typical prokaryotic promoter (–10 and –35). The regions (4.2 and 2.4) refer to the conserved areas of sigma that are summarized in Figure 6.30. Notice that when you're reading this part of the book, the orientation of the protein in the two figures is opposite (amino to carboxy in Fig. 6.30, carboxy to amino in Fig. 6.31). The arrows going from the protein to the DNA have next to them the specific amino acids in the sigma polypeptide that interact with particular bases. For instance, arginine 588 (R588) in region 4.2 interacts with the G in the –35 region, while arginine 96 (R96) interacts with the T in the Pribnow box.

An experiment is described in the text, again using the filter binding assay (Figure 5.30) to detect protein/DNA interaction. Figure 6.32 shows the results of this experiment. The experiment measures the retention on a nitrocellulose filter of radiolabeled DNA complexed with a fragment of sigma. The sigma fragment contains only the 4.2 region (not the 2.4 region). The DNA contains a promoter element. The tac promoter is an artificial construct with the –10 region of the lac operon promoter and the –35 region of the trp operon promoter. The two experiments shown in the figure are designed to see what kind of DNA can compete with this sigma fragment for binding to this piece of DNA.
In Figure 6.32a (the left panel), we see that pTac DNA competes very well (notice how steeply the red curve goes down with increasing amounts of unlabeled tac promoter) while DNA from which the entire promoter has been deleted (ΔP) does not compete very well (the blue curve).

In Figure 6.32b (the right panel), the experiment focuses on the specific role of the two DNA regions. We see that when the competing DNA is pTac with the –10 region removed (Δ10) it works as well as pTac itself (compare the red curves in the two panels). However, when the competing DNA is pTac with the –35 region removed (Δ35) it works no better than the DNA with the entire promoter removed (compare the blue curves in the two panels). This is a nice experiment to demonstrate the specificity of binding of a particular region of protein to a particular region of DNA.

Transcription: Elongation and Termination

The RNA polymerase moves along the DNA template, adding nucleotides to the growing RNA chain in accordance with the sequence of the template strand. Figure 6.47 in your text has this summary of the elongation process displayed graphically. Unfortunately there is an error in the figure that I must correct. Remember that all RNA polymerization reads the template strand 3' to 5' and synthesizes the RNA strand 5' to 3'. The figure does not have the ends labeled, so it's a little hard to tell what's happening. Here's my corrected version of the same figure. Notice that I have included the ends of the respective DNA and RNA molecules.

The process of elongation does not require the sigma factor. Sigma dissociates from the enzyme shortly after elongation begins (after about 10 nucleotides have been polymerized). Elongation does, however, require the other subunits of the RNA polymerase.
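
The direction rule above (the template strand is read 3' to 5' while the RNA grows 5' to 3') is easy to get backwards, so here is a minimal sketch of it. The function name and the example sequence are hypothetical and chosen only for illustration; the input is the template strand written in the usual 5' to 3' orientation.

```python
# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3):
    """Return the RNA, written 5'->3', copied from a DNA template strand.

    The template is supplied 5'->3'. Because the polymerase reads it
    3'->5', we walk the string from its last base to its first and add
    the complementary ribonucleotide at each step.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


if __name__ == "__main__":
    template = "TACGGT"           # hypothetical template strand, 5'->3'
    print(transcribe(template))   # RNA written 5'->3'
```

Note that the RNA has the same sequence as the non-template DNA strand, with U in place of T.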
In our table of properties of the various subunits, we saw that α, β, and β' each had distinct roles to play.

To show that the beta-prime subunit is the one responsible for DNA binding, another kind of protein affinity experiment can be done. In this one, the investigators measured the ability of the protein to follow the DNA in an electrophoretic separation. They showed (Figure 6.43) that it was only the beta-prime subunit that followed the DNA (i.e., was always found in the same place as the DNA after electrophoresis).

The polymerase moves down the DNA double strand and produces the transcript. The topology of this motion is quite interesting and there is no general agreement about how it happens. Two models are presented in Figure 6.48.

In the first model, the polymerase revolves around the DNA. Notice that the RNA strand becomes wrapped around the DNA, not to mention the energy required to spin the enzyme. This model has no experimental support, yet it provides a picture in which distortion of the helix is not a problem.

The second model has the enzyme moving down the helix without revolving. This requires that the helix opens in front (melts) and rewinds behind (reanneals). This might introduce topological strain, depending on how many base pairs are opened. If it's more than a single turn of the helix (10 bp) then strain is introduced. This would require the action of strain-relieving enzymes (topoisomerases) so that transcription can proceed. Experimental evidence in support of this model does exist, using topoisomerase mutants. We have discussed topoisomerases already in the course when we talked about DNA replication.

Termination of Transcription

There are two ways in which transcription in the prokaryote is specifically terminated: rho-independent termination and rho-dependent termination. Rho is a protein factor that, as we shall see, has a role in the second type of termination. We will discuss them in this order.

Rho-independent termination

Figure 6.49 describes how rho-independent termination is thought to occur. The polymerase transcribes a region of the DNA whose transcript folds into a stem-loop, which causes the RNA polymerase to pause immediately after the loop forms. This means that the only thing holding the RNA to the template is the AU-rich sequence immediately downstream. AU base pairs have a very low melting temperature, at or near room temperature. Therefore, the RNA leaves the template at this point and the RNA polymerase falls off, resulting in termination. We will see a classic example of this when we look at attenuation in the tryptophan operon in the next module.

Rho-dependent termination

Again, in this model the RNA polymerase stalls at a stem-loop structure. This time, however, the termination is caused by the action of an ATP-utilizing protein, rho factor. Rho is a 60 kD protein that forms a hexamer. This complex binds to the RNA at a specific site and, using the energy of ATP hydrolysis, moves along the RNA, acting somewhat like a helicase. Rho unwinds the RNA from the template. The figure on the left is Figure 6.54 from your text. On the right is my version of this, which does not show the stem-loop, but does have the size of rho indicated.

Prokaryotic Transcription: Phage Strategies (Chapter 11)
http://www.blc.arizona.edu/marty/411/Modules/mod10.html

The infection of a prokaryotic cell by a virus often alters the cell's ability to transcribe mRNAs. Viruses of bacterial cells are called "bacteriophage" or simply "phage." Three kinds of viral alterations are produced by infection of a host cell with a bacteriophage.

Phage-Specific Alteration                                              Example
Phage encodes new sigma factors                                        B. subtilis phage SPO1, coliphage T4
Phage encodes new RNA polymerase                                       Coliphage T7
Phage encodes repressors and modifiers of transcription termination    Coliphage lambda

The general structures of two of these bacteriophages are shown in the figures below:

Phage T4 infecting its host E.
coli cell
Drawing of phage lambda

Phage SPO1: Alteration of the Host RNA Polymerase

After a host cell is infected by the phage SPO1, a series of alterations occurs that changes the transcription specificity of the host RNA polymerase. There are three stages of viral gene expression that can be observed. Each of these uses a different sigma factor in conjunction with the host core enzyme.

Early transcription takes place for the first 5 minutes of the infection.
Middle transcription occupies the time from 5 to 10 minutes after infection.
Late transcription can be seen from 10 minutes to the end of the infectious cycle.

The early transcription of SPO1 genes is carried out by the host holoenzyme, with the normal sigma factor. This transcription takes place from the time of infection until about 5 minutes after infection. A number of early genes are transcribed, among which is the gene for the early protein gp28.

The job of the viral protein gp28 is to take over the cell. It does this by displacing the normal sigma factor from the host RNA polymerase. This new "holoenzyme" now transcribes exclusively the phage middle genes and will no longer transcribe host genes. It likely does this because the phage has promoters with different sequences that are recognized only by gp28.

Among the products of the genes transcribed during this middle period of the infection are the proteins gp33 and gp34. These proteins constitute a second phage sigma factor. They now alter the polymerase so that it can transcribe only the phage late genes and no host genes.

The sum of these changes is that the cell is converted to a factory for producing new virus particles. Notice that no host proteins can be produced after the switch from early to middle transcription.
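
The SPO1 timetable described above amounts to a lookup from time after infection to the sigma factor directing transcription. The sketch below encodes just that. The function name is hypothetical, and treating the 5 and 10 minute marks as sharp, half-open boundaries is a simplifying assumption; in the cell the switches are gradual.

```python
def spo1_stage(minutes_after_infection):
    """Return (stage, sigma factor, host genes transcribed?) for SPO1.

    Boundaries follow the approximate times given in the notes:
    early = first 5 minutes, middle = 5 to 10 minutes, late = after 10.
    """
    t = minutes_after_infection
    if t < 0:
        raise ValueError("time after infection cannot be negative")
    if t < 5:
        # Host holoenzyme with its normal sigma factor.
        return ("early", "host sigma", True)
    if t < 10:
        # gp28 has displaced the host sigma factor.
        return ("middle", "gp28", False)
    # gp33 and gp34 together act as the second phage sigma factor.
    return ("late", "gp33 + gp34", False)


if __name__ == "__main__":
    for t in (2, 7, 15):
        print(t, spo1_stage(t))
```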

The sigma factors involved in SPO1 transcription:

Protein       Size              Effect on host transcription   Effect on phage transcription
sigma         43 kDa            transcribed normally           directs early transcription
gp28          26 kDa            not transcribed                directs middle transcription only
gp33 + gp34   13 kDa + 24 kDa   not transcribed                directs late transcription only

Phage T7: Encoding a Novel RNA Polymerase

Bacteriophage T7 solves the transcription problem in a more direct manner. Instead of modifying the host polymerase, it simply supplies its own!

The transcription of phage T7 is divided into three classes: I, II, and III. The first of these is the earliest in time and occurs from the beginning of infection up to about 3 minutes. One member of this class of genes is called gene 1. Gene 1 encodes an RNA polymerase.

The phage RNA polymerase has a very different specificity than the host enzyme. First, it is a single subunit rather than four. It does not have a sigma-like factor associated with it. In addition, it recognizes a different kind of promoter sequence.

The transcription of class II and class III genes by T7 RNA polymerase is very different from the activity of the host enzyme. The recognition sequences (promoters) of the host and T7 enzymes are shown below:

The T7 enzyme transcribes the class II and class III genes beginning from sequences that would not be recognized as promoters by the host enzyme. In this manner, the virus accomplishes its task of taking over the host cell and converting it to a factory for producing progeny virus.

Both bacteriophages SPO1 and T7 have infectious cycles that wind up with the lysis (rupture) of the host cell and the release of the newly made virus particles. They are therefore called lytic bacteriophages. There is another kind of infectious cycle, typified by bacteriophage lambda. Phages of this kind are called temperate.

Bacteriophage Lambda: A Complex Viral Operon

Bacteriophage lambda is the prototype of a group of phages that have a very interesting "lifestyle."
On the one hand, they can infect a cell and, just like SPO1 or T7, redirect the cell to become a factory for the production of new virus particles, resulting in the lysis of the cell. On the other hand, lambda can infect a cell, direct the integration of its genome into the DNA of the host, and reside there, replicating as a part of the chromosome, until such time as it activates, produces new virus particles, and lyses the cell. The first of these is called the lytic cycle, the second the lysogenic cycle. Phages such as lambda that have these two possibilities are called temperate phages.

Phage gene expression during these two cycles uses the host RNA polymerase. But, rather than expressing new sigma factors (as SPO1 does) or an entirely new polymerase (as T7 does), lambda uses operator-controlled promoters (operon-type control) along with another type of transcriptional regulation, antitermination.

The table summarizes the regulatory events during the lambda infectious cycle:

Event               Lambda Genes Expressed   Comment
Initial infection   cro, N                   Only N and cro are synthesized until the decision point is reached
Lytic pathway       cro, N, Q, late genes    cro predominates at operators; N and Q are antiterminators
Lysogenic pathway   cI, cII, cIII, int       cII and cIII collaborate to establish cI synthesis; after genome integration, only cI is expressed during maintenance of lysogeny

The lambda genome is presented as a linear array (the way the DNA exists in the intact phage) and as a circle (the form the DNA takes after it enters the host cell):

Lambda: The Initial Infection and the Lytic Pathway

Immediately after lambda DNA enters the host cell, the host RNA polymerase begins to transcribe the lambda genome from two promoters: the PL (leftward) promoter and the PR (rightward) promoter. These two transcripts encode two proteins: cro (the abbreviation is lambda jargon for "control of repressor and other things") and N. This phase is called immediate early transcription.
If the "decision" is made that this will be a lytic pathway, then the delayed early genes are expressed next. As the lytic infection proceeds, all of the phage late genes are expressed, including all of the proteins necessary to form the phage head and tail.

In order to switch from immediate early to delayed early, there is no change in the sigma factor or in the polymerase itself. Instead, lambda causes antitermination to occur at both rho-dependent and rho-independent termination sites. The antitermination is caused by two proteins, N and Q.

Consider the effect of N on transcription from one of the two promoters. Without N, transcription from this promoter makes only the immediate early mRNAs. In the presence of N, the transcription that was terminating at the site indicated in red now continues through, producing a polycistronic mRNA. The same thing happens for transcription from the rightward promoter, PR. As a result, a number of gene products are made, not just N and cro. One of these is Q, another antiterminator that eventually opens up full transcription of all the late genes.

N acts in conjunction with four host cell proteins and the RNA polymerase. There is a site in the transcript RNA called the N-utilization site (nut site) to which N binds. Once bound to the region called Box B (a region of stem-loop structure), N can interact with the polymerase through a protein called NusA (originally called the N-utilization substance A). Finally, three other host proteins (NusB, NusG, and S10) bind at Box A of the nut site.

One model for the action of N is that it prevents the pause of the polymerase that characterizes termination events. The stem-loop structure of the nut site Box B is some distance from the actual stem-loop at the termination site. The other antiterminator, Q, acts in a very different manner.

Lysogeny: Establishment and Maintenance

The lambda repressor is called cI ("c-one").
The names of the three genes originally identified as involved in lysogeny (cI, cII, and cIII) derive from the fact that mutants in these genes produced phage plaques on lawns of susceptible cells that were clear rather than the normal cloudy. (Plaques are clear areas where the virus has lysed the host cells.) cI is a typical repressor protein; its function is to bind to operator sequences and prevent transcription.

During the initial steps that establish lysogeny, cI must be made. The promoter for cI mRNA transcription is called PRE (promoter for repressor establishment). This promoter is unusual, and the host RNA polymerase will not transcribe from it without help. The phage protein cII causes the polymerase to bind to PRE and transcribe mRNA for cI.

cI mRNA is transcribed in the leftward direction from PRE. This means that the mRNA contains sequences that are complementary to cro mRNA (so-called "antisense" RNA).

cII itself must be protected from proteolytic degradation by a host enzyme called HflA. This protection is afforded by the phage protein cIII.

When everything is right, cI is made and lysogeny is established. During this time the phage genome integrates into the host chromosome, catalyzed by a phage enzyme called integrase. In the lysogenic state, the phage genome is maintained in a quiescent state, expressing only one mRNA, the mRNA for cI, the repressor. Transcription of this mRNA now occurs from the promoter PRM (for "repressor maintenance").

The lambda repressor binds to DNA as a dimer. There are two operator sites, OR and OL. Binding at the OR and OL sites cuts off transcription of N and cro.

A closer look at OR reveals that the operator region can be subdivided into three operators, OR1, OR2, and OR3. In fact, cI binds to both OR1 and OR2, and binding to OR2 actually stimulates the transcription of the cI mRNA. In this sense, cI acts as both a positive and a negative regulator of gene expression.
cI negatively regulates cro and positively regulates its own transcription.

cI binds to the three operator regions in an order of preference. It binds most tightly to OR1, next to OR2, and lastly to OR3. This order of binding will be critical to understanding the decision between lytic and lysogenic pathways.

Once lysogeny is established, the lambda DNA has integrated into the chromosome, the only viral protein being expressed is cI, and the viral DNA (now called a prophage) replicates along with its host. In fact, if a cell in this state (called a lambda lysogen) is subsequently infected by another lambda, nothing will happen. The second virus (called the super-infecting virus) is unable to make anything but the cI protein.

Lambda Wars: A Battle Between cI and Cro Determines Lysogenic or Lytic Pathways

During the earliest stage of infection with lambda, when the "decision point" is being reached (a reaction to the conditions), a competition takes place between cro and cI. cro is made right away; so are the two proteins cII (from PR) and cIII (from PL) after antitermination by N takes place. So, early on, both cro and cI may be present at some level. What happens?

The answer seems to be found at the OR operator. Both cI and cro bind to this operator. However, they bind differently: the two proteins differ in the order in which they occupy the three sites. The order is as follows:

cI: OR1, then OR2, then OR3
cro: OR3, then OR2, then OR1

If cI happens to get to its preferred binding sites first, it captures the operator, prevents cro synthesis, and stimulates its own synthesis. However, if cro gets to its preferred site first, it captures the operator and prevents synthesis of cI (it does not stimulate its own synthesis, however).

What can lead to different levels of cro and cI? One answer is the environmental state of the cell. Two examples will help in understanding this.
First, when the cell undergoes severe DNA damage from radiation (say, ultraviolet light), a repair system called the SOS response is triggered. The SOS response activates a protease called the recA protease. This enzyme can destroy the lambda repressor cI. This means that cro gets a chance to win. Therefore, under this condition, the phage goes into the lytic pathway, even if it has been in the lysogenic state. We say that UV induces the phage to enter the lytic pathway.

A second example involves the nutritional state of the cell. Remember from above that the cII protein is required for the host RNA polymerase to transcribe the cI mRNA from the promoter PRE. Lambda protein cIII protects cII from destruction by the cellular protease, HflA. This protease is sensitive to the glucose level in the environment of the cell. Low glucose results in high cAMP. The HflA protease is activated by high cAMP and is able to overcome the inhibition by cIII. Therefore, when the cell is in a poor nutritional state (signaled by low glucose) the lytic pathway is favored over the lysogenic.

These two examples show that lambda has evolved to make the lytic vs. lysogenic "decision" based upon the fitness of the cell as a long-term host. If the cell is healthy (e.g., no DNA damage and in a good nutritional state) the virus can make a home for its genome there (lysogeny). When the state of the cell does not look good for the long term (UV damage or poor growth medium) the virus replicates quickly, making progeny and lysing the cell.
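
The chain of reasoning in the two examples can be written out as a small piece of logic. This is a deliberately crude sketch, not a model of the real system: the function and variable names are hypothetical, each step is treated as all-or-nothing, and the outcome is a hard yes/no, whereas the notes say only that one pathway is "favored" over the other.

```python
def lambda_pathway(uv_damage, glucose_low):
    """Return the pathway favored under the conditions, per the notes.

    uv_damage:   True if the cell has severe DNA damage (SOS response on).
    glucose_low: True if the cell is in a poor nutritional state.
    """
    # Example 1: UV damage -> SOS response -> recA protease destroys cI.
    ci_destroyed = uv_damage

    # Example 2: low glucose -> high cAMP -> HflA overcomes cIII and
    # degrades cII, so cI cannot be transcribed from PRE.
    hfla_active = glucose_low
    cii_stable = not hfla_active
    ci_made = cii_stable

    # cI captures the operator only if it is made and survives;
    # otherwise cro wins.
    ci_wins = ci_made and not ci_destroyed
    return "lysogenic" if ci_wins else "lytic"


if __name__ == "__main__":
    for uv in (False, True):
        for low in (False, True):
            print(f"UV damage={uv}, low glucose={low}: {lambda_pathway(uv, low)}")
```

Only the healthy, well-fed cell comes out as a candidate for lysogeny, which is the point of the two examples.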