Transcription: DNA -> RNA

Gene -> (transcription) -> mRNA -> (translation at ribosome) -> Protein

mRNA is the only type of RNA that is translated into protein.

DNA -> (transcription) -> RNA: 1) tRNA 2) rRNA 3) snRNA 4) mRNA
mRNA -> (translation) -> Protein

©1998 by Alberts, Bray, Johnson, Lewis, Raff, Roberts, Walter. Published by Garland Publishing, a member of the Taylor & Francis Group.

The Population of mRNA Molecules in a Typical Mammalian Cell
----------------------------------------------------------------------------------
Class          Copies/cell (each species)   Species/class   Total mRNA molecules/class
Abundant       12,000                     x 4             = 48,000
Intermediate   300                        x 500           = 150,000
Scarce         15                         x 11,000        = 165,000
----------------------------------------------------------------------------------
This division of mRNAs into just three discrete classes is somewhat arbitrary, and in many cells a more continuous spread in abundances is seen. However, a total of 10,000 to 20,000 different mRNA species is normally observed in each cell, most species being present at a low level (5 to 15 molecules per cell). Most of the total cytoplasmic RNA is rRNA, and only 3% to 5% is mRNA, a ratio consistent with the presence of about 10 ribosomes per mRNA molecule. This particular cell type contains a total of about 360,000 mRNA molecules in its cytoplasm.
----------------------------------------------------------------------------------

The key steps in transcription (DNA -> RNA)
– Initiation
– Elongation
– Termination

Reaction mechanism of RNA polymerase
1. RNA polymerases are DNA-dependent enzymes.
2. RNA Pols polymerize in the 5’ to 3’ direction (rNTPs are added only to the 3’ end).
3. The 3’ OH of the chain reacts with the α-phosphate of the incoming rNTP, liberating pyrophosphate.
4. The added ribonucleotide follows Watson-Crick pairing rules, determined by the template strand.
5. RNA polymerases don’t need a primer, but do need dsDNA.
6. RNA polymerase lacks exonuclease activity, so it cannot proofread and is much more error-prone than DNA polymerase.
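Points 2 to 4 of the reaction mechanism can be illustrated with a few lines of Python. This is only a sketch: the function name transcribe, the PAIRING table, and the example sequence are illustrative choices, not part of the lecture. The template is written 3’ to 5’, so reading it left to right yields the RNA in the 5’ to 3’ direction in which it grows.

```python
# Watson-Crick pairing between a DNA template base and the incoming ribonucleotide.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') templated by a DNA strand written 3'->5'.

    Hypothetical helper for illustration; it models base pairing only,
    not promoters, initiation, or termination.
    """
    rna = []
    for base in template_3_to_5.upper():
        if base not in PAIRING:
            raise ValueError(f"unexpected base {base!r} in template")
        # Each new ribonucleotide is appended to the 3' end of the growing chain.
        rna.append(PAIRING[base])
    return "".join(rna)


print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Because the function only appends, it mirrors the rule that ribonucleotides are added at the 3’ end; there is no step that removes a misincorporated base, which corresponds to the lack of proofreading in point 6.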
Schematic representation of the subunit structure of yeast nuclear RNA polymerases and comparison with E. coli RNA core polymerase.

Subunit structure of purified nuclear RNA polymerases (nRNAP)
• All 3 have 10-14 subunits.
• Subunits range from 10 to 220 kDa.
• All 3 have 2 very large (>125 kDa) subunits and several smaller ones.
• Several of the smaller subunits (5 in yeast) are common to all 3 Pols.

Where is transcription initiated?
• Promoters are sequences in the DNA just upstream of transcripts (coding sequences) that define the sites of initiation.
[Diagram: promoter upstream of the RNA, 5’ to 3’]
• The role of the promoter is to attract RNA polymerase to the correct start site so transcription can be initiated.

S1 mapping of the 5’ end of an RNA transcript
A 5’ end-labeled single-stranded DNA probe is prepared from the template strand. After hybridization to RNA and digestion with S1 nuclease, the size of the protected probe tells approximately where transcription started.

High-resolution analysis of the 5’ end of an RNA transcript by primer extension.
The primer is an end-labeled DNA oligonucleotide (~20 nt) that is complementary to a sequence in the RNA ~150 nt from the expected 5’ end.
Lane E: extended DNA product.
Lanes A, C, G, T: sequence ladder generated with the same oligo primer, but on the corresponding cloned DNA.

Mapping DNA-Protein interactions
Biochemical approaches to defining promoter sites
• DNase footprinting can be used to identify sites where RNA polymerase is in close contact with DNA.
How can we generate this end-labeled DNA?

Sample of a DNase I footprinting gel (for a DNA-binding protein), with the footprint marked.
Lanes 2-4 had increasing amounts of the DNA-binding protein (lambda protein cII); lane 1 had none.

Dimethylsulfate (DMS) Footprinting
1. End-label the DNA fragment.
2. Bind protein.
3. Treat with dimethylsulfate, which methylates purine bases.
4. Partially cleave the DNA by depurinating the methylated bases (piperidine).
5. Separate the DNA fragments on DNA sequencing gels.

Sample of DMS footprinting.
Lanes 1 and 4 had no protein; lanes 2 and 3 had 2 different amounts of protein. Protein binding protects some purines from modification by DMS, but it can stimulate modification of others (where the helix is distorted or partially melted).

An electrophoretic mobility shift assay (EMSA)
The principle of the assay is shown schematically in (A). In this example an extract of an antibody-producing cell line is mixed with a radioactive DNA fragment containing about 160 nucleotides of a regulatory DNA sequence from a gene encoding the light chain of the antibody made by the cell line. The effect of the proteins in the extract on the mobility of the DNA fragment is analyzed by polyacrylamide-gel electrophoresis followed by autoradiography. The free DNA fragments run rapidly to the bottom of the gel, while those fragments bound to proteins are retarded; the finding of six retarded bands suggests that the extract contains six different sequence-specific DNA-binding proteins (indicated as C1-C6) that bind to this DNA sequence. (For simplicity, any DNA fragments with more than one protein bound have been omitted from the figure.) In (B) the extract was fractionated by a standard chromatographic technique (top), and each fraction was mixed with the radioactive DNA fragment, applied to one lane of a polyacrylamide gel, and analyzed as in (A). (B, modified from C. Scheidereit, A. Heguy, and R.G. Roeder, Cell 51:783-793, 1987. © Cell Press.)

DNA affinity chromatography
HOW CAN WE PURIFY A DNA-BINDING PROTEIN?

Bioinformatics approaches to defining promoter sites
[Diagram: promoter upstream of the RNA, 5’ to 3’]
• Comparison of known start sites to identify consensus sequences.

TRANSCRIPTION IN BACTERIA
• Regions of similarity are found around 10 and 35 bases before the start site of transcription.
• DNase protection shows that RNA polymerase can bind to these same regions.
• Mutations of these sites can lead to the elimination or reduction of transcriptional initiation at a promoter.
• Differences in these sites control the relative rates of expression of different genes.
• Strong promoters have sites that are very similar to the consensus sequence, while weak promoters show many differences.

Schematic diagram of the steps in the initiation of RNA synthesis (DNA transcription) catalyzed by the RNA polymerase.
The enzyme first forms a closed complex in which the two DNA strands remain fully base-paired. In the next step the enzyme catalyzes the opening of a little more than one turn of the DNA helix to form an open complex, in which the template DNA strand is exposed for the initiation of an RNA chain. The polymerase containing the bound σ subunit, however, behaves as though it is tethered to the promoter site: it seems unable to proceed with the elongation of the RNA chain and on its own frequently synthesizes and releases short RNA chains. As indicated, the conversion to an actively elongating polymerase requires the release of initiation factors (the sigma subunit in the case of the E. coli enzyme) and generally involves the binding of other proteins that serve as elongation factors.

How does sigma associate with a promoter?
The σ subunit appears to have two segments that contact the bases of the DNA molecule through the major groove. It does this while it is associated with core enzyme.

How does σ70 promote binding of RNA polymerase to promoters?
• σ70 lowers the general affinity of RNA polymerase for DNA. As a result, RNA polymerase is able to move quickly along DNA scanning for promoter sites.
• σ70 can bind specifically to promoters (the -10 and -35 regions). This allows the holoenzyme to bind tightly to promoters when they are encountered.
• RNA polymerase searches for promoter sites by moving along the DNA rather than by searching randomly throughout the cell.

The elongation stage
• σ70 dissociates from the core RNA polymerase after initiation occurs.
This yields:
• In the absence of σ70, RNA polymerase binds ssDNA tightly and is highly processive.

4. Termination
Two types of termination events in E. coli
– Rho independent
– Rho dependent

An inverted repeat (IR) in the DNA produces a stem-loop in the RNA. Stem-loop formation competes with the RNA-DNA hybrid (open complex), causing the DNA helix to re-form (closed complex).

Rho-dependent termination
• Some mRNAs synthesized by RNA polymerase in vitro fail to terminate at the normal in vivo position.
– This suggested that additional proteins might be required for termination at these sites.
– The missing factor was identified and named rho.

Rho in action
Rho is a hexameric helicase. Rho binds transcripts at stretches of ~100 nt that are free of secondary structure and rich in cytosines. It can unwind RNA-DNA hybrids.

TRANSCRIPTION IN EUKARYOTES
Studies of RNA synthesis by isolated nuclei
• RNA synthesis by isolated nuclei indicated that there were at least 2 polymerases, one of which was in the nucleolus and synthesized rRNA.
– rRNA often has a higher G-C content than other RNAs; a G-C rich RNA fraction was preferentially synthesized at low ionic strength with Mg2+.
– Another, less G-C rich RNA fraction was preferentially synthesized at higher ionic strength with Mn2+.

Separation and identification of the three eukaryotic RNA polymerases by column chromatography.
A protein extract from the nuclei of cultured frog cells was passed through a DEAE Sephadex column to which charged proteins adsorb differentially. Adsorbed proteins were eluted (black curve) with a solution of constantly increasing NaCl concentration. Fractions containing the eluted proteins were assayed for the ability to transcribe DNA (red curve) in the presence of the four ribonucleoside triphosphates. The synthesis of RNA by each fraction in the presence of 1 µg/ml of α-amanitin also was measured (blue curve). [See R. G. Roeder, 1974, J. Biol. Chem. 249:241.]
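Returning to rho-independent termination in E. coli: the inverted repeat shows up in the transcript as two segments that are reverse complements of each other, separated by a short loop. The Python sketch below searches an RNA sequence for such perfect stem-loops. The function names, the stem length of 6, the loop range of 3 to 8 nt, and the example sequence are all illustrative assumptions, not values from the lecture. Real terminators tolerate mismatches and are usually followed by a run of U residues, which this sketch does not check.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (A-U and G-C pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_stem_loops(rna: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, left_arm, loop, right_arm) for every perfect inverted repeat.

    Hypothetical helper: 'stem', 'min_loop' and 'max_loop' are assumed
    parameters chosen for illustration.
    """
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop              # start of the candidate right arm
            right = rna[j:j + stem]
            if len(right) < stem:            # ran off the end of the sequence
                break
            if right == target:
                hits.append((i, left, rna[i + stem:j], right))
    return hits


# Made-up transcript: a G-C rich stem, a 4-nt loop, then a U-rich tail.
example = "GGGCCCGCUUUUGCGGGCUUUUUU"
for start, left, loop, right in find_stem_loops(example):
    print(start, left, loop, right)
```

A scan like this finds only candidates; whether a given stem-loop actually terminates transcription is an experimental question.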
Determining roles for each polymerase
• Purified polymerases don’t transcribe DNA specifically, so nuclear fractions were used.
• Also useful were two transcription inhibitors:
1. α-amanitin: from a mushroom; inhibits Pol II, and Pol III at higher concentrations.
2. Actinomycin D: general transcription inhibitor; binds DNA and intercalates into the helix; prefers G-C rich regions (like rRNA genes).

Drugs that inhibit RNA polymerases
[Structures of α-amanitin and actinomycin D]

Drug sensitivities

RNA Polymerase I:
1. Not inhibited by α-amanitin, but inhibited by low concentrations of actinomycin D.
2. RNA produced in the presence of α-amanitin could be competed by rRNA for hybridization to (rat) DNA.
Conclusion: Pol I synthesizes the rRNA precursor (45S pre-rRNA -> 28S + 18S + 5.8S rRNAs).

RNA Polymerase II:
1. Actinomycin D, at low concentrations, did not inhibit synthesis of heterogeneous nuclear RNA (hnRNA).
2. α-amanitin inhibited synthesis of hnRNA in the nucleoplasmic fraction.
Conclusion: Pol II synthesizes hnRNA (mostly mRNA precursors).

RNA Polymerase III:
Synthesis of small abundant RNAs is inhibited only at high concentrations of α-amanitin.
Small RNAs: tRNA precursors, 5S rRNA, U6 (involved in splicing), and 7SL RNA (involved in protein secretion through the ER, part of the signal recognition particle).
Conclusion: Pol III synthesizes many of the small abundant cytoplasmic and nuclear RNAs.

α-amanitin:
Pol II: K0.5 = 0.02 µg/ml
Pol III: K0.5 = 0.20 µg/ml
Pol I: insensitive
actinomycin D:
Pol I most sensitive, but all three Pols inhibited at higher concentrations

HOW CAN WE MEASURE TRANSCRIPTIONAL ACTIVITY IN VIVO?

Nascent-chain (run-on) assay for transcription rate of a gene.
Isolated nuclei are incubated with 32P-labeled ribonucleoside triphosphates for a brief period. During this period RNA polymerase molecules that were transcribing a gene when the nuclei were isolated add 300 – 500 nucleotides to nascent RNA chains. Very little new initiation occurs.
By hybridizing the labeled RNA to the cloned DNA for a specific gene (A in this case), the fraction of total RNA produced from that gene (i.e., its relative transcription rate) can be measured. [See J. Weber et al., 1977, Cell 10:611.]

In vivo assay for transcription factor activity.
The assay system requires two plasmids. One plasmid contains the gene encoding the putative transcription factor (X protein). The second plasmid contains a reporter gene and one or more binding sites for X protein. Both plasmids are simultaneously introduced into host cells that lack the gene encoding X protein and the reporter gene. The production of reporter-gene RNA transcripts is measured; alternatively, the activity of the encoded protein can be assayed. If reporter-gene transcription is greater in the presence of the X-encoding plasmid, then the protein is an activator; if transcription is less, then it is a repressor. By use of plasmids encoding a mutated or rearranged transcription factor, important domains of the protein can be identified.

Cis-acting DNA sequences can be identified by mutational analysis.
Use of linker scanning mutations to identify transcription-control elements

General pattern of cis-acting control elements that regulate gene expression in yeast and metazoans
(a) Genes of multicellular organisms contain both promoter-proximal elements and enhancers as well as a TATA box or other promoter element. The latter positions RNA polymerase II to initiate transcription at the start site and influences the rate of transcription. Enhancers may be either upstream or downstream and as far away as 50 kb from the transcription start site. In some cases, promoter-proximal elements occur downstream from the start site as well.
(b) Most yeast genes contain only one regulatory region, called an upstream activating sequence (UAS), and a TATA box, which is ≈90 base pairs upstream from the start site.
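The decision rule of the two-plasmid reporter assay described above reduces to comparing reporter output with and without the X-encoding plasmid. The Python sketch below makes that comparison explicit. The function name classify_factor, the 2-fold cutoff, and all numbers are hypothetical; a real experiment would normalize for transfection efficiency and apply a statistical test rather than a fixed cutoff.

```python
from statistics import mean


def classify_factor(with_x, without_x, min_fold=2.0):
    """Classify protein X from replicate reporter readings (arbitrary units).

    with_x    -- readings from cells given reporter plasmid + X-encoding plasmid
    without_x -- readings from cells given the reporter plasmid alone
    min_fold  -- assumed cutoff for calling an effect (illustrative only)
    """
    baseline = mean(without_x)
    if baseline <= 0:
        raise ValueError("baseline reporter signal must be positive")
    fold = mean(with_x) / baseline
    if fold >= min_fold:
        return "activator"
    if fold <= 1 / min_fold:
        return "repressor"
    return "no clear effect"


# Made-up replicate measurements.
print(classify_factor([80, 95, 90], [10, 12, 9]))   # activator
print(classify_factor([3, 4, 5], [20, 22, 18]))     # repressor
```

The same comparison, repeated with plasmids encoding mutated or rearranged versions of X, is how the assay maps which domains of the protein are needed for its effect.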
Pol II basic promoter elements
[Figure: promoter map from -100 to +100 marking the core promoter, the TATA box, and the Inr at +1]

TATA box
Defines where transcription starts. Also required for efficient transcription at some promoters. Some class II promoters don’t have a TATA box. Transcription starts at a purine ~25-30 bp from the TATA box.

The arrows indicate transcription start sites as determined by S1 mapping and primer extension. Normal promoter. SV40 early promoter analyzed in vivo.

The TATA box is also important for transcription efficiency at some promoters.
Rabbit globin promoter, tested in HeLa cells, and assayed by S1 mapping of the transcript 5’ end.

How the different base pairs in DNA can be recognized from their edges without the need to open the double helix
The four possible configurations of base pairs are shown, with hydrogen bond donors indicated in blue, hydrogen bond acceptors in red, and hydrogen bonds themselves as a series of short parallel red lines. Methyl groups, which form hydrophobic protuberances, are shown in yellow, and hydrogen atoms that are attached to carbons, and are therefore unavailable for hydrogen bonding, are white. Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

A DNA recognition code.
The edge of each base pair, seen here looking directly at the major or minor groove, contains a distinctive pattern of hydrogen bond donors, hydrogen bond acceptors, and methyl groups. From the major groove, each of the four base-pair configurations projects a unique pattern of features. From the minor groove, however, the patterns are similar for G-C and C-G as well as for A-T and T-A.

The binding of a gene regulatory protein to the major groove of DNA.
Only a single type of contact is shown. Typically, the protein-DNA interface would consist of 10 to 20 such contacts, involving different amino acids, each contributing to the binding energy of the protein-DNA interaction.
Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

The DNA-binding helix-turn-helix motif.
The motif is shown in (A), where each white circle denotes the central carbon of an amino acid. The carboxyl-terminal α-helix (red) is called the recognition helix because it participates in sequence-specific recognition of DNA. As shown in (B), this helix fits into the major groove of DNA, where it contacts the edges of the base pairs. Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

Some helix-turn-helix DNA-binding proteins
All of the proteins bind DNA as dimers in which the two copies of the recognition helix (red cylinder) are separated by exactly one turn of the DNA helix (3.4 nm). The second helix of the helix-turn-helix motif is colored blue. The lambda repressor and cro proteins control bacteriophage lambda gene expression, and the tryptophan repressor and the catabolite activator protein (CAP) control the expression of sets of E. coli genes. Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

Zinc Finger Protein
This protein belongs to the Cys-Cys-His-His family of zinc finger proteins, named after the amino acids that grasp the zinc. This zinc finger is from a frog protein of unknown function. (A) Schematic drawing of the amino acid sequence of the zinc finger. (B) The three-dimensional structure of the zinc finger is constructed from an antiparallel β-sheet (amino acids 1 to 10) followed by an α-helix (amino acids 12 to 24). The four amino acids that bind the zinc (Cys 3, Cys 6, His 19, and His 23) hold one end of the α-helix firmly to one end of the β-sheet. (Adapted from M.S. Lee et al., Science 245:635-637, 1989. © 1989 the AAAS.) Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

DNA binding by a zinc finger protein
(A) The structure of a fragment of a mouse gene regulatory protein bound to a specific DNA site.
This protein recognizes DNA using three zinc fingers of the Cys-Cys-His-His type arranged as direct repeats. (B) The three fingers have similar amino acid sequences and contact the DNA in similar ways. In both (A) and (B) the zinc atom in each finger is represented by a small sphere. (Adapted from N. Pavletich and C. Pabo, Science 252:810-817, 1991. © 1991 the AAAS.) Figure and text modified from Alberts et al., Molecular Biology of the Cell (1994).

Required reading: Blau et al., Mol Cell Biol 1996, 16 (5): 204
Three Functional Classes of Transcriptional Activation Domains