Eukaryotic DNA Polymerases

The DNA polymerases of eukaryotes are in general less well understood than those of prokaryotes, but much has recently been learned about their functions and activities. DNA polymerases from yeast and mammalian cells (especially mouse and human) are among the best studied. Eukaryotic cells have at least four major nuclear DNA polymerases: α, β, δ, and ε. Polymerase γ is found in mitochondria, although it is encoded by a nuclear gene. Plant chloroplasts also contain their own DNA polymerase, which appears to be similar to γ.

The six polymerases mentioned so far are involved in DNA replication, DNA repair, or both. These "classical" pols can be characterized as accurate. Most of them have been known for some time; for example, pol α was discovered in the late 1950s, and pols δ and ε in the 1980s. Since 1999, however, at least 10 new cellular pols have been discovered. These pols are mostly inaccurate. They are not involved in chromosomal replication, but in other essential processes such as inaccurate translesion repair and somatic hypermutation of immunoglobulin alleles. We will discuss these later, when we cover repair processes. Here we will focus on the classical nuclear DNA polymerases, as well as some interesting viral pols.

Because of the complexity of cellular genomes and of cellular DNA replication, viral model systems have proved invaluable. Among the most useful of these is simian virus 40 (SV40). SV40 is a small mammalian virus with a circular duplex genome of about 5 kbp. It does not encode a polymerase and thus uses host DNA polymerases to replicate its genome. Both cell-free systems (cell extracts plus viral DNA templates) and reconstituted in vitro replication systems (viral DNA plus purified cellular proteins) have been studied. This work has allowed the identification of the DNA polymerases and accessory factors required for replicative DNA synthesis in host cells.
Other important viral model systems include the E. coli phages T4 and RB69, and the mammalian viruses herpes simplex virus (HSV) and adenovirus. These viruses encode their own pols and accessory replication proteins.

In addition to the human system and various viral models, important model organisms for cellular DNA synthesis include Saccharomyces cerevisiae (budding yeast), Schizosaccharomyces pombe (fission yeast), Xenopus laevis (African clawed frog), and mouse.

(Reviews: Wang, T.S.-F. (1996) Cellular DNA polymerases. In: “DNA Replication in Eukaryotic Cells” (M. DePamphilis, ed.) Cold Spring Harbor Laboratory Press; Hubscher, U., Maga, G., and Spadari, S. (2002) Eukaryotic DNA polymerases. Ann. Rev. Biochem. 71: 133-163.)

DNA polymerase α

A wealth of genetic and biochemical evidence indicates that this polymerase is required for chromosomal replication; for example:

• Temperature-sensitive (ts) Pol α mutants do not replicate DNA at the restrictive (non-permissive) temperature (cultured mouse cells and yeast). In yeast, Pol α function is required for cell viability and nuclear DNA replication; mutants are blocked in S phase of the cell cycle at the non-permissive temperature (Fig. 28).
• Pol α transcript, protein, and enzyme activity are present in cycling cells, but are low or undetectable in differentiated, quiescent cells (G0) in all systems examined.
• Depletion of Pol α by antibody precipitation blocks SV40 DNA replication in HeLa (human) cell extracts.
• Pol α is an essential component of the reconstituted in vitro SV40 replication system.

Pol α polymerase and primase activities

This unique enzyme has two distinct polymerase activities: a 5’→3’ DNA-dependent DNA polymerase, and a 5’→3’ DNA-dependent RNA polymerase. The RNA polymerase activity is a primase. Because of this, the enzyme is often referred to as Pol α:primase.
It is the only enzyme known to have both DNA polymerase and primase activities, and the only one capable of self-primed DNA synthesis on a previously unprimed ssDNA.

Pol α:primase is a heterotetramer (protein subunit sizes from human cells):

180 kDa -- DNA pol catalytic subunit
68 kDa -- structural; protein-protein interactions
48 kDa -- primase catalytic subunit
55 kDa -- primase subunit; bridge with DNA pol

The primase activity is not very processive. It incorporates rNMPs to synthesize short RNA primers, about 8-12 nt long, that are complementary to an ssDNA template. These oligoribonucleotides are called initiator RNA (iRNA). The substrates for the primase reaction are ssDNA (unprimed) and rNTPs. The primase will use dNTPs as substrate if rNTPs are not available, but rNTPs are preferred.

The DNA polymerase activity is also not very processive. Typically, Pol α adds about 30 dNMPs to the 3' end of the iRNA primer. (This DNA is sometimes called initiator DNA, or iDNA.) The substrates for the DNA polymerase reaction are a template primer (iRNA is the primer) and dNTPs.

The final product of both activities is an RNA/DNA primer with an average length of ~40 nt and the structure pppRNAn-p-DNAn.

Pol α does not have an intrinsic 3'→5' exonuclease activity and is not known to associate with one (no apparent editing activity). It also lacks a 5'→3' exonuclease activity. (Eukaryotic polymerases in general lack this activity.)

Physiological role of Pol α:primase

As indicated previously, the Pol α polymerase activity is known to be essential for replicative DNA synthesis. The same is true of the primase activity. In yeast, genetic analysis indicates that the primase subunit is required for cell viability and nuclear DNA replication.

In vivo, the primary function of Pol α:primase is to make short RNA/DNA primers for replicative DNA synthesis. It is probably not involved in significant primer elongation.
Accordingly, its DNA pol activity is not very processive and it does not associate with a clamp. Its inability to edit may not be a liability, because both the RNA and DNA portions of the primers are mostly removed and replaced during chromosomal replication (discussed later).

Why are RNA/DNA primers used in eukaryotes? (Recall that prokaryotes use primers that consist of RNA only, synthesized by DnaG primase.) Eukaryotic pols can generally use DNA or RNA primers, but eukaryotic clamp loaders generally prefer DNA primers.

DNA polymerase δ

Pol δ is another multisubunit nuclear enzyme. In Homo sapiens, it is composed of 4 subunits:

125 kDa -- pol catalytic subunit
66 kDa -- interaction with processivity factor; multimerization (?)
50 kDa -- structural; protein-protein interactions
12 kDa -- structural; protein-protein interactions

The catalytic subunit has 5’→3’ DNA polymerase activity and an intrinsic 3'→5' exonuclease activity. The smaller subunits appear to be involved in holding the multisubunit structure together (via protein-protein interactions). Addition of the 66 kDa subunit appears to result in dimerization of the tetrameric complex, so Pol δ, like Pol III, may also be a dimeric replicating machine.

Pol δ requires an associated 30 kDa protein, called proliferating cell nuclear antigen (PCNA), for full polymerase activity and processivity. Functional PCNA is a homotrimer (~90 kDa). The 66 kDa subunit is needed for interaction between Pol δ and PCNA, so unlike the situation in Pol III, the clamp does not interact directly with the catalytic subunit.

In the presence of PCNA, Pol δ is highly processive: PCNA functions as a sliding clamp that increases the processivity of Pol δ up to 50-fold. It is similar in structure to the β subunit of E. coli Pol III holoenzyme and to phage T4 gp45, although there is little sequence homology (Fig. 29).
In fact, the structures of PCNA, gp45, and β are virtually superimposable. The only major structural difference is that PCNA and gp45 are homotrimers, whereas β is a homodimer.

(Krishna, T.S.R., Kong, X.-P., Gary, S., Burgers, P.M., and Kuriyan, J. (1994) Crystal structure of the eukaryotic DNA polymerase processivity factor PCNA. Cell 79: 1233-1243; for a review of Pol III, Pol δ, T4, and T7 processivity factors, see: Kelman, Z., Hurwitz, J., and O’Donnell, M. (1998) Processivity of DNA polymerases: two mechanisms, one goal. Structure 6: 121-125.)

Replication factor-C (RF-C)

Like β and gp45, PCNA requires a clamp loader to become associated with the template primer. The matchmaker in this case is an associated protein complex known as replication factor C (RF-C). The RF-C complex consists of five subunits, which together load PCNA onto the template primer in an ATP-dependent reaction, forming a pre-initiation complex. Pol δ (via the 66 kDa subunit) then interacts with PCNA and the primer terminus to form an initiation complex. Thus Pol δ is a typical replicase complex, similar to Pol III holoenzyme.

Not surprisingly, proteins of the RF-C complex show functional and structural homology to the γ complex of Pol III and the simpler gp44/62 complex of T4. The reactions these complexes catalyze are very similar (Fig. 30). Because we have already discussed the γ complex in some detail, we will not explore RF-C in greater depth.

It is worth noting that in eukaryotic systems, RF-C and PCNA are regarded as pol accessory factors: for example, Pol δ, RF-C, and PCNA are sometimes referred to together as the Pol δ complex. By contrast, in prokaryotes (E. coli) the γ complex and β are considered to be part of Pol III holoenzyme. This probably has more to do with the fact that the Pol III holoenzyme components are more tightly associated, and can more easily be co-purified, than with any significant functional difference.

Physiological role of Pol δ
There is much evidence that Pol δ is essential for replication; for example:

• In yeast, the POL3 gene (which encodes the catalytic subunit) is essential for progression through the cell cycle. pol3 ts mutants are blocked in DNA synthesis and arrest in S phase at the non-permissive temperature.
• Complete SV40 DNA replication in vitro requires RF-C, PCNA, and Pol δ.
• Pol δ and PCNA levels are increased in proliferating cells (generally in parallel with Pol α levels). These proteins are not expressed in quiescent cells (G0). In yeast, expression of these proteins is increased at the G1/S boundary.

In vivo, the primary function of Pol δ is to elongate RNA/DNA primers laid down by Pol α (original observation from the SV40 system). Biochemical and genetic studies in yeast and mammalian cells suggest that Pol δ is also involved in some types of DNA repair synthesis.

DNA polymerase ε

Pol ε is another multisubunit nuclear DNA polymerase. The human pol has 4 subunits:

260 kDa -- pol catalytic subunit
59 kDa -- multimerization (?)
17 kDa -- structural; protein-protein interactions
12 kDa -- structural; protein-protein interactions

The Pol ε catalytic subunit is one of the largest polymerase catalytic subunits yet described. It also has 3'→5' exonuclease activity. The N-terminal portion contains the catalytically important residues. The C-terminal part consists of a 1,000 amino acid extension that is unique to Pol ε. This domain is known to interact with regulatory proteins, some of which are involved in cell cycle regulation (checkpoint control).

Pol ε is a reasonably processive enzyme that also associates with PCNA at a primer terminus. The significance of this association is not clear, because Pol ε does not appear to be greatly stimulated by PCNA.

The physiological role of Pol ε

The role of this enzyme in DNA replication is controversial.
Genetic analysis in yeast has revealed that POL2, the gene that encodes the large catalytic subunit, is essential for cell viability and DNA replication. It is necessary for the completion of S phase. This suggests that Pol ε is a replicative polymerase. Because of its high processivity and proofreading activity, it has been further suggested that Pol ε is involved in primer elongation, in addition to Pol δ. In support of this, Pol ε has been detected at the replication fork (along with Pol α and Pol δ) in mammalian cell extracts. Biochemical and genetic studies in yeast suggest that Pol ε is also involved in some DNA repair synthesis.

Other evidence argues against a major role for Pol ε in bulk DNA synthesis. Unlike Pol δ, Pol ε is not required for complete SV40 replication in vitro, and it is not an effective substitute for Pol δ in this system. Further, yeast deletion mutants lacking the entire N-terminal DNA polymerase domain of Pol ε are viable, suggesting that it is the C-terminal region of the protein that supplies an essential, non-redundant function. However, the N-terminal mutants accumulate DNA damage and show defects in cell cycle regulation of DNA synthesis.

Clearly, Pol δ and Pol ε activities are not equivalent. It is possible that these polymerases have different, specialized roles at the replication fork. Perhaps Pol ε plays some organizational role, or is involved in quality control or cell cycle control.

DNA polymerase β

This is the smallest and simplest of the classical eukaryotic polymerases; it is composed of a single ~40-48 kDa protein. Like most single-polypeptide enzymes, Pol β is not highly active and is not very processive. It has no intrinsic exonuclease activities. Its preferred template is duplex DNA with short gaps, although it can bind a nicked duplex and is capable of some limited displacement synthesis. Interestingly, however, it may also associate with PCNA.

Pol β expression is considered constitutive.
Expression levels are low and remain constant throughout the cell cycle, even in rapidly dividing cells. It is also expressed in quiescent cells. However, Pol β levels increase following treatment of cells with agents that damage DNA.

In yeast, Pol β is encoded by POL4. Yeast pol4 mutants are viable, indicating that Pol β is not essential for replicative synthesis. The mutants are, however, sensitive to certain mutagenic agents.

All of these observations suggest that Pol β is primarily involved in DNA repair.

Polymerase switching in eukaryotic DNA replication

Studies that have defined the replication fork in eukaryotes have employed the SV40 system. Using a viral DNA template (5 kb circular dsDNA, or plasmids containing a viral origin of replication) and a single viral protein (T-antigen, which functions as an initiation protein and a helicase), it has been possible to achieve complete replication with highly purified cellular proteins. The proteins/complexes required are:

• Pol α:primase
• Pol δ, PCNA, RF-C (Pol δ complex)
• FEN1 (MF-1), RNase H, DNA ligase I (Okazaki fragment processing)
• RPA (ssDNA binding protein)
• DNA topoisomerases I and II (conformational issues)
• SV40 T-antigen (initiation protein and helicase)

For our purposes at this time, we will focus on the polymerases.

(Waga, S., and Stillman, B. (1994) Anatomy of a DNA replication fork revealed by reconstitution of SV40 replication in vitro. Nature 369: 207-212.)

The current picture of the eukaryotic replication fork (with respect to polymerase activities) is as follows (Fig. 31):

Pol α, with its associated primase activity and lower processivity, is ideally suited for priming both leading and lagging strand synthesis. Priming of the leading strand occurs only once at each replication fork. Lagging strand synthesis requires repeated initiation; each Okazaki fragment requires priming. The product of Pol α:primase action is iRNA/iDNA.
RF-C (clamp loader) associates with the 3'-OH terminus of the DNA primer and loads a PCNA clamp on the template primer, displacing Pol α:primase. RF-C and PCNA probably prevent continued (unedited) synthesis of DNA by Pol α and limit its activity to a few incorporated residues. (The RNA-DNA primers synthesized are shorter if RF-C and Pol δ are present.)

Pol δ then associates with the PCNA clamp and the primer terminus and begins processive extension of the primer until the Okazaki fragment is completed.

Recent evidence indicates that Pol δ is probably dimeric, and possibly asymmetric. By analogy with Pol III, two δ pols may simultaneously replicate the leading and the lagging strand. This would help to coordinate leading and lagging strand synthesis. Pol α:primase may preferentially associate with the Pol δ activity replicating the lagging strand.

In any case, both leading and lagging strand synthesis require a polymerase switch, from Pol α:primase to Pol δ. Pol δ elongates both the leading strand primer and the lagging strand primers in the SV40 system. Pol ε may also elongate some RNA-DNA primers in vivo, although this is not clear.

(For review see: Stillman, B. (1994) Smart machines at the DNA replication fork. Cell 78: 725-728.)

Other eukaryotic DNA polymerases

Now that we have learned about the eukaryotic nuclear polymerases, it is useful to look at a few interesting viral systems. We will also discuss a unique and essential cellular polymerase that is specialized to function at chromosome ends.

Eukaryotic viruses can be placed into four broad groups on the basis of the polymerase activities used during replication.

• Small viruses that rely on host DNA polymerases to replicate their genomes (e.g. SV40 and polyoma virus in mammals; plant geminiviruses).
• Medium to large viruses that encode their own DNA polymerases and accessory proteins. Examples include herpesvirus, poxvirus, and adenovirus.
• Viruses that encapsidate their genome as RNA but replicate it through a DNA intermediate (retroviruses). These agents use a virus-encoded RNA-dependent DNA polymerase (reverse transcriptase) and host DNA-dependent RNA polymerase to replicate their genomes. Examples include avian sarcoma virus (ASV), human immunodeficiency virus (HIV), and human T cell lymphotropic virus (HTLV).
• Viruses that have genomes composed of RNA and do not use DNA polymerases during replication (e.g. influenza virus, poliovirus, tobacco mosaic virus and most plant viruses, phage Qβ). These viruses encode RNA-dependent RNA polymerases (RdRPs). Cellular RdRPs also exist; these are involved in gene regulatory pathways (such as gene silencing) that utilize small RNA molecules (microRNAs and small interfering RNAs) to degrade mRNAs in a sequence-specific manner.

We will not discuss RNA viruses and RdRPs further. We will briefly talk about herpesvirus and retrovirus systems.

Herpesvirus DNA polymerase

Herpesviruses infect a variety of animal species, but mostly mammals and birds. Some examples include herpes simplex virus (cold sores, fever blisters, genital lesions, sporadic encephalitis, systemic neonatal infections), varicella zoster virus (chicken pox, shingles), cytomegalovirus (deafness and retardation in the fetus; retinitis, wasting disease, and pneumonia in immunocompromised adults), and Epstein-Barr virus (infectious mononucleosis, Hodgkin’s and Burkitt's lymphoma, nasopharyngeal carcinoma).

The large herpesvirus genome (~153 kbp for HSV) is replicated by the viral DNA polymerase with the aid of six virus-encoded accessory proteins. A model system for studying herpesvirus replication consists of cells transfected with a plasmid containing a viral origin of replication, together with plasmids expressing the viral replication proteins.

The viral DNA polymerase exists as a heterodimer of UL30 (140 kDa), which is the catalytic subunit (pol), and UL42 (52 kDa), a protein with strong affinity for dsDNA.
The pol subunit has an intrinsic 3'→5' exonuclease activity. The pol/UL42 polymerase holoenzyme is highly processive, and it is UL42 that serves as the processivity factor. That is, UL42 anchors the polymerase to the template.

However, UL42 is a novel type of clamp. The mechanism differs from that of the β subunit of Pol III and PCNA in several respects:

• UL42 has a high intrinsic affinity for DNA; it strongly binds dsDNA in a sequence-nonspecific manner.
• UL42 does not require a clamp loader or ATP to be loaded on the template primer.
• UL42 does not form a ring around the DNA. However, it structurally resembles a partial PCNA ring.

The mechanism also differs from the thioredoxin clamp of T7 polymerase, because by itself UL42 has a strong affinity for dsDNA, while thioredoxin does not. Rather, it appears that the high affinity of UL42 for dsDNA and for pol (which it binds simultaneously) increases the affinity of the holoenzyme for the template primer. How this can occur without slowing elongation is not entirely clear. But recent structural data suggest that electrostatic attraction between a large, positively charged surface of UL42 and the negatively charged DNA acts as a non-specific tether, holding UL42 in close proximity to the DNA without preventing its diffusion along the DNA backbone. In short, UL42 may work like an electrostatic “tractor field”.

(Zuccola, H.J., Filman, D.J., Coen, D., and Hogle, J.M. (2000) The crystal structure of an unusual processivity factor, herpes virus UL42, bound to the C-terminus of its cognate polymerase. Molecular Cell 5: 267-278.)

(For a review of herpesvirus replication see: Boehmer, P.E., and Lehman, I.R. (1997) Herpes simplex virus DNA replication. Ann. Rev. Biochem. 66: 347-384.)

Controlling virus infections

Selective inhibition of virus replication is an attractive means of controlling virus infections, and is one of the major goals of researchers studying how viruses replicate.
We will use herpesviruses as an example of how selective inhibition can be achieved. Of course, viruses are also valuable model systems for the analysis of cellular replication mechanisms.

Sensitivity of polymerase to inhibitors

Selective inhibition of the viral polymerase is an obvious approach. Herpesvirus DNA polymerase can be distinguished from host polymerases by its sensitivity to inhibitors. Both host polymerases and HSV pol are sensitive to aphidicolin, a diterpenoid inhibitor that blocks dNTP incorporation. But only the HSV pol is sensitive to phosphonoacetic acid, which inhibits release of pyrophosphate (Fig. 32). Unfortunately, phosphonoacetic acid is toxic, and drug-resistant mutants emerge rapidly. A derivative, phosphonoformic acid, is less toxic and has some clinical value.

Interference with viral replication protein functions/interactions

Another way to inhibit virus replication is to employ compounds that interfere with the interaction of viral replication proteins (e.g. pol:UL42). It turns out that all of the essential herpesvirus replication proteins interact with each other, directly or indirectly. Knowledge of the functions of viral replication complexes, and of how the proteins within them interact, should eventually permit the design of drugs (small molecules) that selectively interfere with these interactions, thereby blocking virus replication. Such inhibitors are not yet available.

Target cell approach

Another way of controlling virus infections is the target cell approach, that is, to use drugs that can only enter or be activated in infected cells. The most effective selective inhibitor of this type is the nucleoside analogue acyclovir (acycloguanosine). Acyclovir is an effective agent against some herpesviruses because it is a substrate for viral, but not host, thymidine kinase (TK). As a result, acyclovir is converted into the triphosphate form (i.e. activated) only in infected cells.
(Actually, HSV TK converts acyclovir to a monophosphate, and cellular enzymes convert it to the triphosphate.)

Thymidine → TMP → TTP (first step catalyzed by cellular TK and by HSV TK)
Acyclovir → Acyclovir-MP → Acyclovir-TP (first step catalyzed by HSV TK only)

Acyclovir has therapeutic value only against herpesviruses like HSV and varicella-zoster, which encode a kinase that can convert acyclovir to an activated substrate for DNA polymerase. It has no activity against cytomegalovirus (CMV), which does not encode a TK.

Acyclovir triphosphate is used as a substrate in place of dGTP. When it is incorporated into DNA it acts as a chain terminator. It is a much more potent inhibitor of the viral polymerase than of cellular polymerases. Also, it is not a substrate for the 3'→5' exonuclease and so is not removed from the 3' terminus.

Retrovirus reverse transcriptase

Retroviruses package their genomes as ssRNA but replicate this RNA through a dsDNA intermediate. To do this, they employ an RNA-dependent DNA polymerase (reverse transcriptase, RT). Retroviruses primarily infect mammals and birds. Examples of important human retroviruses include human T cell lymphotropic virus (HTLV) and human immunodeficiency virus (HIV).

Reverse transcriptases are also used by related viruses that package DNA in their capsids and replicate through RNA intermediates. Examples include hepatitis B virus (a hepadnavirus) and cauliflower mosaic virus.

In addition, some types of cellular transposons (retroposons) also encode reverse transcriptase. Examples include Ty in yeast and copia in Drosophila. Reverse transcription is part of the transposition process in these elements, some of which resemble retroviruses in their structure and organization. So, information flow from RNA to DNA appears to occur in many systems.

Properties of reverse transcriptase

RT is a typical DNA polymerase in its 5'→3' direction of synthesis, its requirement for a template primer with a 3'-OH terminus, and its requirements for dNTPs and Mg2+.
RT does not have a detectable DNA-specific exonuclease activity, and therefore has no proofreading function. As a result, its error rate is relatively high. This is reflected in a high mutation rate; virus variants are generated at high frequency. This is an important aspect of pathogenesis: retroviruses are adept at evading host immunosurveillance because of their high mutation frequency.

RT is unique among DNA polymerases in at least two respects:

• It can use primed, natural ssRNAs as template. It can also use a primed ssDNA as template.
• It has intrinsic RNase H activity. RNase H is a processive exonuclease that specifically degrades the RNA strand of a DNA-RNA hybrid, beginning from either the 5' or the 3' end. It can also act as an endonuclease. RNase H hydrolyzes phosphodiester bonds to leave products with 3' hydroxyl and 5' phosphate ends.

The RT holoenzyme is a dimer of two related proteins. The β subunit (63-66 kDa) has polymerase activity and RNase H activity. The α subunit (51 kDa) has neither activity, and appears to function in initiation. The enzyme is relatively processive and can replicate the 8 kb retrovirus genome without a processivity factor.

Avian myeloblastosis virus (AMV) RT is well studied and is often used in the lab, primarily for the synthesis of cDNA. HIV RT is also well studied, for obvious medical reasons. Its crystal structure has been solved, and shows remarkable structural homology to Pol I. In particular, it has a similar cleft, as well as palm, finger, and thumb domains (Fig. 33).

(For review see: Joyce, C.M., and Steitz, T.A. (1994) Function and structure relationships in DNA polymerases. Ann. Rev. Biochem. 63: 777-822.)

Retrovirus replication

A simplistic picture of the retrovirus replication cycle is presented here.
Note that retroviral replication requires the RNA-dependent DNA polymerase, DNA-dependent DNA polymerase, and RNase H activities of the retrovirus reverse transcriptase, as well as transcription by host RNA polymerase II (a DNA-dependent RNA polymerase).

The retrovirus RNA genome is typically about 8 kb in length. The first step in replication involves synthesis of an ssDNA copy (cDNA) of the genomic viral RNA (Fig. 34). Synthesis is accomplished by reverse transcriptase, and is primed by a host tRNA that hybridizes to the template RNA. The product is a DNA-RNA hybrid.

In the next step, the RNA is degraded by RNase H, and second strand DNA synthesis, also catalyzed by reverse transcriptase, yields a duplex DNA. Note that the replication mechanism requires RT to use both an RNA and a DNA template.

The duplex DNA is integrated into the host cell genome (catalyzed by a viral integrase), where its genes are transcribed by host RNA polymerase II to generate mRNAs encoding viral proteins. The integrated copy of the viral genome is called a provirus. (In this form, the virus is essentially invisible to the host’s immune system.)

Full-length transcripts generated from the provirus are assembled into virus particles to complete the replication cycle.

Telomeres and Telomerase

Telomeres are specialized structures found at the ends of linear, eukaryotic chromosomes. They are an essential feature of linear chromosomes that "seal" the chromosome end and prevent the loss of terminal DNA sequences.

Telomeres have two components:

1) Short DNA sequences (typically 6 to 26 nt long) that are repeated many times. The number of repeats is somewhat variable, but is usually maintained within relatively narrow limits within a species. The number of repeats can vary widely between species, however.
2) Specialized proteins that interact with the DNA sequences.

Each species has a characteristic telomeric DNA sequence, although the same sequence can occur in more than one species.
One strand, the one found at the 3' end of each chromosomal DNA strand, contains clusters of G residues, for example:

Tetrahymena   TTGGGG
Human         TTAGGG
Yeast         TGTGGGTGTG

The lagging strand problem

Why are telomere repeats needed? Because of the 3' end problem (Fig. 35). Consider: during replication, the extreme 3' end of each parental strand cannot be completely copied, because once the terminal RNA primer is removed there is no primer from which to fill the gap. If not corrected, this would result in the loss of sequences with each chromosome duplication. This problem is avoided by the de novo addition of telomeric sequence by telomerase.

(For review see: Greider, C.W. (1996) Telomere length regulation. Ann. Rev. Biochem. 65: 337-365.)

Telomerase

This enzyme is a specialized reverse transcriptase: that is, it synthesizes DNA using an RNA template. However, unlike a conventional reverse transcriptase, it is a ribonucleoprotein. The protein and RNA components make up an active enzyme of ~200 kDa. The RNA component (~1.3 kb in yeast) contains the template sequence that is used for DNA synthesis. (So telomerase carries its own template.)

Genes encoding the RNA subunit have been isolated and cloned from several sources, including ciliates (Tetrahymena and Euplotes), yeast, and mammals. Disruption of this gene (TLC1 in yeast) results in progressive shortening of telomeres by several bp per generation. Introducing a change in the sequence of the TLC1 gene (within the RNA sequence used as template) results in that change being introduced into telomeres: e.g. TGTGGGTGTG to TGTGTGGGCCTG.

The protein subunits of telomerase have been identified and cloned from budding yeast, fission yeast, Euplotes, and human. There appear to be two protein subunits, a large catalytic subunit (103-127 kDa) and a small subunit of about 43 kDa. The large subunit shares important sequence motifs with HIV RT and other DNA polymerases. In S. cerevisiae, the catalytic subunit is encoded by the EST2 gene (ever shorter telomeres). Gene disruption in S.
pombe also results in telomere shortening and rapid senescence.

How does telomerase work? In vitro, telomerase from a given species synthesizes the G-rich strand sequence characteristic of that species. All that is required is a telomere DNA sequence primer and the appropriate dNTPs. For example, given the oligonucleotide TTGGGG, Tetrahymena telomerase will repeatedly add the same telomere sequence to the 3' end, to yield TTGGGG(TTGGGG)n, where n can be >100.

So, the strategy used by the cell to avoid the 3' end problem is to increase the length of the 3' end of the chromosome. This allows the 5' end to be lengthened by conventional priming and DNA synthesis. Because the telomere sequence is repeated, it doesn't really matter what the exact number of telomere repeats is, as long as there are enough.

The proposed mechanism by which telomerase adds to the 3’ end is as follows: using a short, complementary sequence in its RNA component as template, the enzyme adds nucleotides (the telomeric repeat), then translocates and repeats the process (Fig. 36).

How is telomere length controlled? This is not entirely clear, and the mechanism may vary between organisms. In yeast and in ciliated protozoa like Tetrahymena, telomerase is expressed constitutively, but its activity appears to be negatively regulated by other proteins. Disruption of genes encoding these regulatory proteins (in yeast) results in massive telomere elongation.

In contrast, in differentiated (quiescent) human somatic cells (which start with ~10 kb of telomeric sequence at each chromosome end), telomerase activity cannot be detected, and telomeres shorten progressively with age and with each cell division (a marker for cell age?). However, telomerase activity is present in actively dividing cells, including immortalized (transformed) cells in culture, and in most cancer cells. So telomerase may be useful for cancer diagnostics, and is a possible target for therapeutics.
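
The align, copy, and translocate cycle proposed above for telomerase can be illustrated with a short Python sketch. This is only a toy model under stated assumptions, not a description of the enzyme's kinetics: the function names are hypothetical, and the template region of the Tetrahymena telomerase RNA is assumed here to be 5'-CAACCCCAA-3', a sequence that is not given in these notes. The sketch pairs the 3' end of a DNA primer with the template, copies to the end of the template, and then re-aligns the new 3' end for the next round.

```python
# Toy model of telomerase repeat addition (hypothetical names throughout).

# Assumption: template region of the Tetrahymena telomerase RNA, written 5'->3'.
TEMPLATE_RNA = "CAACCCCAA"

# RNA template base -> DNA base that pairs with it.
_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_copy_of_template(template_rna: str) -> str:
    """Return the DNA (5'->3') made by copying the entire template."""
    # The template is read 3'->5', so reverse it before pairing each base.
    return "".join(_PAIR[base] for base in reversed(template_rna))


def telomerase_extend(primer: str, rounds: int,
                      template_rna: str = TEMPLATE_RNA) -> str:
    """Extend the 3' end of a DNA primer by `rounds` align/copy/translocate cycles."""
    copy = dna_copy_of_template(template_rna)
    dna = primer
    for _ in range(rounds):
        # Align: find the longest 3' end of the DNA that pairs with the start
        # of the template while leaving at least one template base uncopied.
        for k in range(len(copy) - 1, 0, -1):
            if dna.endswith(copy[:k]):
                break
        else:
            raise ValueError("primer 3' end cannot pair with the template")
        # Copy: add nucleotides up to the end of the template. Translocation
        # is modeled by re-aligning the new 3' end on the next iteration.
        dna += copy[k:]
    return dna


if __name__ == "__main__":
    print(telomerase_extend("TTGGGG", rounds=3))
```

With the assumed template, three rounds starting from the primer TTGGGG give TTGGGGTTGGGGTTGGGGTTG: the first round adds TTG and each later round adds GGGTTG, so the product keeps the TTGGGG register of the primer. Changing the template string changes the repeat that is added, which mirrors the TLC1 template-mutation result described earlier.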