Physics Reports 288 (1997) 13-60

Biophysics of the DNA molecule

Maxim D. Frank-Kamenetskii

Center for Advanced Biotechnology and Department of Biomedical Engineering, Boston University, 36 Cummington St., Boston, MA 02215, USA

PACS: 87.15.-v; 87.15.Da; 87.15.He; 87.15.Kg

Keywords: DNA; Topology; Gel electrophoresis; Polyelectrolyte; Knots; PNA

1. Introduction

DNA plays a crucial role in all living organisms because it is the key molecule responsible for storage, duplication, and realization of genetic information. DNA is a heteropolymeric molecule consisting of residues (nucleotides) of four types, A, T, C and G. Fig. 1 shows the chemical structure of the DNA single strand and the complementary base pairs. The genetic message is “written down” in the form of continuous text consisting of four letters (DNA nucleotides A, G, T and C). This continuous text, however, is subdivided, in its biological meaning, into sections. The most significant sections are genes, parts of DNA that carry information about the sequence of amino acids in proteins.

The importance of the DNA molecule cannot be overestimated. It is therefore natural that the molecule has attracted the attention not only of biologists and physicians but also of chemists and physicists, even theorists (for a popular introduction to the field of DNA science, see, e.g., Frank-Kamenetskii, 1993; 1997). For more than forty years, the DNA molecule has been a subject of biophysical studies. Many outstanding physicists, who made their names in various areas of traditional physics, mostly in solid state physics, contributed by studying DNA.

I.M. Lifshitz did not publish many papers about DNA. Nevertheless, his role in directing the attention of physicists toward DNA biophysics was very significant, especially in the USSR. Although I was already in the field when Lifshitz stormed it in the mid-1960s, I too experienced the profound influence of his personality and style.
Due to his enthusiasm and indisputable reputation among Soviet physicists, DNA and protein biophysics temporarily became a focus of attention of the Soviet physics community. This community was a unique phenomenon in the world of science. Due to I.M. Lifshitz, I had an opportunity to present my work on DNA topology at the famous Landau seminar at the Kapitza Institute, an unforgettable experience by itself. By that time (it was in 1975, I believe), Lifshitz had replaced Lev Landau as head of the Theoretical Department of the Vavilov Institute of Physical Problems (the official name of the Kapitza Institute; physicists called it either “Physproblems” or “Kapichnik”) and he led the seminar. When I tried to start my talk, the participants did not give me the opportunity to say a word, shouting: “What is he going to speak about?”, “He must first say what he is going to speak about”. The noise was really terrible, and I was confused. It lasted for a while until Lifshitz stood up and said, not loudly, actually: “Stop screaming. Let him get started”. Magically, these quiet words calmed everybody down and I started my talk. Of course, I was interrupted many times during the talk, but these were more or less usual questions.

In this article, I give an overview of the area of DNA biophysics in retrospect. The field is now so big that it would be impossible for a single person to cover all aspects of DNA biophysics. The choice of topics and their coverage will inevitably reflect the writer’s personal taste and interest.

Fig. 1. (a) DNA single strand and (b) the Watson-Crick complementary base pairs.

0370-1573/97/$32.00 Copyright © 1997 Elsevier Science B.V. All rights reserved. PII S0370-1573(97)00020-3

2. Major structures of DNA

In spite of the enormous versatility of living creatures and, accordingly, the variability of genetic texts that DNA molecules in different organisms carry, they all have a virtually identical physical, spatial structure: the double-helical B form discovered by Watson and Crick (1953). Sequences of the two strands of the double helix obey the complementarity principle. This principle is the most important law in the field of DNA and, probably, the most important law of living nature. It declares that, in the double helix, A always opposes T and vice versa, whereas G always opposes C and vice versa (see Fig. 1).

2.1. B-DNA

B-DNA (see Fig. 2) consists of two helically twisted sugar-phosphate backbones stuffed with base pairs of two types, AT and GC. The helix is right-handed with 10 base pairs per turn. The base pairs are isomorphous: the distances between glycosidic bonds, which attach bases to sugar, are virtually identical for AT and GC pairs. Because of this isomorphism, the regular double helix is formed for an arbitrary sequence of nucleotides, and the fact that DNA should form a double helix imposes no limitations on DNA texts.

Fig. 2. B-DNA. The major and minor grooves are indicated.

The surface of the double helix is by no means cylindrical. It has two very distinct grooves: the major groove and the minor groove. These grooves are extremely important for the functioning of DNA because, in the cell, numerous proteins recognize specific sites on DNA via binding with the grooves. Each nucleotide has a direction, and therefore a chemical direction is inherent in each of the DNA single strands. In the B-DNA double helix, the two strands have opposite directions. In B-DNA, base pairs are planar and perpendicular to the axis of the double helix.
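The complementarity principle and the opposite directions of the two strands can be illustrated with a few lines of code. The following is only a minimal sketch: the function name `complementary_strand` is hypothetical, and the 5'->3' labeling of strand direction is the standard convention rather than notation used in the text.

```python
# Pairing rules stated in the text: A opposes T, G opposes C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand of `strand`, written in its own direction.

    Assumption: `strand` is written in the conventional 5'->3' direction.
    Because the two strands of B-DNA run in opposite directions, the
    partner is the base-by-base complement taken in reverse order, so
    that the result is also written 5'->3'.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


if __name__ == "__main__":
    print(complementary_strand("ATGCCG"))
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.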
Under normal conditions in solution, often referred to as “physiological” (neutral pH, room temperature, about 200 mM NaCl), DNA adopts the B form. All available data indicate that the same is true for the totality of DNA within the cell. This does not exclude, however, the possibility that separate stretches of DNA carrying special nucleotide sequences would adopt other conformations.

2.2. B’-DNA

Up to now, only one such conformation has been demonstrated, beyond any doubt, to exist under physiological conditions. When several A residues occur in a row in one strand (and, accordingly, several T residues in the other DNA strand), they adopt the B’ form. In many respects, the B’ form is similar to the classical B form, but there are also significant differences. The main difference is that base pairs in B’-DNA are not planar: they form a kind of propeller with a propeller twist of 20°.

Stretches of A residues produce bends in the double helix (reviewed by Sinden, 1994). Such bends play a very important role in DNA functioning. Although the structural basis of these bends is not fully understood, the involvement of the B’ form in DNA bending is very probable.

In spite of its importance, the B’ form does not differ dramatically from the B conformation. Other helical conformations, which are significantly different from B-DNA, have been found in the course of DNA biophysical studies.

2.3. A-DNA

Like B-DNA, the A form can be adopted by an arbitrary sequence of nucleotides. As in B-DNA, in A-DNA the two complementary strands are antiparallel and form right-handed helices. DNA undergoes a transition from the B to the A form under dehydration conditions (reviewed by Ivanov and Krylov, 1992). In A-DNA, the base pairs are planar but their planes make a considerable angle with the axis of the double helix.
In doing so, the base pairs shift from the center of the duplex, forming an empty channel in the center.

A-DNA plays a rather modest role, if any, in DNA functioning. There are data indicating that some proteins induce a transition from the B to the A form. The reason for this remains to be elucidated.

2.4. Z-DNA

Z-DNA presents the most striking example of how different from the B form the DNA double helix can be. Although in Z-DNA the complementary strands are antiparallel, as in B-DNA, they form left-handed, rather than right-handed, helices. There are many other dramatic differences between Z- and B-DNA (reviewed by Dickerson, 1992). Not every sequence can adopt the Z form. To adopt the Z form, the regular alternation of purines (A or G) and pyrimidines (T or C) along one strand is strongly preferred. However, even this is not enough for Z-DNA to be formed under physiological conditions. Nevertheless, Z-DNA can be adopted by DNA stretches in the cell due to DNA supercoiling (see Section 4.3.1). The biological significance of Z-DNA, however, remains to be elucidated.

2.5. ps-DNA

The complementary strands in a DNA duplex can be parallel. Such parallel-stranded (ps) DNA is formed most readily if both strands carry only adenines and thymines and their sequence excludes formation of the ordinary antiparallel duplex (reviewed by Rippe and Jovin, 1992). If these requirements are met, the parallel duplex is formed under quite normal conditions. It is right-handed, but the AT pairs are not the usual Watson-Crick ones, but rather the so-called reverse Watson-Crick pairs.

Some other sequences can also adopt parallel duplexes. For instance, under acidic conditions two strands carrying only C residues form a parallel duplex consisting of protonated CC+ base pairs (see Section 2.7).

Fig. 3. The structure of (a) pyrimidine (TAT and C+GC) and (b) purine (AAT and GGC) base triads of which the DNA triple helix is made.

2.6. Triplexes

If DNA carries a homopurine-homopyrimidine tract, a homopyrimidine oligonucleotide can bind to this tract, lying in the major groove and forming Hoogsteen pairs with DNA bases (Moser and Dervan, 1987; Le Doan et al., 1987; Lyamichev et al., 1988; reviewed by Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996). The canonical base triads thus formed are shown in Fig. 3. In recent years, the variety of sequences found to be capable of forming triplexes has been significantly enlarged (reviewed by Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996). In addition to intermolecular triplexes, intramolecular triplexes, or H-DNA, can be formed under certain conditions (see Section 4.4.3).

Fig. 4. G quadruplex.

2.7. Quadruplexes

Of all nucleotides, guanines are the most versatile in forming different structures. They may form GG pairs, but the most stable structure, which is formed in the presence of monovalent cations (especially potassium), is the G4 quadruplex (see Fig. 4). G-quadruplexes may exist in a variety of modifications: all-parallel, all-antiparallel and others (reviewed by Sinden, 1994). As a result, G-quadruplexes are easily formed both inter- and intramolecularly, again with a variety of modifications.

A totally unusual quadruplex structure was discovered by Gehring et al. (1993). It contains two hemiprotonated parallel-stranded duplexes consisting of CC+ pairs. The two parallel-stranded duplexes are associated in a mutually antiparallel manner so that CC+ base pairs from one duplex are “layered” by CC+ pairs from the other duplex, thus alternating along the structure.

3. Methods to study DNA (General)

The whole arsenal of physical methods that are generally used to study molecular structures is applied to studying DNA.
In this section we will briefly consider the most important of these methods, emphasizing their role in the field of DNA biophysics.

3.1. X-ray analysis

As in other fields where molecular structure is essential, X-ray analysis occupies a unique position among methods to study DNA structure as the only direct method, which permits elucidation of the structure in all details. X-ray crystallography is absolutely indispensable in the study of the detailed structure of complexes of DNA with proteins, which is most essential for understanding how DNA molecules function in the cell.

However, for a long period of time the whole edifice of molecular biology relied on one of the indirect versions of X-ray analysis, fiber diffraction, rather than on classical X-ray crystallography.

3.1.1. Fiber diffraction

Long DNA molecules cannot be crystallized. As a result, from the early 1950s till the late 1970s only X-ray diffraction from DNA fibers was used to elucidate the DNA structure. Such data were used by Watson and Crick (1953) to propose structures for B- and A-DNA. Fiber diffraction is an essentially indirect method, and, to elucidate structure from the data on fibers, one has to rely heavily on theoretical approaches, such as conformational analysis.

3.1.2. X-ray crystallography

The first DNA crystals became available only in the late 1970s, after remarkable progress in the chemical synthesis of short DNA pieces had been achieved. This led to many discoveries, first of all that of left-handed Z-DNA by Rich and co-workers (Wang et al., 1979). In recent years, a lot of very detailed data on the structure of DNA have been obtained by X-ray crystallography, including detailed studies of the B, B’, A and Z forms (reviewed by Dickerson, 1992). Among the most recent achievements of the method are the solution of the structure of the G-quadruplex (Kang et al., 1992) and of the C-quadruplex (Chen et al., 1994).
In spite of remarkable accomplishments, serious limitations are inherent in the method. It is extremely hard to obtain good crystals of DNA even if it adopts only a unique structure. Even if this difficulty is overcome, it sometimes appears that the structure in the crystal is significantly perturbed by interaction with neighboring molecules. This is especially true of such subtle, but biologically very important, deformations as bending of the double helix. Thus, the data obtained by X-ray crystallography should always be correlated with the results of indirect methods, which permit the study of DNA in solution. X-ray crystallography is of no help in studying such biologically significant problems of DNA biophysics as DNA supercoiling (see below).

3.2. Nuclear magnetic resonance (NMR)

The role of NMR constantly increases, and, in recent years, it has even started to compete successfully with X-ray crystallography in the field of DNA biophysics. This has become possible as a result of the development of two-dimensional proton NMR techniques, especially nuclear Overhauser effect spectroscopy (NOESY). The great advantage of NMR, as compared with X-ray crystallography, is that it does not require crystals. As a result, some DNA structures that resist crystallization, like DNA triplexes, have become the subjects of very fruitful study by NMR (Rajagopal and Feigon, 1989).

Another great advantage of NMR is the possibility of studying structural fluctuations, or “breathing”, of the DNA double helix by following proton exchange in DNA bases. Such studies made it possible to find out important characteristics of base-pair fluctuational openings in DNA (see Section 7.2).

Turning to the limitations of the method, it should be emphasized that the resolution of even the most powerful NMR spectrometers permits the study of only short DNA molecules containing about a dozen distinguishable nucleotides.
Although crystals are not needed, only very concentrated solutions can be studied. Therefore, like X-ray crystallography, NMR is useless for studying many biologically relevant problems of DNA structure. In spite of its limitations, proton NMR has firmly occupied the second position, after X-ray crystallography, among methods to study DNA structure.

3.3. Microscopy

Although microscopy looks like the most direct way to visualize structure, in application to DNA it has too many limitations to occupy a position ahead of X-ray crystallography or NMR. Nevertheless, in recent years the role of microscopy in the field of DNA has significantly increased due to progress in regular electron microscopy of DNA and its complexes and the development of new techniques, cryoelectron microscopy and scanning force microscopy.

3.3.1. Regular electron microscopy

In regular transmission electron microscopy, DNA molecules are placed on the grid, dried and contrasted by one method or another. In recent years the most popular contrasting technique has become staining with uranyl acetate. As a result, duplex DNA molecules and proteins attached to them are clearly seen (see, e.g., Cherny et al., 1993a).

Regular electron microscopy permits one to see complexes of DNA with proteins. Poor resolution usually does not permit one to observe the internal structure of the complex but permits mapping of the location of the protein on DNA.

The limitations of regular electron microscopy stem from its relatively poor resolution and the possibility of significant perturbation of DNA structure in the process of sample preparation. And still, in many cases the method provides the most convincing evidence.

3.3.2. Cryoelectron microscopy

The method is based on obtaining vitrified water solutions via very quick cooling of extremely thin (in the submicron range) samples.
As a result, the DNA molecule is “frozen” in the state it adopted in solution before cooling. In recent years, the method has found numerous applications in the field of DNA and its complexes with proteins (Dubochet et al., 1992). The great advantage over regular electron microscopy is that it avoids the harsh procedures of sample preparation, which strongly limit the value of the data obtained by regular electron microscopy.

A major problem of cryoelectron microscopy stems from the low contrast of DNA molecules. Without staining or other contrasting procedures they are barely visible in an electron microscope. Nevertheless, DNA molecules and their complexes with proteins are extensively studied by the method (reviewed by Dubochet et al., 1992).

3.3.3. Scanning force microscopy

Scanning (or atomic) force microscopy (SFM or AFM) provides reliable images of DNA molecules and their complexes with proteins (reviewed by Shao et al., 1995). At present, the resolution of this method is not much higher than that of regular electron microscopy. However, there is reasonable hope that the resolution will be significantly improved in the course of further development of the method.

3.4. Optical methods

All optical methods that are traditionally applied to study molecular structures are widely used to study DNA. They are indispensable in routine investigations because they are cheap and quick. The great advantage of those that use light in the visible and UV regions is the possibility of studying very dilute DNA solutions, for which intermolecular interaction can be completely neglected. On the other hand, all these methods are essentially indirect and provide structural information only after a careful assignment of particular spectra or spectral changes with the help of the more direct methods described above.

3.4.1. UV spectroscopy

DNA bases absorb UV radiation around 260 nm.
The intensity of this absorption, which is easy to measure with regular spectrophotometers, changes when, for instance, the DNA double helix melts (i.e. the complementary strands separate upon heating, see Section 6). Hence, UV spectroscopy has been extensively applied to study DNA melting. For B-to-A and B-to-Z transitions, the changes of UV absorption are too small.

3.4.2. Circular dichroism (CD)

CD spectra in the vicinity of 260 nm are much more sensitive to DNA helical structure than UV absorption. B-DNA, A-DNA, and Z-DNA have characteristic and very different CD spectra (Johnson, 1990), and this fact is extensively used in the study of structural transitions in DNA between different helical structures (Ivanov and Krylov, 1992).

3.4.3. Infrared and Raman spectroscopy

Infrared and Raman spectra are sensitive to DNA structure. Correspondingly, IR and Raman spectroscopies are used to study DNA. However, the main limitation of these methods stems from the fact that they require high concentrations of DNA. As a result, these methods are less popular than UV and CD spectroscopies.

3.4.4. Fluorescent methods

DNA molecules practically do not emit absorbed radiation. Fluorescence methods are used via the binding to DNA of strongly fluorescent dye molecules, such as ethidium. Fluorescence sensitization and quenching due to excitation energy transfer between the donor molecule of electronic excitation and the acceptor molecule of the excitation are also used (reviewed by Clegg, 1992).

3.5. Theoretical methods

The paper that signified the beginning of extensive studies of DNA and its biological role was purely theoretical (Watson and Crick, 1953). Since then, theory has been playing a very important role in the study of DNA.

Fig. 5. Theoretical models of DNA: (a) elastic-rod model, (b) helix-coil model, (c) polyelectrolyte model.

3.5.1. Conformational analysis

Conformational analysis was especially important during the era of fiber diffraction.
In fact, what Watson and Crick did in their classical paper (Watson and Crick, 1953) was a very simple, but exceptionally efficient, variety of conformational analysis. Since then, the method has been extensively used to refine structures solved by X-ray crystallography (Dickerson, 1992). The method is also often used to predict new structures. For instance, parallel-stranded DNA duplexes were first predicted theoretically and then found experimentally.

3.5.2. Theoretical models

As in the study of any important physical object, a number of simplified theoretical models of DNA exist, different models being used to analyze different properties. Fig. 5 presents schematics of some of these models.

The DNA double helix may be treated as an isotropic elastic rod (Fig. 5(a)). In the framework of this model, the DNA molecule is described by only three parameters: bending and torsional rigidities and the diameter. The model has proved to be extremely useful for analyzing hydrodynamic and other properties of linear DNA, when it behaves as a usual polymeric molecule. It has also permitted comprehensive theoretical treatment of DNA topology at both levels, knotting and supercoiling. We discuss this model at some length in Section 4.

A quite different, but also very successful, model treats the DNA double helix as consisting of base pairs of two types: closed and open (Fig. 5(b)). This is the helix-coil model, which has made it possible to explain quantitatively all major features of DNA melting. We discuss this model in Section 6.

The polyelectrolyte model (Fig. 5(c)) treats DNA itself just as a charged cylinder but allows for the mobile counterions surrounding the double helix. We discuss this model in Section 8.

4. Global DNA conformation

4.1. Elastic rod model of DNA

DNA behaves as an almost ideal polymer chain. No other polymer molecule is closer to the ideal polymer chain than the DNA double helix.
Due to the unusually high bending rigidity of DNA, the ratio of its persistence length, a, to its diameter, d, is very high. This leads to very small, sometimes negligible, excluded volume effects under a variety of ambient conditions, not only at the θ-point, as with ordinary polymers (see, e.g., Grosberg and Khokhlov, 1994). This unusual rigidity stems from the fact that DNA consists of two, rather than one, polymer chains. A common mechanism of polymer flexibility, rotation around single bonds, is excluded for the double helix. It exhibits bending flexibility only due to the accumulation of small changes of angles between adjacent base pairs. As a result, the DNA double helix is best modeled as an elastic rod (see Fig. 5).

To a first approximation, one can neglect the sequence dependence of the DNA bending and torsional rigidities and treat DNA as a homogeneous and isotropic elastic rod. This model has proved to be a remarkably good first approximation for treating global DNA macromolecular properties. Within the framework of this model, the DNA chain is characterized by three parameters: the bending rigidity, measured in terms of the persistence length, a, or the Kuhn statistical length ($b = 2a$); the torsional rigidity, C; and the DNA effective diameter, d. Numerous properties of linear and circular DNA molecules can be quantitatively understood in terms of the elastic rod model with one and the same set of these three parameters under given ambient conditions.

Ambient conditions, especially the concentration of counterions in solution, may significantly affect some DNA parameters. This is the case for the DNA effective diameter. Because DNA is a highly charged polyion, the excluded volume effects strongly depend on the screening of the Coulomb interaction between DNA segments approaching each other (see Section 8). As a result, the DNA effective diameter significantly exceeds its geometrical diameter of 2 nm at a low concentration of counterions in solution.
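The bending parameter of the elastic-rod model can be made concrete with a short numerical sketch. The code below uses the relation between the Kuhn length and the persistence length stated above, together with the standard worm-like-chain expression for the mean-square end-to-end distance, $\langle R^2 \rangle = 2aL - 2a^2(1 - e^{-L/a})$. That expression is textbook polymer physics and is not written out in the text; the function names are hypothetical, excluded volume is neglected, and the value a = 50 nm is the persistence length quoted in Section 4.2.

```python
import math


def kuhn_length(persistence_length_nm: float) -> float:
    """Kuhn statistical length b = 2a, as stated in the text."""
    return 2.0 * persistence_length_nm


def mean_square_end_to_end(contour_length_nm: float,
                           persistence_length_nm: float) -> float:
    """Mean-square end-to-end distance <R^2> of a worm-like chain, in nm^2.

    Assumption: standard worm-like-chain result with no excluded volume:
        <R^2> = 2 a L - 2 a^2 (1 - exp(-L / a))
    """
    a = persistence_length_nm
    L = contour_length_nm
    return 2.0 * a * L - 2.0 * a * a * (1.0 - math.exp(-L / a))


if __name__ == "__main__":
    a_nm = 50.0  # persistence length quoted in Section 4.2
    for L_nm in (10.0, 100.0, 1000.0, 10000.0):
        r_rms = math.sqrt(mean_square_end_to_end(L_nm, a_nm))
        print(f"L = {L_nm:8.0f} nm  ->  rms end-to-end distance = {r_rms:8.1f} nm")
```

In the two limits the expression behaves as expected: for a chain much shorter than a it approaches the rigid-rod value L², and for a chain much longer than a it approaches bL, the result for an ideal chain of Kuhn segments.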
In contrast, DNA bending and torsional rigidities are ionic-strength-independent within a wide range of ambient conditions (see also Section 8.2).

For the theoretical treatment of statistical-mechanical properties of DNA within the elastic-rod model, a Metropolis-Monte-Carlo-type approach was elaborated by Frank-Kamenetskii et al. (1985a). In this approach, the DNA chain is modeled as a series of straight segments so that each Kuhn length contains k such segments. The total elastic energy is the sum of terms, each of which corresponds to a pair of adjacent straight segments and depends quadratically on the angle between them (see Frank-Kamenetskii et al., 1985a; Vologodskii and Frank-Kamenetskii, 1992 for details). The final results are obtained, within the framework of the model, as asymptotic ones for large k values. Fortunately, all characteristics we studied leveled off very quickly with increasing k, so that k = 10 proved to be quite sufficient to get very reliable quantitative asymptotic results (see Fig. 6). Asymptotically, this model corresponds to the elastic-rod model of the polymer chain (it is also often referred to as the worm-like model; see, e.g., Grosberg and Khokhlov, 1994).

Fig. 6. Typical results of Metropolis-Monte Carlo calculations on the dependence on the number of straight segments per Kuhn length, k, of a mean quantity (the mean writhing number, see Section 4.3.2.1, in this particular case) for a closed polymer chain. The data are from Vologodskii and Frank-Kamenetskii (1992).

4.2. Linear DNA

DNA is a unique object for experimental studies of a virtually ideal macromolecular coil. In addition to the already mentioned fact of an exceptionally high a/d ratio, DNA samples are strictly monodisperse and the length of the molecule can be varied in a very wide range: from below one persistence length up to hundreds of persistence lengths. Moreover, recently developed techniques make it possible to perform quantitative studies of single DNA molecules (Smith et al., 1992, 1996; Strick et al., 1996). In particular, Smith et al. (1992) performed remarkable measurements of the strain/extension relationship on single DNA molecules. Bustamante et al. (1994) showed that the experimental data agree with theoretical predictions obtained within the framework of the elastic-rod model. After the DNA molecule was fully extended, a further increase of force led to a sharp transition to a more extended DNA conformation, in which the average distance between adjacent base pairs was 1.6 times larger than in normal B-DNA (Smith et al., 1996; Strick et al., 1996).

Normally, linear DNA is in the B form. Numerous studies have made it possible to determine an accurate value of the DNA persistence length, a, which proved to be very close to 50 nm. Therefore, the Kuhn statistical length for DNA is equal to 100 nm.

4.3. DNA topology

It was unexpectedly found in 1963 that DNA exists in certain viruses in a closed circular (cc) form. In this state, the two single strands of which the DNA consists are each closed on themselves. Fig. 7 schematically illustrates ccDNA. One can see that the two complementary strands in ccDNA are linked. They form a high-order linkage (of the order of $N/\gamma_0$, where N is the number of base pairs in the DNA and $\gamma_0$ is the number of base pairs per turn of the double helix).

Fig. 7. In a closed circular DNA, the two complementary strands form a linkage of a high order.

Initially, the discovery of circular DNA was not seen to be very significant, since this form of DNA was regarded as exotic. However, in the course of time, the cc form of DNA was discovered in an ever greater number of organisms. Currently, it is generally acknowledged that precisely this form of DNA is typical of the simplest DNAs, and also of the cytoplasmic DNAs of animals. Also, most virus DNAs pass through a stage of the cc form in the course of infection of cells.

The discovery of ccDNA has led to the formulation of fundamentally new problems, since it turned out that many of the physical properties of the closed circular form differ radically from those of the linear form. The difference between the properties of these two forms of DNA is not at all due to the existence of end effects in the one case but not in the other.

There are two levels of DNA topology. First, ccDNA as a whole can be unknotted (form the trivial knot, or unknot) or form knots of different types (see Fig. 8). Secondly, the two complementary strands in DNA are linked with each other topologically (Fig. 7).

4.3.1. Knots

The first problem that arises in the theoretical analysis of ring polymer chains, including ccDNA, is formulated in the following way. Let a ring molecule be formed by fortuitous closure of a linear molecule consisting of n segments. What is the probability of forming a knotted chain, i.e., a nontrivial knot? This problem was clearly formulated by Max Delbrück and solved by our group (Vologodskii et al., 1974; Frank-Kamenetskii et al., 1975).

4.3.1.1. Statistical mechanics of knots. To solve the problem of the statistical mechanics of knots, one needs, first of all, a knot invariant. Indeed, a closed chain can be unknotted or can form knots of different types. The very beginning of the table of knots is shown in Fig. 8. However, an analytical expression for the knot invariant is unknown. Therefore, we had to use a computer and an algebraic invariant elaborated in the topological theory of knots. We found that the most convenient invariant was the Alexander polynomial (reviewed by Frank-Kamenetskii and Vologodskii, 1981; Vologodskii and Frank-Kamenetskii, 1992).

The next problem consisted in generating closed polymer chains. In our first calculations, we simulated DNA as a freely jointed polymer chain.
Several methods exist to generate exclusively closed chains for this model (Frank-Kamenetskii and Vologodskii, 1981; Vologodskii and Frank-Kamenetskii, 1992). Using these methods and teaching the computer to calculate the Alexander polynomials, and therefore to distinguish knots of different types, we could calculate the knotting probability. Analogous calculations have been performed later by other researchers (reviewed by Frank-Kamenetskii and Vologodskii, 1981).

Fig. 8. Knots.

The data on the relationship between the probability of knot formation and the number of Kuhn lengths in the chain are collected together in Fig. 9. We see that the results obtained by various authors agree very well with each other. This is not surprising since, in spite of certain differences in the polymer models employed, to which certain differences in the results are due, the presented data in all cases fit the model of an infinitely thin polymer chain. One can see from Fig. 9 that the probability of knot formation, P, has an evident tendency to approach unity as n increases, though it was possible to perform the calculations only up to n values such that P barely exceeds 0.5.

Very recently, these calculations have been significantly extended using Vassiliev invariants of knots (Deguchi and Tsurusaki, 1993a, b, 1994). These authors extended calculations up to n = 1600. Remarkably, the data are well approximated by a simple equation:

$P(n) = 1 - \exp(-\kappa n)$, where $\kappa = 3 \times 10^{-3}$.

Fig. 9. Probability of knot formation, P, as a function of the number n of Kuhn statistical lengths for an infinitely thin polymer chain. Different symbols correspond to results obtained by different authors (the data from Frank-Kamenetskii and Vologodskii, 1981).

The above calculations were performed under the assumption that the polymer chain under consideration has zero diameter.
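The simple exponential approximation for the knotting probability of an infinitely thin chain is easy to evaluate numerically. The following is a minimal sketch, not the simulation itself: it only evaluates the fitted curve with the coefficient 3 × 10⁻³ quoted above, and the function names and the inverse helper are hypothetical.

```python
import math

# Fitted coefficient quoted in the text for an infinitely thin chain.
KAPPA = 3e-3


def knot_probability(n_kuhn: float, kappa: float = KAPPA) -> float:
    """P(n) = 1 - exp(-kappa * n) for a closed chain of n Kuhn lengths."""
    if n_kuhn < 0:
        raise ValueError("number of Kuhn lengths must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) but keeps precision for small x.
    return -math.expm1(-kappa * n_kuhn)


def kuhn_lengths_for_probability(p: float, kappa: float = KAPPA) -> float:
    """Invert the fit: chain length (in Kuhn lengths) at which P(n) = p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log1p(-p) / kappa


if __name__ == "__main__":
    for n in (10, 30, 100, 300, 1600):
        print(f"n = {n:5d}  ->  P = {knot_probability(n):.3f}")
    print(f"P = 0.5 at n = {kuhn_lengths_for_probability(0.5):.0f} Kuhn lengths")
```

Because the fit refers to a chain of zero diameter, these numbers should be read as upper estimates for real DNA.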
In the very early stage of our study of knots we already realized that excluded volume effects should significantly decrease the knotting probability (Vologodskii et al., 1974; Frank-Kamenetskii et al., 1975). However, the knotting probability proved to be even more sensitive to the excluded volume effects than we originally anticipated, so that these effects could not be neglected even in the case of DNA. We arrived at this conclusion using the Metropolis-Monte Carlo approach to calculate DNA topological characteristics within the framework of the elastic-rod model (Frank-Kamenetskii et al., 1985a). This approach made it possible to simulate the behavior of DNA molecules allowing for excluded volume effects (Klenin et al., 1988). We thus arrived at quantitative predictions about the dependence of the knotting probability on the DNA effective diameter, d. Fig. 10 shows the results. One can see a dramatic dependence of the P value on d. Even for the geometric diameter of DNA, which corresponds to d = 0.02 in Fig. 10, the knotting probability is already significantly lower than for d = 0. In reality, however, the effective diameter of DNA noticeably exceeds its geometric value due to the excluded volume effects, which are determined by the screened electrostatic interactions between highly charged DNA segments (see Section 8). Therefore, the d value can be varied by changing the ionic strength of the solution. Our theoretical predictions have recently been checked experimentally (see Section 4.3.1.2).
Fig. 10. Dependence of the equilibrium fraction of knotted molecules on DNA effective diameter, d, for closed DNA containing 14 Kuhn lengths (lower curve), 20 Kuhn lengths (middle curve) and 30 Kuhn lengths (upper curve). The data are from Klenin et al. (1988). The diameter is measured in Kuhn lengths; so, to obtain the d value in nanometers one has to multiply the figures on the abscissa by the factor of 100.
4.3.1.2. Knotted DNAs
As mathematical objects, knots and links have been studied for more than a hundred years. The question of the possible existence of such topological states in molecules has been raised at least since the late 1940s (see Frisch, 1993). It has acquired special interest since the discovery of closed circular DNA molecules. The calculations of the probability of knot formation upon closing a polymer chain (see Section 4.3.1.1) have posed the problem of the possible existence of knotted DNAs. The results indicated that the equilibrium fraction of knotted DNAs must be appreciable for circular DNAs containing more than about $10^3$ base pairs (30 Kuhn lengths). In most cases, DNA molecules are even longer, and the hypothesis has been put forward that the cell has special mechanisms that prevent the formation of knotted DNAs (Frank-Kamenetskii et al., 1975). In fact, in the course of replication of a knotted chain (at least for some types of knots) the daughter strands cannot separate. That is, the replication of knotted DNAs involves serious problems. Knotted molecules were first detected in preparations of single-stranded circular DNAs after they had been treated under special conditions with a type I topoisomerase (Liu et al., 1976). This was the first case when a knotted molecule was observed. However, the problem of knotting of normal, double-stranded DNAs continued to be very intriguing. It turned out that there is a special subclass of topoisomerases, called type II topoisomerases, which are capable of untying and tying knots in ccDNAs. Moreover, these enzymes catalyze the formation of catenanes from pairs or from a larger number of molecules of ccDNA. Here entire networks are formed, similar to those observed in vivo in kinetoplasts.
In contrast to type I topoisomerases, type II topoisomerases break, and then rejoin, both strands of DNA molecules. It has been shown that the enzyme “draws” a segment of the same or of another molecule lying nearby through the “gap” that is formed in the intermediate state between the ends that arise through breakage. Thus, the type II topoisomerases catalyze the process of mutual penetration of segments of the double helix through one another. This process has been elaborated in detail by Wang and his collaborators in their remarkable studies of the enzyme by various methods including X-ray crystallography (Berger et al., 1996; Wang, 1996). Consequently, these topoisomerases must lead to the establishment of complete topological equilibrium (i.e., to a distribution of molecules over the topological states that would correspond to freely permeable strands). As we have noted above, DNA molecules must not be very long if knotted molecules are to be detected reliably, but then the fraction of knots, as our calculations showed, must be small. Liu et al. (1980) were able to overcome this contradiction by using topoisomerase II in very large concentrations, at which it substantially changed the macromolecular properties of the DNA itself. Moreover, they did not add ATP, which the enzyme needs for its normal operation. Precisely under these extreme conditions, they found a considerable fraction of knotted molecules even in short DNAs having $N = 4.5 \times 10^3$. They were able to detect them initially from the appearance of new bands in the gel electrophoregram that corresponded to a greater mobility. The study of the properties of these fractions by various methods including electron microscopy has made it possible to show that they correspond to knots of various types. If topoisomerase II in the normal amount and ATP were added to a purified preparation of knotted molecules, rapid untying of the knots took place (Liu et al., 1980).
That is, the system rapidly relaxes to the equilibrium state for pure DNA molecules, in which, as our calculations predicted, there should be practically no knots for the given length. As to the reasons why the enzyme in high concentration sharply shifts the equilibrium toward knot formation, the most likely explanation is that the protein in high concentration decreases the dimensions of the polymer coil of DNA by changing the character of the interaction of remote segments along the chain. As our calculations showed (Frank-Kamenetskii and Vologodskii, 1981), even a small change in the dimensions of the polymer coil sharply increases the equilibrium fraction of knots. Knotted molecules of DNA (and also catenanes) were also obtained by sophisticated methods employing various enzymes of DNA site-specific recombination (Spengler et al., 1985). Although the above experimental observations did not contradict our theoretical expectations, the question of the quantitative validity of the theory remained open. Almost 20 years after we first published theoretical estimates of the probability of DNA knotting (Vologodskii et al., 1974; Frank-Kamenetskii et al., 1975), quantitative experimental data were reported (Rybenkov et al., 1993; Shaw and Wang, 1993) which fully agreed with the theory. In these experiments, the equilibrium fraction of knotted DNA molecules was measured quantitatively at various ionic conditions, with molecules carrying “cohesive” ends closing randomly in the absence of any proteins. By comparing the measured fraction with the theoretical predictions of Klenin et al. (1988), the value of the DNA effective diameter, d, was determined as a function of salt concentration. The dependence obtained proved to be in complete quantitative agreement with the theoretical predictions of Stigter (1977), which were based on the polyelectrolyte theory (see Section 8).
4.3.2. Torus links and ribbons
From the schematics in Fig.
7 it is clear that the two complementary strands of DNA form a link, in the topological sense. One can present a table of links similar to the table of knots in Fig. 8 (see Frank-Kamenetskii and Vologodskii, 1981). However, because the two complementary strands of DNA are attached to each other forming the double helix, the links which DNA can form belong to a subclass of all possible links. Namely, they form a class of the so-called torus links because the two strands can be placed on a torus. For torus links, the well-known Gauss integral, which defines the linking number value, Lk, is a strict topological invariant (see Frank-Kamenetskii and Vologodskii, 1981). There is another viewpoint on the torus links. The two strands in this case can be treated as the edges of a ribbon. Therefore, the topological theory of torus links is actually the theory of ribbons.
4.3.2.1. DNA supercoiling
The application of topological ideas to studying the properties of ccDNA was started by Fuller (1971), when he applied the results of the ribbon theory to analyzing the properties of these molecules. According to this theory (White, 1969; a simple derivation can be found in Frank-Kamenetskii and Vologodskii, 1981), besides the topological characteristic of a ribbon, the Lk value, two differential-geometric characteristics play an important role: the twist, Tw, of the ribbon, and its writhing, Wr. All three characteristics are interrelated by the condition:
$Lk = Tw + Wr$. (1)
The ccDNA is generally characterized not by the total quantity Lk, but by the number of excess turns (the number of supercoils $\tau$):
$\tau = Lk - N/\gamma_0$. (2)
The number of base pairs per turn of the double helix, $\gamma_0$, is rigorously fixed under given ambient conditions. However, upon changing the ambient conditions (temperature, composition of solvent, etc.), it can vary. Therefore, the number of supercoils $\tau$, in contrast to the Lk value, is a topological invariant of DNA only under fixed ambient conditions.
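The bookkeeping behind Eqs. (1) and (2) can be made concrete with a short sketch: the number of supercoils is Lk minus N divided by the number of base pairs per helical turn, and the writhing follows from Lk and the twist. The class and function names are hypothetical, and the plasmid size, linking number and helical repeat (10.5 base pairs per turn) are assumed values used only for illustration. The superhelical density is normalized as in the caption of Fig. 11 of this paper (number of supercoils times the helical repeat, divided by N).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosedCircularDNA:
    """Minimal description of a ccDNA molecule (illustrative only)."""
    n_bp: int       # N, number of base pairs
    lk: int         # linking number Lk, an integer for a closed molecule
    gamma0: float   # base pairs per helical turn under the given conditions

    def supercoils(self) -> float:
        """Number of supercoils: Lk - N / gamma0 (Eq. (2))."""
        return self.lk - self.n_bp / self.gamma0

    def superhelical_density(self) -> float:
        """Superhelical density: supercoils * gamma0 / N."""
        return self.supercoils() * self.gamma0 / self.n_bp


def writhe(lk: float, tw: float) -> float:
    """Writhing from Eq. (1), Lk = Tw + Wr."""
    return lk - tw


if __name__ == "__main__":
    # Assumed example values: 4200 bp, Lk = 380, 10.5 bp per turn.
    plasmid = ClosedCircularDNA(n_bp=4200, lk=380, gamma0=10.5)
    print("supercoils:", plasmid.supercoils())
    print("superhelical density:", plasmid.superhelical_density())
    # If, hypothetically, the twist were 394 turns, the rest of Lk is writhing.
    print("writhe for Tw = 394:", writhe(plasmid.lk, 394.0))
```

Because the helical repeat enters explicitly, re-evaluating the same molecule with a different `gamma0` changes the number of supercoils while Lk stays fixed, which is the point made in the text about ambient conditions.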
Very valuable information on the energy and conformation characteristics of ccDNA has arisen from experiments in which the value of Lk could vary, and the equilibrium distribution of the cc molecules over the Lk value was studied. The most convenient way to vary Lk is to employ the special enzymes we have already mentioned above, the topoisomerases. The studies under discussion employed type I topoisomerases, which alter the topological state of ccDNA by breaking and rejoining only one of the strands of the double helix. The mechanism of action of these enzymes has recently been elaborated in great detail (Wang, 1996). These enzymes relax the distribution of the molecules over the Lk value to its equilibrium state. The very sensitive gel-electrophoresis method was used to analyze the distribution of the ccDNA molecules over the Lk value. This method can easily separate two molecules of ccDNA that differ in Lk by just one (see Section 5.1.3). Naturally, the maximum of the equilibrium distribution always corresponds to $\tau = 0$ because this minimizes the elastic energy. Note that, although the quantity $\tau$ can adopt only discrete values spaced by unity (because Lk is an integer), it is not required to be an integer (because $N/\gamma_0$ generally is not). Therefore, as a rule, molecules having $\tau = 0$ do not appear in a preparation. A distribution in which the molecules having positive and negative values of $\tau$ are separated is obtained when the electrophoresis is performed under conditions differing from those under which the reaction with the topoisomerase is conducted. The change in the conditions means that we must substitute some other value $\gamma_0'$ for $\gamma_0$ in Eq. (2) without changing the Lk value. This means that the entire distribution is shifted by the amount $\delta\tau = N[(1/\gamma_0) - (1/\gamma_0')]$. Then the molecules that had the value $\tau$ in the original distribution will have the values $\tau' = \tau + \delta\tau$ in the new distribution.
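The shift of the topoisomer distribution described above can be illustrated numerically. The sketch below computes the shift as N times the difference of the inverse helical repeats (reaction conditions minus gel conditions) and applies it to a small set of topoisomers. The function names, the plasmid size and both helical repeats (10.5 and 10.4 base pairs per turn) are assumptions chosen for illustration, not values from the paper.

```python
def supercoil_shift(n_bp: int, gamma0_reaction: float, gamma0_gel: float) -> float:
    """Shift of the number of supercoils when the helical repeat changes
    from gamma0_reaction to gamma0_gel at fixed Lk:
    N * (1 / gamma0_reaction - 1 / gamma0_gel)."""
    return n_bp * (1.0 / gamma0_reaction - 1.0 / gamma0_gel)


def shifted_topoisomers(taus, delta_tau):
    """Apply the same shift to every topoisomer of the distribution."""
    return [tau + delta_tau for tau in taus]


if __name__ == "__main__":
    # Assumed example: 4200 bp plasmid, 10.5 bp/turn during the topoisomerase
    # reaction and 10.4 bp/turn under the electrophoresis conditions.
    delta_tau = supercoil_shift(4200, 10.5, 10.4)
    taus = [-2.0, -1.0, 0.0, 1.0, 2.0]   # topoisomers at reaction conditions
    shifted = shifted_topoisomers(taus, delta_tau)
    print(f"shift = {delta_tau:.3f}")
    for tau, tau_new in zip(taus, shifted):
        print(f"{tau:+.0f} -> {tau_new:+.3f}")
```

In this example the pairs with equal magnitude and opposite sign before the shift all acquire the same sign and different magnitudes afterwards, which is why running the gel under different conditions resolves them.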
If the value $\delta\tau$ is large enough, all of the topoisomers are well separated. Experiments have shown that the obtained distribution is always normal (Depew and Wang, 1975; Pulleyblank et al., 1975; Horowitz and Wang, 1984). The variance, $\langle\tau^2\rangle$, of this normal distribution was measured for different DNAs. These experiments have played a very important role in studying the physical properties of ccDNA. They made it possible to determine the free energy of supercoiling, which is directly connected to the variance:
$F = k_B T \tau^2 / (2\langle\tau^2\rangle) = 1100\, k_B T\, N^{-1} \tau^2$, (3)
where $k_B$ is the Boltzmann constant and $T$ is the absolute temperature.
4.3.2.2. Theoretical understanding of DNA supercoiling
Quantitative explanation and prediction of a variety of DNA topological characteristics, most notably the data on the equilibrium knotting probability and on the equilibrium distribution of ccDNA over topoisomers (see above and Section 7.1), demonstrated a remarkable success of the DNA elastic-rod model. The model also proved to be extremely successful in the theoretical treatment of the phenomenon of DNA supercoiling. In its traditional form, the Monte Carlo approach does not permit simulating highly or even moderately supercoiled molecules because the probability of their occurrence due to thermal motion is negligible. We have extended our Metropolis-Monte Carlo calculations (Frank-Kamenetskii et al., 1985a) to make it possible to generate supercoiled DNA molecules with arbitrary supercoiling (Klenin et al., 1991; Vologodskii et al., 1992). In brief, our computational procedure is as follows (the method is described at length by Vologodskii and Frank-Kamenetskii, 1992). We consider a phantom closed chain, in which self-intersections are allowed. Elementary steps to change the conformations are introduced.
After each elementary step, the energy is calculated:
$E_g(\{r\}) = E(\{r\}) + 2\pi^2 (C/hN)\,[\tau - Wr(\{r\})]^2$, (4)
where $E(\{r\})$ is the elastic energy of the DNA chain and $h$ is the distance between adjacent base pairs along the DNA axis. Then the regular Metropolis-Monte Carlo rules are applied: if the energy difference between the new and the previous conformations, $\Delta E_g$, is negative, the new conformation is accepted; if $\Delta E_g > 0$, the new conformation is accepted with the probability $\exp(-\Delta E_g / k_B T)$. However, this is only a conditional acceptance. The new conformation needs to meet two additional criteria. First, no two straight segments of the chain may approach each other closer than a distance d. Secondly, the chain should remain unknotted as a result of the conformational change. The knot checking procedure is carried out as in the case of the knotting probability calculations described in Section 4.3.1.1. An ensemble of chains thus generated is used to calculate different averaged characteristics of supercoiled molecules and enables one to obtain theoretical images of supercoiled molecules. Fig. 11 presents examples of such images. Our theoretical predictions about the shape of supercoiled DNA molecules agree with most available experimental data.
Fig. 11. Results of computer simulations of supercoiled DNA molecules for different values of superhelical density $\sigma = \tau\gamma_0/N$. The data are from Klenin et al. (1991).
Marko and Siggia (1994, 1995) developed an approximate analytical theory describing the structures of supercoiled DNA molecules. This theory provides insight into the role of entropic effects in the shapes of supercoiled DNA molecules of the type shown in Fig. 11.
4.4. Breakdown of the elastic-rod model: DNA unusual structures induced by supercoiling
With increasing negative supercoiling, the elastic-rod model breaks down.
This happens when the elastic energy stored in the form of bending and torsional deformations exceeds the energy necessary for local formation of unusual DNA structures. These unusual structures release superhelical stress, thus decreasing the total energy of the molecule. The competition between different unusual structures for the total pool of the superhelical energy depends dramatically on the presence of special sequence motifs, which favor various unusual structures. Before these unusual structures (cruciforms, Z-DNA, H-DNA) were discovered, the main reason for breakdown of the double helix was believed to be local melting (separation of DNA complementary strands, see Section 6). Anshelevich et al. (1979) and Vologodskii et al. (1979b) were the first to include DNA melting and cruciform formation in a comprehensive statistical mechanical treatment of supercoiled DNA. As other unusual structures emerged and their energy parameters became available, the treatment has been modified accordingly (Vologodskii and Frank-Kamenetskii, 1982; Frank-Kamenetskii and Vologodskii, 1984; Vologodskii and Frank-Kamenetskii, 1984; Anshelevich et al., 1988). These unusual structures are briefly described below.
4.4.1. Z-DNA
Negative supercoiling mostly favors formation of left-handed Z-DNA (see Section 2.4) because, in this case, the maximal release of superhelical stress per base pair adopting a non-B-DNA structure is achieved. As a result, although under physiological ambient conditions the Z form is energetically very unfavorable as compared with B-DNA, it is easily adopted in negatively supercoiled DNA by appropriate DNA sequences (with alternating purines and pyrimidines). Linear DNA with the appropriate sequence adopts the Z conformation at a very high salt concentration (about 3 M NaCl).
4.4.2. Cruciforms
Another structure readily formed under negative supercoiling is the cruciform, which requires palindromic regions (see Fig. 12).
To form a cruciform, a palindromic region should be larger than a certain minimum. For example, six-base-pair-long palindromes recognized by restriction enzymes do not form cruciforms under any conditions.
Fig. 12. A cruciform formed in ColE1 DNA when the molecule is in a superhelical state.
4.4.3. H-DNA
H-DNA forms a special class of unusual structures, which are adopted under superhelical stress by sequences carrying purines (A and G) in one strand and pyrimidines (T and C) in the other, i.e. homopurine-homopyrimidine sequences (reviewed by Mirkin and Frank-Kamenetskii, 1994; Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996). The major element of H-DNA is a triplex formed by one half of the insert adopting the H form and by one of the two strands of the second half of the insert (Fig. 13). Two major classes of triplexes are known: pyrimidine-purine-pyrimidine (PyPuPy) and pyrimidine-purine-purine (PyPuPu). Fig. 3 shows the canonical base-triads entering these triplexes. For each class, two isomeric forms of H-DNA are always possible; the four forms are designated H-y3, H-y5, H-r3 and H-r5, depending on which kind of triplex is formed and which half of the insert forms the triplex (see Fig. 13). H-DNA may be considered as an intramolecular triplex (it is often referred to in this way). Its formation under physiological ambient conditions occurs only under superhelical stress.
Fig. 13. H form structure of DNA. Two possible “isomeric” variants of the structure (H-y3 and H-y5) are shown. The Watson-Crick pairing is designated by filled circles, while the GC Hoogsteen pairing, involving the presence of an extra proton, is designated by plus symbols.
The discovery of H-DNA (Lyamichev et al., 1985, 1986; Mirkin et al., 1987; reviewed by Mirkin and Frank-Kamenetskii, 1994; Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996) stimulated studies of intermolecular triplexes, which may be formed between homopurine-homopyrimidine regions of duplex DNA and corresponding pyrimidine or purine oligonucleotides (see Section 2.6).
5. Special methods
In this section we consider the most important methods specially developed to study DNA. These methods have been introduced relatively recently (in the past twenty years), but in many cases they have pushed aside the traditional methods. They are widely used in genetic engineering and biotechnology. But they have also proved to be extremely useful tools in the field of DNA biophysics.
5.1. Gel electrophoresis
Gel electrophoresis is a simple technique, introduced in the early 1970s, which truly revolutionized the studies of DNA and, subsequently, the whole field of molecular biology and biotechnology. It is extensively used in biophysical studies of DNA. All experimental developments in this field in the past twenty years are connected, directly or indirectly, with the gel electrophoresis method. Gel electrophoresis has pushed aside ultracentrifugation as a method to separate DNA molecules.
5.1.1. Background
Gel electrophoresis differs from electrophoresis in solution only in the nature of the medium in which molecules are separated by the electric field. In the case of gel electrophoresis, the medium is a gel, a polymer network. The most popular gels in the field of DNA are made of polyacrylamide or agarose. Originally, the great advantage of using gels in separating DNA molecules was discovered purely empirically. The understanding came later, after some ideas of P.-G. De Gennes were borrowed from polymer physics, namely, the notion of reptation of polymer molecules (see, e.g., Grosberg and Khokhlov, 1994).
As in regular electrophoresis, the electrophoretic mobility, $\mu$, is defined as the proportionality coefficient between the velocity of movement, $v$, and the electric field, $E$:
$v = \mu E$. (5)
The electric force applied to DNA of length $L$ is proportional to $L$ (because each residue carries a negative charge). Hence
$\mu \propto L D$, (6)
where $D$ is the diffusion coefficient:
$D = \langle x^2 \rangle / \tau$, (7)
$\tau$ is the characteristic time for a DNA molecule to leave its original “tube”, and $\langle x^2 \rangle$ is the mean-square shift of the molecule after it has left the tube. All movements other than those within the “tube” are forbidden in the gel. As a result, the molecule experiences Brownian motion only along its own axis (i.e., within the “tube”). Because the friction in the course of such movement is proportional to $L$, the diffusion coefficient along the tube is proportional to $1/L$, and the time needed to diffuse over the tube length $L$ is
$\tau \propto L \cdot L^2 = L^3$. (8)
For the ideal polymer coil, $\langle x^2 \rangle \propto L$, and we finally obtain:
$\mu \propto 1/L$, (9)
whereas without the gel a similar consideration would lead to the lack of dependence of $\mu$ on $L$. Eq. (9) explains why the gel is so efficient a medium to separate DNA molecules according to their lengths during electrophoresis. The consequences of Eq. (9) are really far-reaching. The entire idea of genetic engineering, i.e., reshuffling of DNA pieces extracted from different organisms, has become feasible only after two major breakthroughs: the discovery of restriction enzymes, which cut long DNA molecules into shorter pieces by recognizing special short nucleotide sequences, and the implementation of gel electrophoresis to separate the pieces obtained after cutting. Each piece forms its own band in the gel after the electric field is switched off. Then the gel is cut with an ordinary razor to obtain one unique piece of DNA. Restriction enzymes and gel electrophoresis made it possible to obtain samples of absolutely identical DNA molecules of practically any length for biophysical studies.
5.1.2.
Pulsed-field gel electrophoresis Even gel electrophoresis has its limitations. According to Eq. (9), with increasing length of molecules their electrophoretic mobility decreases. Therefore, to separate very long DNA molecules in a practically acceptable time scale, one needs to increase the electric field. However, the treatment in Section 5.1.1 is valid only for the case of very low electric fields, which do not deform the molecules (otherwise Eq. (6) would fail). At high fields, the DNA molecule straightens along the electric field. As a result, it moves not like a polymer coil but like a rod-like, straight object. The electrophoretic mobility of such a molecule does not depend on its length independently of whether it moves in pure solvent or in gel, because, in this case, the friction and the driving force are both proportional to L. Thus, in a strong field, separation with respect to length occurs only for a short time before DNA molecules are straightened. It is totally senseless to conduct electrophoresis longer than this time because all molecules, independently of their length, will just shift by the same distance. Does this mean that long DNA molecules cannot be separated in gel? Schwartz and Cantor (1984) found a simple way out of the deadlock. If, soon after molecules are straightened, the direction of the electric field is significantly changed, then, before the molecules are straightened in the new direction, they assume again the shape of a polymer coil and will be separated for the same short time as while moving in the first direction. After straightening in the second direction, the field again is switched to the first direction, etc. As a result of such cycles or pulses, the molecules effectively move in the diagonal direction and the separation takes place throughout the duration of the experiment. Pulsed-field gel electrophoresis dramatically increased the range of lengths of DNA molecules that can be separated in gels. 
The method makes it possible to separate entire chromosomal DNA molecules. Implementation of this method opened the way to such ambitious projects as the Human Genome Project, which is designed to sequence the entire human genome.
5.1.3. Separation of DNA of different topological forms
Gel electrophoresis created a kind of revolution in the field of DNA topology and supercoiling. Although for a different reason than linear DNA, closed circular DNA molecules belonging to different topological classes move in a gel with different velocities. As a result, knots of different types can be separated in a gel (Rybenkov et al., 1993; Shaw and Wang, 1993). The same is true for different topoisomers, ccDNA molecules differing in the linking number Lk (see Fig. 14). It is not the Lk value according to which DNA molecules are actually separated, but the writhing number, Wr (see Section 4.3.2.1). More precisely, the mobility depends on the absolute value of writhing, $|Wr|$.
Fig. 14. Separation of DNA molecules differing by the number of superhelical turns, done by the gel electrophoresis technique. The experiment was conducted with DNA of a small pAO3 plasmid, containing 1683 nucleotide pairs. Originally, the molecules were put from the top, near the negative electrode (the place is not shown in the figure).
Fig. 15. A typical pattern of two-dimensional gel electrophoresis, observed during the formation of an unusual structure in DNA.
5.1.4. Two-dimensional gel electrophoresis
The ordinary, one-dimensional gel electrophoresis does not separate supercoiled molecules that have the same absolute value of the number of superhelical turns, $\tau$, but different sign, because those molecules have the same absolute value of writhing, $|Wr|$.
When a structural transition into an unusual structure occurs, although the Lk value does not change, both twisting and writhing change (their sum, which is the linking number, remaining unchanged). As a result, a topoisomer carrying an unusual structure may move in a gel with the same speed as another topoisomer without an unusual structure. To avoid such confusion, two-dimensional gels are used. A specially prepared mixture of different topoisomers of one and the same DNA, carrying an insert capable of changing into an alternative structure, is placed in the top left corner of a rectangular gel plate (see Fig. 15). Then an electric field is applied to force DNA molecules to move from top to bottom along the left edge of the plate. Following the separation of topoisomers in the first direction, the gel is saturated with chloroquine molecules, which lessen the superhelical stress. The chloroquine concentration is chosen in such a way as to make the superhelical stress insufficient for the formation of an unusual structure. After that, the direction of the electric field is changed to force the molecules to move from left to right. As a result, the sequence of spots in the second direction corresponds to the topoisomers’ sequence. The uppermost spot in Fig. 15 corresponds to the zero topoisomer, i.e., to a relaxed, nonsuperhelical DNA. The spots going clockwise from it correspond to positive topoisomers; those going anticlockwise, to negative ones. One can clearly see the mobility drop, observed in this case between topoisomers -10 and -12. This means that in topoisomers -12, -13, ..., an unusual structure is present, while in topoisomers ..., -9, -10 it is absent. Topoisomer -11 occupies an intermediate position: it carries the unusual structure during, roughly, half the time of its movement in the gel.
5.2.
Chemical, photochemical and enzymatic probing A large variety of special approaches have been attempted to study DNA structures. They are based on different reactivity of DNA adopting different structures with respect to chemical, photochemical and enzymatic reactions. In many cases, these methods make it possible to arrive at very specific conclusions about the structure of a particular region of DNA under conditions that totally exclude application of not only X-ray crystallography or NMR but even spectroscopy and other indirect physical methods. Sometimes, the methods under consideration are applicable even in vivo. To explain the general ideology underlying these methods, let us consider a specific example. In Section 2.6 we mentioned intermolecular triplexes, which are formed between homopurine-homopyrimidine regions of DNA and the corresponding homopurine or homopyrimidine oligonucleotides. If such a complex is actually formed, the reactivity of the N7 position of guanine (this is one of the two nitrogens in the five-member ring of guanine) should dramatically decrease because this nitrogen is sheltered in the triplex by the Hoogsteen pairing (see Fig. 3). The chemical agent used is dimethyl sulfate (DMS), which reacts with the N7 position of guanine alkylating it. This alkylation occurs in single-stranded as well as in duplex nucleic acids. However, it cannot take place in triplexes. As a result, in the complex of duplex DNA with oligonucleotide, which forms a triplex, all guanines in the duplex outside the triplex zone will be alkylated by DMS, whereas guanines within the triplex zone will remain unmodified. Then the DNA piece under study is end-labeled and subjected to hot piperidine treatment. Piperidine will convert the sites of alkylated guanines into chain breaks. Such breaks will never occur in the triplex zone. Fig. 16 shows the pattern that is obtained after separation of the fragments in gel and radioautography. 
The above example is a specific case of the footprinting assay. Such assays can be applied, for instance, to complexes of DNA with proteins to find out which sequences are recognized by the proteins. Instead of DMS, DNase I, which cuts the uncovered DNA duplex, is often used. The yield of some photoproducts, which can also be converted to strand breaks, dramatically decreases when a duplex region is covered by a protein or an oligonucleotide. Hence the photofootprinting assay is useful (Lyamichev et al., 1990, 1991). Some chemical reagents, like diethyl pyrocarbonate, potassium permanganate and osmium tetroxide, do not react with bases in the double helix but react with open bases. The products can be converted into chain breaks. These reagents are widely used to detect open regions. Single strand-specific nucleases, which digest single strands but do not digest duplex, are used in a similar way.

Fig. 16. The result of a footprinting experiment with dimethyl sulfate on a complex of duplex DNA carrying a homopurine-homopyrimidine insert with the corresponding pyrimidine oligonucleotide. The data are from Cherny et al. (1993a).

Chemical, photochemical and enzymatic probing is an extremely powerful method to detect unusual structures, like Z-DNA, cruciforms, H-DNA and G-quadruplexes.

6. Melting of DNA

Soon after the discovery of the double helix by Watson and Crick (1953) the phenomenon of DNA melting was demonstrated experimentally. It was shown that when the DNA solution is heated, the complementary strands separate: instead of the regular double helix two single-stranded DNA coils emerge (Marmur and Doty, 1962). This phenomenon is also called the helix-coil transition. DNA melting may be monitored by various techniques. The two most popular methods are UV spectrometry (see Section 3.4.1) and microcalorimetry (reviewed by Breslauer et al., 1992). Instead of exhibiting a phase transition, DNA melts gradually, over a wide temperature range (Fig. 17). DNAs from different organisms differ in their melting profiles.

Fig. 17. Melting of DNA. (a) The helix-coil transition of a DNA molecule (intramolecular melting). (b) Typical DNA melting profile (horizontal axis: temperature, with ticks from 64 to 80). This curve is also often called the differential melting curve. The curve was obtained for DNA which has the code name ColE1 and contains about 6500 nucleotide pairs.

6.1. Helix-coil model

In attempts to understand the phenomenon of DNA melting, a simplified theoretical model was elaborated (see Fig. 5) which treated DNA as a one-dimensional array of interacting spins. Each spin corresponded to a DNA base pair. Spin up corresponded to the helical state while spin down corresponded to the melted (open) state of the base pair. Two features made the problem much more difficult and much more interesting than the one-dimensional Ising model well known in solid state physics. First, because open regions in DNA are closed polymer chains (loops), a long-range interaction between spins emerged. Secondly, because the two base pairs in DNA (AT and GC) have different stability, DNA had to be modeled as a linear array of spins under the influence of a disordered external magnetic field. Although irregular, the sequence is fixed so that the external field is quenched. Therefore, the system is equilibrated with respect to the direction of spins (up and down) but not with respect to the field (base pairs AT and GC). I.M. Lifshitz labeled such systems as having linear memory. This second feature of the DNA helix-coil model presented a major challenge to theorists and attracted considerable attention in the 1960s and 1970s. Like DNA topology, DNA melting belongs to the class of biophysical problems that are sometimes labeled as “biologically inspired physics” (Peliti, 1990).
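To make the spin picture concrete, here is a minimal Python sketch of the helix-coil model with the quenched AT/GC "field" but without the long-range loop interaction, i.e. the simplified version that maps onto a one-dimensional Ising chain. It uses a standard forward-backward (transfer-matrix) recursion that runs in time linear in the chain length. All numerical values (`dh`, `t_at`, `t_gc`, `sigma`) and function names are illustrative assumptions, not parameters taken from this review, and the sketch is not any of the published algorithms discussed in this section.

```python
import math

R_GAS = 1.987e-3  # gas constant, kcal/(mol K)


def stability_constants(seq, temp, dh=8.0, t_at=340.0, t_gc=380.0):
    """Stability constant s_i of every base pair at temperature temp (K).

    dh (kcal/mol) and the melting temperatures of pure AT and pure GC
    DNA are assumed, illustrative values. Any letter other than G or C
    is treated as an AT pair.
    """
    consts = []
    for base in seq.upper():
        t_pair = t_gc if base in "GC" else t_at
        consts.append(math.exp(dh / R_GAS * (1.0 / temp - 1.0 / t_pair)))
    return consts


def helix_probabilities(seq, temp, sigma=1e-4):
    """Probability that each base pair is helical (nearest-neighbour model).

    Weights: s_i for a helical pair following a helical pair, sigma * s_i
    for a helical pair following an open pair, 1 for an open pair.
    Loop entropy (the long-range interaction) is deliberately omitted.
    """
    s = stability_constants(seq, temp)
    n = len(s)
    # Forward pass: rescaled weights of "pair i helical" / "pair i open".
    forward = []
    h, c = s[0], 1.0
    for i in range(n):
        if i > 0:
            h, c = s[i] * (h + sigma * c), h + c
        norm = h + c
        h, c = h / norm, c / norm
        forward.append((h, c))
    # Backward pass, combined with the stored forward weights.
    probs = [0.0] * n
    bh, bc = 1.0, 1.0
    for i in range(n - 1, -1, -1):
        if i < n - 1:
            bh, bc = s[i + 1] * bh + bc, sigma * s[i + 1] * bh + bc
            norm = bh + bc
            bh, bc = bh / norm, bc / norm
        fh, fc = forward[i]
        probs[i] = fh * bh / (fh * bh + fc * bc)
    return probs


def helicity(seq, temp, sigma=1e-4):
    """Average fraction of helical base pairs."""
    probs = helix_probabilities(seq, temp, sigma)
    return sum(probs) / len(probs)


# Hypothetical 200-bp sequence: an AT-rich half followed by a GC-rich half.
sequence = "AT" * 50 + "GC" * 50
for temp in (300.0, 360.0, 400.0):
    print(temp, round(helicity(sequence, temp), 3))
```

With these assumed parameters the AT-rich half melts well before the GC-rich half, which is the qualitative origin of the fine structure of melting profiles. Including the loop-entropy factor couples distant base pairs and is what makes the full problem computationally harder.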
It is worth mentioning that knots first emerged in the Russian biophysics community not in connection with circular DNAs but in connection with closed circles of single-stranded DNA formed in the process of DNA melting. I believe this question first attracted the attention of a wide audience during I.M. Lifshitz’s brilliant, I would even say charismatic, lecture at one of the regular Winter Schools on Molecular Biology in Dubna near Moscow (I guess it was in 1969). Speaking about the possibility of diffusion of knots from the ends of linear DNA in the process of melting, I.M., with a perfectly theatrical gesture, took the belt from his pants and tied it into a trefoil knot in front of a stunned audience of about 500 Russian molecular biologists and biophysicists. In the literature, the possible topological effects due to the circular nature of melted DNA regions were first discussed by Shugalii et al. (1969) and Vedenov et al. (1971).

6.1.1. Theoretical development

In statistical-mechanical terms, the second feature of the DNA helix-coil model (the linear memory due to the fixed sequence of DNA base pairs) means that one cannot average the partition function over different sequences of AT and GC pairs even if one assumes that the sequence itself is totally random. In reality, of course, the sequence is not random because it carries the genetic information. However, at early stages of treatment of the DNA melting phenomenon, long before the first real DNA sequences became available, the sequence was assumed to be random in theoretical studies. This made it possible to apply not only numerical but also analytical tools to treat the problem. The most elegant analytical approach was proposed by Lifshitz (1973). Among others, important contributions of Vedenov et al. (1971) and Azbel (1972) are worth mentioning.
As to the numerical solution, the challenge was to replace direct computation of the partition function for a chain consisting of a very large number ($N$) of base pairs (“spins”), which requires exponentially large computer time, by a procedure requiring polynomial time $N^{\alpha}$ with as small an $\alpha$ as possible. Several rigorous algorithms were proposed (Vedenov et al., 1967, 1971; Poland, 1974). However, an efficient way of solving the problem, which allowed for both the above features of the DNA helix-coil model, was not available until Fixman and Freire (1977) proposed their algorithm. In so doing they relied heavily on the Poland (1974) algorithm and some of our results (Frank-Kamenetskii and Frank-Kamenetskii, 1969; Lukashin et al., 1976). Theoretical development of the helix-coil model has been extensively reviewed by Vedenov et al. (1971), Wada et al. (1980), Wada and Suyama (1985), and Wartell and Benight (1985). It is worth mentioning that the helix-coil model without long-range interactions found applications far beyond the area of DNA biophysics. Among other applications, the model has been extensively used to study the α-helix-coil transition in polypeptides, and most recently it was used by Selinger and Selinger (1996) to explain experimental data on chiral order in random copolymers consisting of two enantiomers.

6.1.2. Comparison with experiment

When the very first full DNA sequence appeared in 1977 (that of bacteriophage φX174), DNA biophysicists were well equipped to compare experimental DNA melting profiles quantitatively with theoretical predictions. This was first done by Lyubchenko et al. (1978). Essentially, it was the beginning of the end of the theme of DNA melting in DNA biophysics because theoretical prediction correlated with experiment sufficiently well. Even more direct comparison was done by Kalambet et al.
(1985) using electron-microscopy visualization of the melted regions in DNA with the known sequence at different stages of the melting process. Such comparisons and similar studies (reviewed by Wartell and Benight, 1985; Wada and Suyama, 1986) left no doubt that we correctly understood, in quantitative terms, the major features of the phenomenon of DNA melting.

6.1.3. Heterogeneous stacking

A theme that dominated the field after the theory was first shown to explain quantitatively the experimental data for DNAs with known sequences was the so-called heterogeneous stacking. In the original helix-coil model, the external field could acquire only two values, corresponding to AT and GC pairs. This meant that interaction between all possible combinations of nearest neighbors along the DNA chain was assumed to be the same. Of the 16 possible types of nearest neighbors, or stacks, only 10 are different because of the complementarity rule. It was quite natural to attribute some remaining differences between theory and experiment to the fact that these 10 parameters are different, i.e., to the effect of heterogeneous stacking. However, the very fact that the original model, which ignored the difference, worked well indicated that the deviations from the mean interaction energy between adjacent base pairs were small as compared with the energy itself. In other words, these data indicated that the heterogeneous stacking was a small parameter. In the first paper where heterogeneous stacking was allowed for, Gotoh and Tagashira (1981) overlooked the fact that of the 10 parameters of heterogeneous stacking only 8 of their combinations (invariants) actually determine the behavior of long DNA chains. When they adjusted all 10 parameters of heterogeneous stacking by comparing theory with experiment, great confusion arose because, unexpectedly, the effect of heterogeneous stacking proved to be very large. Vologodskii et al.
(1984) dispelled the confusion by adjusting 8 invariants, not 10 parameters, in comparing theory with experiment. As a result, a reasonable set of relatively small parameters of heterogeneous stacking emerged (Vologodskii et al., 1984). Although some uncertainty in the exact values of parameters of heterogeneous stacking still remains (Doktycz et al., 1992; Doktycz et al., 1995; SantaLucia et al., 1996), the problem is mostly solved.

6.2. Slow relaxational processes

The remarkable success of statistical-mechanical theory in explaining the phenomenon of DNA melting overshadowed a significant limitation of the approach. For a long time experimental observations of hysteresis phenomena in DNA melting were largely ignored. However, when comparison of theory and experiment reached a high precision, kinetic effects in DNA melting could not be ignored any longer. A comprehensive analysis of slow relaxation processes in DNA melting was performed by Anshelevich et al. (1984a). The hysteresis phenomena in DNA melting are a direct consequence of the fact that very long regions are melted out cooperatively in the course of the process. The characteristic time of strand separation for a helical region consisting of $m$ base pairs may be roughly estimated as (see Anshelevich et al., 1984a, for more accurate expressions):

$$\tau \approx \tau_0 s^{m},$$

where $\tau_0 \approx 10^{-\ldots}$ s and $s$ is the stability constant for a base pair. Although $s$ is very close to unity within the melting range, because $m$ is several hundred the $s^{m}$ value may be extremely large (for illustration, $s = 1.05$ and $m = 300$ give $s^{m}$ of about $2 \times 10^{6}$). Hence the very large $\tau$ values and a significant contribution of kinetic effects. Subsequent thorough experimental studies completely confirmed all major theoretical predictions (Kozyavkin et al., 1984, 1986; Wada and Suyama, 1986).

Concluding this section, we would like to stress that thorough experimental and theoretical studies of DNA melting have led to a virtually full understanding of the phenomenon.
It is a rare example of a real insight into a complicated biophysical process.

7. Fluctuations in DNA

Although under “physiological” conditions DNA occupies its “ground state”, i.e. it is predominantly in the B form, DNA functioning is impossible to understand without knowledge of its “excited” states. Transient occupation of these excited states explains the whole variety of reactions DNA experiences during its functioning. Indeed, fluctuations are responsible for DNA adjustment to a variety of proteins that interact with DNA, including key proteins like repressors, activators and RNA polymerase. DNA damage by radiation and chemical agents, including carcinogens and mutagens, is often possible due to fluctuations, which make the reactive groups, normally buried within the double helix, accessible to chemical and photochemical modification. Therefore, the question of the most probable DNA excited states is of paramount importance. The answer to this question critically depends on the topological state of DNA. In this section we mostly concentrate on the basic problem of the fluctuational motility of the double helix itself, without additional factors like supercoiling. We briefly touch on the effect of supercoiling at the end of the section.

7.1. Bending and torsional fluctuations of the double helix

Due to thermal motion, the DNA molecule experiences bending and torsional fluctuations. The amplitude of these fluctuations is determined by the values of DNA bending and torsional rigidities, respectively. We have already discussed the DNA bending rigidity, which is quantitatively characterized by the value of the DNA persistence length, $a$ (see Section 4). The value of DNA torsional rigidity can be determined from the data on the variance of the equilibrium distribution of closed circular DNA molecules over the linking number, $\langle \tau^2 \rangle$ (see Section 4.3.2.1).
Under equilibrium conditions allowing breaks and rejoining of one of the DNA strands (i.e., in the presence of topoisomerases) $\Delta Tw$ and $Wr$ are independent random quantities, while the resultant quantity $\tau$ equals their sum:

$$\tau = \Delta Tw + Wr .$$

Evidently, the mean values are $\langle \Delta Tw \rangle = \langle Wr \rangle = \langle \tau \rangle = 0$. However, the quantities $\langle (\Delta Tw)^2 \rangle$, $\langle (Wr)^2 \rangle$ and $\langle \tau^2 \rangle$ differ from zero. Then, because of the independence of the random quantities $\Delta Tw$ and $Wr$, one obtains (Vologodskii et al., 1979a):

$$\langle \tau^2 \rangle = \langle (\Delta Tw)^2 \rangle + \langle (Wr)^2 \rangle . \tag{12}$$

The quantity $\langle \tau^2 \rangle$ is known from experiment, and $\langle (Wr)^2 \rangle$ could be calculated by computer simulation because the ribbon theory offers a simple analytical formula for the $Wr$ value provided that the shape of the chain is known (see Frank-Kamenetskii and Vologodskii, 1981; Vologodskii and Frank-Kamenetskii, 1992). (Note that in so doing we extensively used our method of discrimination of knotted and unknotted chains, see Section 4.3.) To compare the values of $\langle \tau^2 \rangle$ and $\langle (Wr)^2 \rangle$ for the same DNA we need to know the value of the DNA Kuhn length, which is known with good accuracy to be equal to 100 nm (see Section 4). Therefore, we could find $\langle (\Delta Tw)^2 \rangle$ as a function of the DNA length by subtracting the calculated $\langle (Wr)^2 \rangle$ value from the experimental $\langle \tau^2 \rangle$ value. On the other hand, $\langle (\Delta Tw)^2 \rangle$ is directly related to the torsional rigidity of the double helix, $C$:

$$\langle (\Delta Tw)^2 \rangle = N \langle (\Delta\varphi)^2 \rangle / 4\pi^2 = h k_B T N / 4\pi^2 C , \tag{13}$$

where $h$ and $\varphi$ are the distance along the axis and the rotation angle between two adjacent base pairs in the double helix, respectively. In full agreement with this equation, $\langle (\Delta Tw)^2 \rangle$ proved to be strictly proportional to $N$. The slope of the straight line made it possible to determine $C$, which proved to be $3.0 \times 10^{-19}$ erg cm (Klenin et al., 1989). This value of the torsional rigidity of DNA corresponds to a root-mean-square amplitude of thermal fluctuations in the value of the angle between adjacent base pairs of 4-5°.
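The arithmetic behind Eq. (13) is easy to check. The short Python sketch below evaluates the root-mean-square twist fluctuation per base-pair step and the twist variance from the quoted torsional rigidity. The rise per base pair (0.34 nm, the usual B-DNA value) and the temperature (298 K) are assumptions that are not stated in this section, and the function names are illustrative.

```python
import math

K_B = 1.380649e-16  # Boltzmann constant, erg/K


def rms_twist_angle_deg(c_torsion, temp=298.0, rise_cm=0.34e-7):
    """RMS fluctuation of the angle between adjacent base pairs, in degrees.

    Uses <(delta phi)^2> = h * k_B * T / C, with C in erg cm and h in cm.
    """
    variance_rad2 = rise_cm * K_B * temp / c_torsion
    return math.degrees(math.sqrt(variance_rad2))


def twist_variance(n_bp, c_torsion, temp=298.0, rise_cm=0.34e-7):
    """<(Delta Tw)^2> in turns^2 for a chain of n_bp base pairs, Eq. (13)."""
    return n_bp * rise_cm * K_B * temp / (4.0 * math.pi ** 2 * c_torsion)


C_DNA = 3.0e-19  # erg cm, the value quoted in the text
print(rms_twist_angle_deg(C_DNA))
print(twist_variance(1000, C_DNA))
```

With these assumed inputs the sketch gives an RMS angle of roughly 4°, close to the lower end of the range quoted in the text; the exact figure depends on the temperature and rise per base pair chosen.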
The results obtained indicated that in sufficiently long supercoiled DNA one third of the superhelical energy is stored in twisting and two thirds are stored in writhing. Thus, the analysis of the experimental data on circular DNAs employing the topological approach made it possible to estimate one of the fundamental characteristics of the double helix. These estimates agree with the results obtained by other methods (see Taylor and Hagerman, 1990, and references therein).

7.2. Fluctuational openings of base pairs

Bending and torsional fluctuations in DNA are the result of accumulation of small displacements of DNA base pairs from their equilibrium positions. Extrapolating the theory of DNA melting (Section 6) to temperatures well below the DNA melting range one concludes that a tiny fraction of base pairs should be open at physiological temperatures. According to theoretical predictions, the double helix should be interrupted by solitary open base pairs on average every $10^{5}$ base pairs (Frank-Kamenetskii and Lazurkin, 1974; Frank-Kamenetskii, 1981, 1985). However, how reliable were the conclusions based on such long extrapolations? Certainly, experimental estimates of this fundamental characteristic of the double helix were needed. For many years, though, the question of base-pair opening probability, and the related question of the lifetime of a base pair in the closed state, were the subject of sharp controversy because different methods led to quite different conclusions. The most direct approach consisted in monitoring the kinetics of exchange of exchangeable hydrogen atoms, which are buried within the double helix if the base pair is closed. Such protons can exchange only while the base pairs are open. Such data led to an opening probability that was three orders of magnitude higher than the above theoretical estimate (Mandal et al., 1979; Cantor and Schimmel, 1980).
We used another approach for probing the open base pairs. We analyzed both experimentally and theoretically the kinetics of DNA reaction with formaldehyde. As in the case of hydrogen exchange, formaldehyde can react with bases only if they are open. Our comprehensive analysis led to an estimate of the base-pair opening probability that was in full agreement with the theoretical prediction of $10^{-5}$ (Frank-Kamenetskii, 1981, 1985, 1987). The controversy was resolved by Gueron et al. (1987), who studied the hydrogen exchange kinetics by nuclear magnetic resonance (NMR). They showed that in the previous analysis of the hydrogen-exchange data the intrinsic catalysis of the exchange had been overlooked. Quite unexpectedly, exchange of exchangeable protons in open base pairs is catalyzed by the complementary bases. This intrinsic catalysis overshadowed the effect of the external catalyst and led to the wrong conclusion that the observed exchange rates were limited by the rate of base-pair opening. An analysis of the NMR data (Gueron et al., 1987; Gueron and Leroy, 1995) gave an estimate of the opening probability of $10^{-5}$, in full agreement with our theoretical expectation and the figure that followed from the formaldehyde data (Frank-Kamenetskii, 1981, 1985, 1987).

The full picture of internal base-pair fluctuational opening emerged as follows. The base-pair lifetime is $10^{-2}$ s, the open base-pair lifetime is $10^{-7}$ s, and the base-pair opening probability is $10^{-5}$ (Frank-Kamenetskii, 1987; Gueron and Leroy, 1995). These values are mutually consistent: the opening probability is approximately the ratio of the two lifetimes, $10^{-7}/10^{-2} = 10^{-5}$. Of course, terminal base pairs have a much shorter lifetime of $10^{-6}$ s. The opening probability dramatically increases in the DNA “premelting” zone. Although well below the melting range base pairs open with very low probability, this fluctuational opening, or DNA “breathing”, plays an extremely important role. These fluctuations make the active groups of DNA bases, which are otherwise completely buried within the double helix, accessible for interaction and chemical reactions.
DNA modification by formaldehyde still remains the most thoroughly studied case where DNA breathing plays a crucial role (Lukashin et al., 1976; Frank-Kamenetskii, 1981, 1985).

7.3. Fluctuations in superhelical DNA

Negative supercoiling (see Section 4.3.2) greatly increases the probability of fluctuational formation of different structures that release the superhelical stress. This happens well below the threshold superhelical density corresponding to the breakdown of the elastic-rod model discussed in Section 4.4. A statistical-mechanical treatment of these fluctuations is presented in Anshelevich et al. (1979), Vologodskii et al. (1979b), Vologodskii and Frank-Kamenetskii (1982), Frank-Kamenetskii and Vologodskii (1984), Vologodskii and Frank-Kamenetskii (1984) and Anshelevich et al. (1988). In spite of numerous efforts, the possible role in the cell of transient formation of various structures induced by supercoiling still remains to be elucidated.

8. Polyelectrolyte properties of DNA

One of the most striking features of the DNA molecule is that each base pair carries two elementary negative charges. As a result, the DNA molecule is characterized by an extremely high linear charge density. This negative charge attracts small cations from solution (usually Na+), which create a positively charged cloud around the DNA chain. Because of the electrostatic interaction between the charged DNA molecule and the cloud of counterions, various important properties of DNA prove to be strongly dependent on the salt concentration (usually NaCl). For many years, the quantitative understanding of DNA polyelectrolyte properties has presented a challenge for DNA biophysicists.
By the early 1980s, primarily due to a seminal paper by Fixman (1979), it became clear that, although not rigorous, the Poisson-Boltzmann (PB) equation is applicable to the treatment of even a highly charged polyelectrolyte in a wide range of salt concentration (Fixman, 1979; Anderson and Record, 1982; Anshelevich et al., 1984b; Frank-Kamenetskii et al., 1987). As a result, the PB equation has become a major tool for theoretical treatment of DNA polyelectrolyte properties.

8.1. Cylinder model of DNA

A simplified model, which has been most popular in the field, treats DNA as a uniformly charged cylinder of diameter $d_g$. (We denote the DNA geometrical diameter by $d_g$ to distinguish it from the DNA effective diameter $d$ of the elastic-rod model considered in Section 4.) The negative charge of phosphate groups is supposed to be evenly spread over the surface of the cylinder (see Fig. 5). The Manning condensation theory (Manning, 1969, 1972; Cantor and Schimmel, 1980), which treats a limiting case of the model corresponding to $d_g \to 0$, has acquired enormous popularity among experimentalists because it led to very definite and simple conclusions.

8.1.1. Manning’s condensation theory

This theory models DNA as an infinitely thin charged thread with uniformly distributed linear charge density, $q$. According to an idea of Onsager that was followed up by Manning (1969), the $|q|$ value cannot exceed some critical value, $|q_c|$, because if $|q| > |q_c|$, the statistical integral for the infinitely thin thread diverges (see reviews by Manning, 1972; Anderson and Record, 1982; Frank-Kamenetskii et al., 1987). The theory predicts that the mobile counterions precipitate on the thread, decreasing the absolute value of its linear charge density down to the critical value, $|q_c|$.
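To give a feeling for the numbers involved in the condensation picture, the following Python sketch evaluates the quantities usually quoted for it. It relies on the standard textbook form of the theory, which is not spelled out in this section: for monovalent counterions the critical charge density corresponds to one elementary charge per Bjerrum length $l_B$, so the dimensionless parameter $\xi = l_B / b$ (with $b$ the axial spacing between charges) must exceed 1 for condensation, and the fraction of the polyion charge compensated by condensed counterions is $1 - 1/\xi$. The dielectric constant of water (78.4), the temperature (298.15 K) and the B-DNA rise of 0.34 nm per base pair are assumed values; function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_B = 1.380649e-23          # Boltzmann constant, J/K


def bjerrum_length_nm(eps_r=78.4, temp=298.15):
    """Distance at which two elementary charges interact with energy k_B*T."""
    l_b = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temp)
    return l_b * 1e9


def manning_parameter(charge_spacing_nm, eps_r=78.4, temp=298.15):
    """xi = l_B / b; condensation is predicted when xi > 1."""
    return bjerrum_length_nm(eps_r, temp) / charge_spacing_nm


def condensed_fraction(xi):
    """Fraction of polyion charge compensated by monovalent counterions."""
    return max(0.0, 1.0 - 1.0 / xi)


# Two charges per base pair and an assumed rise of 0.34 nm give b = 0.17 nm.
xi_dna = manning_parameter(0.34 / 2.0)
print(bjerrum_length_nm(), xi_dna, condensed_fraction(xi_dna))
```

Under these assumptions duplex DNA lies far above the condensation threshold, which is why the theory predicts that most of its charge is compensated.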
These arguments led Manning to numerous and very handy equations describing the dependence of various DNA properties on ionic strength (reviewed by Manning, 1972; see, e.g., Cantor and Schimmel, 1980). Because the theory proved to be remarkably successful in explaining some DNA properties, it was tempting to believe in the reality of the phenomenon of counterion precipitation, or condensation. However, the PB equation does not predict any separation of the pool of counterions into two distinct groups or any discontinuity in the counterion concentration around the charged cylinder even in the limiting case of $d_g \to 0$. Before the early 1980s this was generally considered a failure of the Poisson-Boltzmann equation because this equation is derived within the self-consistent-field approximation. After rehabilitation of the PB equation (Fixman, 1979; Anderson and Record, 1982; Anshelevich et al., 1984b; Frank-Kamenetskii et al., 1987) and due to the results by Gueron and Weisbuch (1981), Ramanathan and Woodbury (1982) and Zimm and Le Bret (1983), the situation was clarified. The separation of the pool of counterions into two distinct groups or a discontinuity in the counterion concentration around the polyion does not actually take place. Instead, all counterions form a cloud around the polyion with the counterion concentration smoothly decreasing with distance from the polyion. Manning’s condensation theory correctly predicts the counterion distribution at very large distances from the polyion but gives a totally wrong picture of the counterion distribution in the vicinity of the polyion. This explains why the condensation theory predicts some of the DNA properties well and fails to predict others (for more discussion see the review by Frank-Kamenetskii et al., 1987).

8.1.2. Poisson-Boltzmann equation

Since it became clear in the early 1980s that the PB equation provides a reliable ground for theoretical studies of polyelectrolytes, the PB equation has become a major tool for theoretical treatment of DNA polyelectrolyte properties. In this section we briefly overview the major results of applying the PB equation to DNA within the framework of the cylinder model.

8.1.2.1. DNA melting. The theory makes it possible to calculate the ionic-strength dependence of the DNA melting temperature ($T_m$). To do so, one needs to calculate the electrostatic free energy of DNA surrounded by small ions. We performed such calculations using the cylinder model for both duplex DNA and separated DNA strands (Frank-Kamenetskii et al., 1987). More recently, similar calculations were performed by Bond et al. (1994), who extended them to the case of melting of DNA triplexes. A major conclusion from these calculations is that the ionic-strength dependence of $T_m$ is very sensitive not only to the linear charge density of the duplex and single-stranded states (that was the case, of course, for the condensation theory) but also to the diameter of the cylinder assumed for both DNA states. Such a parameter simply does not exist in the condensation theory. Taking the helix parameter values from the geometry of B-DNA, we adjusted the parameters for single-stranded DNA from the condition of the best fit between the theoretical and experimental dependence of $T_m$ on ionic strength (Frank-Kamenetskii et al., 1987). The conclusion was that the distance between adjacent phosphate groups in single-stranded DNA is the same as the distance between projections on the DNA axis of adjacent phosphate groups of a DNA strand within the B-DNA duplex. This conclusion was unexpected because, apparently, there are no steric obstacles for the free DNA single strand to be significantly stretched out as compared to the regular helical form it has within the duplex.

Fig. 18. Chemical structure of peptide nucleic acid (PNA) and DNA. B is one of the four canonical DNA bases, R is the PNA terminal group.

It should be emphasized that although Manning’s explanation of the $T_m$ dependence on ionic strength was considered a great success of the condensation theory and found its way into textbooks (e.g., Cantor and Schimmel, 1980), from the theoretical viewpoint the condensation theory is totally inapplicable to the calculation of the electrostatic free energy (see Frank-Kamenetskii et al., 1987; Lukashin et al., 1991a). Therefore, the PB derivations described above were essentially the first attempt at a consistent theoretical analysis of the problem. Very recently, such analysis was extended to interpret melting experiments for peptide nucleic acid (PNA) and its complexes with DNA. PNA is an artificial molecule, which consists of DNA bases and a neutral polypeptide-like backbone (see Fig. 18; we discuss PNA at greater length in Section 9). Two complementary PNA molecules form a PNA/PNA duplex, and complementary PNA and DNA strands form a PNA/DNA heteroduplex. Tomac et al. (1996) compared the ionic-strength dependence of $T_m$ for PNA/PNA, PNA/DNA and DNA/DNA duplexes. While for the PNA/PNA duplex the $T_m$ value did not depend on salt concentration due to neutrality of the molecule, in the case of PNA/DNA duplexes $T_m$ slightly decreased with increasing ionic strength. Tomac et al. (1996) successfully explained this effect using essentially the same analysis as was done by Frank-Kamenetskii et al. (1987) and Bond et al. (1994) for DNA melting and adjusting the only unknown parameter, the diameter of the PNA/DNA duplex. These data, together with the data for DNA triplexes by Bond et al. (1994), demonstrate that the cylinder model in combination with the PB equation is a very efficient tool for quantitative description of a wide variety of melting experiments.

8.1.2.2. B-Z transition. Even before applying the cylinder model to DNA melting, we applied it to the B-Z transition (Frank-Kamenetskii et al., 1985b). Actually, the data on the dependence of the B-Z equilibrium on ionic concentration provided the first clear indication from the experimental side that something was fundamentally wrong with the condensation theory. Indeed, because the linear charge density for Z-DNA is lower than for B-DNA, the condensation theory predicted increasing relative stability of B-DNA with respect to Z-DNA with increasing ionic concentration. In reality, high salt stabilizes Z-DNA relative to B-DNA. We showed that the cylinder model and the PB equation resolve this controversy. Our calculations predicted the effect of maximal relative stability of B-DNA at intermediate ionic strength (Frank-Kamenetskii et al., 1985b). In other words, the electrostatic free energy difference between Z- and B-DNA exhibited a maximum at an ionic concentration near 0.1 M of monovalent salt. The effect was sensitive to the specific values of the parameters of the model. It remained unclear whether the effect would persist for more realistic models of charge distribution in B- and Z-DNA. This question was addressed recently by Demaret and Gueron (1993) and Misra and Honig (1996), who treated more realistic models of the charge distribution using the PB equation. The possibility of existence of Z-DNA at low salt, as well as at high salt, has been extensively studied experimentally by Ivanov et al. (1993).

8.1.2.3. DNA effective diameter. Reliable computations of the screened Coulomb potential around DNA within the framework of the cylinder model and the PB approach made it possible to calculate the excluded volume effects in DNA due to the electrostatic interactions.
Such calculations, first performed by Stigter (1977), yielded a theoretical prediction of the dependence of the DNA effective diameter, d, on ionic concentration. These theoretical results proved to be in complete quantitative agreement with experimental data, after the d value was determined from the data on DNA knotting probability (Rybenkov et al., 1993; Shaw and Wang, 1993; see Section 4.3.1.2).

8.2. Other models

We have already mentioned more realistic models, which allow for some deviations from the original cylinder model (Demaret and Gueron, 1993; Misra and Honig, 1996). Although these papers consider more realistic charge distributions, they still treat DNA as a stiff construction with a straight axis. Le Bret (1982) and Fixman (1982) calculated the electrostatic contribution to the polyelectrolyte bending energy. These data made it possible to explain the experimentally observed dependence of the persistence length of various polyelectrolytes on ionic strength (Tricot, 1984).

Concluding this section, it should be emphasized that all models treat the water surrounding the DNA as a continuous medium with a fixed value of dielectric permittivity. Lukashin et al. (1991b) considered the consequences of possible deviations from this assumption. We found that the character of the dependence of the water dielectric permittivity near the DNA surface may significantly affect the results. This question needs further study.

9. Specificity of DNA interactions

In previous sections, we concentrated on some of the traditional problems of DNA biophysics, which have been attracting attention for many years. Some of them (like DNA melting) have been essentially solved; others (like DNA topology or DNA polyelectrolyte properties), although traditional, still attract a good deal of attention. In this section I consider a problem which has only just emerged as a DNA biophysics problem.

In principle, the problem has long been around in the wider field of DNA science. This is the problem of sequence-specific recognition of sites on DNA by short single-stranded DNA chains (oligonucleotides) or their synthetic analogs. In the simplest form, this is the problem of "annealing" between a single DNA strand and an oligonucleotide complementary to a short piece of the DNA strand. Such complexing, due to the complementarity principle, is used in numerous techniques which have wide applications in molecular biology and biotechnology, such as DNA sequencing, hybridization (Southern blotting), polymerase chain reaction (PCR), etc. The central question for all these applications is the relation between specificity and affinity of binding of the oligonucleotide to single-stranded DNA. Similarly, a homopurine or homopyrimidine oligonucleotide can bind to duplex DNA via Hoogsteen pairing (see Section 2.6; reviewed by Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996). Finally, an oligonucleotide analog, peptide nucleic acid (PNA), forms a most unusual complex with duplex DNA, which involves two homopyrimidine PNA molecules (see below). Binding of PNA to duplex DNA exhibits a remarkable combination of high specificity and affinity.

9.1. Relationship between specificity and affinity at equilibrium

In the case of equilibrium binding, the problem is a very simple one, at least in principle. Indeed, according to von Hippel and Berg (1986) (we will refer to the paper as vHB), the specificity, $\alpha_s$, may be defined as the ratio of the concentration, [PL], of complexes of the ligand, L, with its specific site, P, to the sum of concentrations of complexes, $[B_iL]$, of the ligand with all non-specific sites $B_i$:

$$\alpha_s = \frac{[PL]}{\sum_i [B_iL]} \qquad (14)$$

Simple arguments based on the mass action law led von Hippel and Berg (1986) to the final equation for $\alpha_s$:

$$\alpha_s = \frac{K_s[P]}{\sum_i K_{ni}[B_i]} \qquad (15)$$

where $K_s$ and $K_{ni}$ are equilibrium constants of binding of the ligand to the specific and ith non-specific sites, respectively.

9.1.1. General considerations

Eq. (15) leads to the intuitively obvious conclusion that high selectivity is achieved when $K_s$ is much larger than all $K_{ni}$ taken together (under the natural assumption that [P] and each of the $[B_i]$ are of the same order of magnitude). For the sake of simplicity and without loss of generality, we concentrate on only two types of complex: the specific or "correct" one and a non-specific or "frustrated" complex, which is actually always a representative of a large number of possible non-specific complexes. Within the framework of the vHB treatment, the above two complexes are characterized by two equilibrium binding constants, $K_c$ and $K_f$, respectively. High specificity requires a very strong inequality to be valid:

$$K_c \gg K_f \qquad (16)$$

Thus, within the framework of the vHB consideration, high selectivity of binding may only be the result of a very large free energy gap between the correct and the non-specific or frustrated complexes. On the basis of vHB-type analysis, Eaton et al. (1995) formulated an important principle of drug selection, according to which high-affinity binding (i.e., binding with a high value of the equilibrium binding constant) entails highly specific binding.

Note that the protein folding process may be considered within the above general approach as an intramolecular self-assembly. Eq. (16) indicates that high specificity of folding requires selection of special protein sequences, for which a large free energy gap exists between folded and misfolded conformations. This conclusion fully agrees with the results of Monte Carlo simulations of simple protein models by Sali et al. (1994).

The above consideration and conclusions explain, at least qualitatively, the remarkable selectivity of biomolecular interactions in a variety of real situations, both in nature and in the course of selection of new drugs.
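The equilibrium relations above can be made concrete with a short numerical sketch. It evaluates the specificity of Eq. (15) for one specific site competing with many non-specific sites, and converts the ratio of two binding constants, as in Eq. (16), into a free energy gap via RT ln(K_c/K_f). The function names, the numerical values and the choice of kcal/mol units are illustrative assumptions and are not taken from von Hippel and Berg (1986).

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def specificity(k_s, p_conc, k_ns, b_concs):
    """Specificity as in Eq. (15): K_s [P] / sum_i K_ni [B_i].

    k_s, p_conc  : binding constant and concentration of the specific site
    k_ns, b_concs: binding constants and concentrations of non-specific sites
    (all names are hypothetical, chosen for this illustration only)
    """
    if len(k_ns) != len(b_concs):
        raise ValueError("k_ns and b_concs must have the same length")
    denominator = sum(k * b for k, b in zip(k_ns, b_concs))
    return k_s * p_conc / denominator


def free_energy_gap(k_c, k_f, temperature=298.15):
    """Free energy gap (kcal/mol) between correct and frustrated complexes."""
    return R_KCAL * temperature * math.log(k_c / k_f)


if __name__ == "__main__":
    # Assumed example: one specific site (K = 1e9 1/M) against 1000
    # non-specific sites (K = 1e4 1/M each), all at the same concentration.
    n_sites = 1000
    alpha = specificity(1e9, 1e-9, [1e4] * n_sites, [1e-9] * n_sites)
    gap = free_energy_gap(1e9, 1e4)
    print(f"specificity = {alpha:.1f}, free energy gap = {gap:.2f} kcal/mol")
```

With these assumed numbers a five-orders-of-magnitude ratio of binding constants is diluted to a specificity of about 100 by the thousand competing sites, which is why the inequality in Eq. (16) has to be very strong.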
However, one should not overlook two essential assumptions underlying the above reasoning: (i) equilibrium of binding of the ligand to the various binding sites is achieved and (ii) the strong inequality in Eq. (16) is valid for any non-specific site. These rather restrictive assumptions cannot be universally valid, and cases certainly exist for which a vHB-type treatment is inapplicable and its conclusions fail. Such cases are of special interest to us because they are met in the field of interaction of DNA with oligonucleotides and their analogs.

9.1.2. DNA case

Specificity of interaction between nucleic acids plays a crucial role in the fundamental biological processes of replication, transcription, translation and genetic recombination. The ability of DNA to form highly specific complexes underlies the most important biotechnological applications of DNA, such as various hybridization techniques, polymerase chain reaction (PCR), etc. In contrast to other cases of biomolecular recognition (DNA/protein, protein/protein, etc.), where there is no obvious universal principle, in the case of DNA a remarkably simple general principle, the complementarity rule, is available. As a result, the case of DNA interactions may be subjected to a comprehensive theoretical treatment within the framework of simple models, in which real molecules are stripped of all unnecessary details and only the features important for answering basic questions about specificity of interaction are left.

We have recently treated such models using the kinetic Monte Carlo approach (Lomakin and Frank-Kamenetskii, 1997). Our main conclusions are as follows. We have found that by changing the values of the parameters of the model one can achieve either high affinity and poor specificity or high specificity and poor affinity, but never both.
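A deliberately simplified equilibrium picture gives a feel for this kind of trade-off. The sketch below is not the kinetic Monte Carlo model of Lomakin and Frank-Kamenetskii (1997). It assumes independent sites, ligand in excess, a simple occupancy K[L]/(1 + K[L]) for each site, and a modest fixed penalty factor K_c/K_f for a mismatched site. All names and numbers are hypothetical.

```python
def occupancy(k_bind, ligand):
    """Equilibrium occupancy of one independent site: K[L] / (1 + K[L])."""
    x = k_bind * ligand
    return x / (1.0 + x)


def affinity_and_specificity(k_c, mismatch_penalty, ligand):
    """Return (occupancy of the correct site, correct/mismatched occupancy ratio).

    k_c              : binding constant of the correct site (1/M)
    mismatch_penalty : assumed ratio K_c / K_f for a mismatched site
    ligand           : free ligand concentration (M), assumed in excess
    """
    k_f = k_c / mismatch_penalty
    theta_c = occupancy(k_c, ligand)
    theta_f = occupancy(k_f, ligand)
    return theta_c, theta_c / theta_f


if __name__ == "__main__":
    ligand = 1e-6      # assumed ligand concentration, M
    penalty = 100.0    # assumed modest mismatch penalty, K_c / K_f
    for k_c in (1e5, 1e6, 1e7, 1e8, 1e10):
        theta_c, ratio = affinity_and_specificity(k_c, penalty, ligand)
        print(f"K_c = {k_c:.0e}  occupancy = {theta_c:.4f}  ratio = {ratio:.2f}")
```

Under these assumptions, weak binding leaves the correct site mostly empty while the occupancy ratio stays close to the penalty factor, and strong binding saturates both sites so that the ratio approaches unity. The two quantities cannot both be large unless the penalty itself is very large.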
Therefore, in contrast to the predictions of the vHB model, for which specificity correlates with affinity (von Hippel and Berg, 1986; Eaton et al., 1995; see above), in the case of binding of oligonucleotides to DNA via Watson-Crick pairing, affinity and specificity anti-correlate with each other. Obviously, the vHB model fails in this case because the strong inequality in Eq. (16) is not valid.

Our conclusion apparently contradicts the well-known great success of Watson-Crick recognition in countless applications, such as hybridization techniques (Southern blotting, in situ hybridization), Sanger sequencing, PCR, etc. How could all these methods possibly work if the Watson-Crick pairing were either very weak or non-specific? There is no real contradiction between our findings and the success of the above techniques. To be successful, these techniques do not require sequence recognition to be as stringent as we understand it. The target sites and corresponding oligonucleotides are sufficiently long that the probability of encountering sites with very few mismatches is negligible. Thus, in many practical cases of recognition of single-stranded DNA by oligonucleotides via Watson-Crick pairing, stringent sequence specificity is not necessary. However, other cases definitely exist for which such stringent recognition is essential.

Similar conclusions are valid for recognition of sites on duplex DNA via the Hoogsteen mode of binding (triplex formation, see Section 2.6).

9.2. Irreversible binding

The above consideration was grounded on the assumption that equilibrium is reached during the time of the experiment. This is not necessarily the case for interactions between DNA molecules, as we already discussed at some length in Section 6.2. According to Eq. (10), one can encounter large relaxation times in two alternative cases.
One can deal either with an s value close to unity but with very large m values (several hundred), which is the case for DNA melting (see Section 6.2), or with a not very large m value (of about 10) but with an s value significantly larger than unity. The most striking example of the latter situation is the interaction of DNA with its artificial analog, the peptide nucleic acid (PNA).

9.2.1. Peptide nucleic acid (PNA)

PNA is a synthetic DNA analog, which was put forward by Nielsen et al. (1991). It consists of the usual DNA bases attached to a totally different backbone, which resembles the protein, rather than the DNA, backbone (Fig. 18). The PNA backbone is neutral, in contrast to the DNA backbone, which carries negative charges.

9.2.1.1. PNA/DNA duplexes. PNA oligomers form duplexes with complementary PNA and DNA chains (Wittung et al., 1994; Egholm et al., 1993). However, because of the neutrality of the PNA backbone, the melting temperature ($T_m$) of PNA/PNA duplexes does not depend on the ionic strength (see Fig. 19). Interestingly, the $T_m$ value for PNA/DNA duplexes decreases with increasing ionic strength, in sharp contrast to DNA/DNA duplexes (see Fig. 19). This unusual behavior of the PNA/DNA duplex has made it possible to subject the DNA polyelectrolyte theory discussed in Section 8 to a new critical test. Tomac et al. (1996) have shown that this behavior cannot be consistently explained by Manning's condensation theory but finds its explanation within the framework of the DNA cylinder model, as was done for DNA/DNA duplexes by Frank-Kamenetskii et al. (1987) and Bond et al. (1994) (see also Section 8.1.2.1).

9.2.1.2. Complexes of PNA with duplex DNA. The difference between PNA and DNA oligomers manifests itself most strikingly in their interaction with duplex DNA. Only homopyrimidine PNAs (containing two types of bases, T and C) are known to form stable complexes with duplex DNA.
However, whereas homopyrimidine DNA oligomers form (DNA)3 triplexes with corresponding sites on duplex DNA (Section 2.6; reviewed by Frank-Kamenetskii and Mirkin, 1995; Soyfer and Potaman, 1996), homopyrimidine PNA oligomers form a totally different structure with the same DNA sites. Two PNA oligomers form a triplex with the homopurine strand of the double helix, leaving the homopyrimidine DNA strand displaced in a single-stranded form (Fig. 20). Such a structure, consisting of a (PNA)2/DNA triplex and a displaced DNA strand (Cherny et al., 1993b; Demidov et al., 1995), is called the P-loop. A major element of the P-loop is of course the (PNA)2/DNA triplex. The P-loop is formed, despite the formation of unfavorable helix boundaries, only because the triplex is remarkably stable.

Fig. 19. Salt dependence of melting temperature ($T_m$) of PNA/PNA (upper curve), PNA/DNA (middle curve) and DNA/DNA (lower curve) duplexes, plotted against log[Na+] (M). The data are from Tomac et al. (1996).

Fig. 20. The P-loop formation (see the text for explanation).

Why is this triplex much more stable than the canonical (DNA)3 triplex? Two factors contribute to this effect. First, whereas in the (DNA)3 triplex three negatively charged strands experience strong electrostatic repulsion from each other, in the (PNA)2/DNA triplex such repulsion is totally absent. The second factor follows from an X-ray crystallographic study of the triplex by Betts et al. (1995). According to the structural data, in addition to Watson-Crick and Hoogsteen pairing the complex is stabilized by hydrogen bonds between the amide nitrogens of the backbone of the PNA strand that forms Hoogsteen pairs with the DNA strand and the phosphate oxygens of the DNA backbone.
These additional hydrogen bonds, together with the lack of electrostatic repulsion within the triplex, make the (PNA)2/DNA triplexes the most stable nucleic acid-like complexes known to date. Because two PNA oligomers are involved in the recognition process, PNA clamps or bis-PNAs, which carry two PNA oligomers connected by a flexible linker, are often used to target homopurine sites on nucleic acids. Additional stabilization of the complex is achieved by incorporating positive charges into PNA or bis-PNA oligomers (Demidov et al., 1996; Veselkov et al., 1996a, b).

9.2.2. Specificity of PNA interaction with duplex DNA

Thus, homopyrimidine PNAs exhibit exceptionally high affinity to their target sites on DNA due to the unique properties of the PNA backbone. At the same time, the specificity of the PNA-DNA interaction is governed by essentially the same factors as in the case of usual DNA recognition, because specificity is determined by the same Watson-Crick and Hoogsteen base pairing. One should therefore expect PNA to exhibit very poor sequence specificity.

Amazingly, experiment shows that conditions exist under which the exceptionally high affinity of homopyrimidine PNAs to their target sites on dsDNA is accompanied by remarkable specificity of interaction (Demidov et al., 1995; Veselkov et al., 1996a, b). This finding led Demidov et al. (1996) to the conclusion that in the case of PNA-DNA complexes we encounter a new principle of biomolecular recognition. We concluded that the affinity of PNAs to their DNA target sites is so high that the binding should be considered as irreversible. If this is true, the vHB-type treatment obviously fails. Demidov et al. (1996) concluded, however, that although the overall process is irreversible, its first stage is highly reversible and consists of fluctuational opening of the DNA double helix (see Section 7.2) and transient formation of the Watson-Crick duplex between one PNA oligomer and the complementary DNA strand.
Only after the triplex is formed through the association of the second PNA oligomer does the complex become irreversible. Thus, a two-stage mechanism of complex formation between duplex DNA and homopyrimidine PNA makes it possible to reach both the unprecedented affinity and the very high sequence specificity of the interaction. First, a highly reversible "search" stage takes place, which is followed by an irreversible "locking" stage. Our theoretical analysis shows that in a wide range of parameters a model behaves in general agreement with this simple description (Lomakin and Frank-Kamenetskii, 1997).

It should be emphasized that because of the extremely high affinity of homopyrimidine PNAs to duplex DNA, the final state for a target site containing a small number of mismatches (one or two) as compared with the correct site will also correspond to its almost full occupation. So, to reach optimal sequence specificity, one has to be very careful in choosing the time of incubation of DNA with PNA, which should be long enough to secure high occupancy of the correct site but not so long that the mismatched site becomes significantly occupied. In other words, in the case of PNA/DNA interaction we deal with kinetic discrimination between correct and mismatched sites. Demidov et al. (1997) have recently subjected this question to a detailed theoretical analysis.

Extensive further studies are needed to prove the proposed mechanism of PNA interaction with duplex DNA. However, the very fact that PNA binds to duplex DNA with extremely high affinity and specificity has been proved beyond doubt (Demidov et al., 1995; Veselkov et al., 1996a, b). The data of Veselkov et al.
(1996a, b) are especially convincing because they demonstrate how these features of PNA make it possible to develop a very efficient method for cleaving long DNA molecules (consisting of millions base pairs) in very limited and fully specific sites. Acknowledgements This work was supported in part by NIH grant GM52201. References Anderson, C.F., Record, M.T., 1982. Polyelectrolyte theories and their applications to DNA. Annu. Rev. Phys. Chem. 33, 191-222. Anderson, CF., Record, M.T., 1995. Salt-nucleic acid interactions. Annu. Rev. Phys. Chem. 46, 657-700. Anshelevich, V.V., Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D., 1979. Statistical-mechanical treatment of violations of the double helix in supercoiled DNA. Biopolymers 18, 2733-2744. Anshelevich, V.V., Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D., 1984a. Slow relaxational processes in the melting of linear biopolymers. A theory and its application to nucleic acids. Biopolymers 23, 39-58. Anshelevich, V.V., Lukashin, A.V., Frank-Kamenetskii, M.D., 1984b. Towards an exact theory of polyelectrolytes. Chem. Phys. 91, 225-236. Anshelevich, V.V., Vologodskii, A.V., Frank-Kamenetskii, M.D., 1988. A theoretical study of formation of DNA noncanonical structures under negative superhelical stress. J. Biomol. Struct. Dyn. 6, 2477259. Azbel’, M.Y., 1972. The inverse problem for DNA. JETP Lett. 16, 128-13 1. Berger, J.M., Gamblin, S.J., Harrison, SC., Wang, J.C., 1996. Structure and mechanism of DNA topoisomerase II. Nature 379, 225-232. Betts, L., Josey, J.A., Veal, J.M., Jordan, S.R., 1995. A nucleic acid triple helix formed by a peptide nucleic acid-DNA complex. Science 270, 1838-l 84 1. Bond, J.P., Anderson, C.F., Record, M.T., 1994. Conformational transitions of duplex and triplex nucleic acid helices: thermodynamic analysis of effects of salt concentration on stability using preferential interaction coefficients. Biophys. J. 67, 825-836. Breslauer, K.J., Freire, E., Straume, M., 1992. 
Calorimetry: a tool for DNA and ligand-DNA studies. Methods Enzymol. 211, 533-567. Bustamante, C., Marko, J.F., Siggia, E.D., Smith, S., 1994. Entropic elasticity of lambda-phage DNA. Science 265, 1599-1600. Cantor, CR., Schimmel, P.R., 1980. Biophysical Chemistry. Freeman, San Francisco. Chen, L., Cai, L., Zhang, X., Rich, A., 1994. Crystal structure of a four-stranded intercalated DNA: d(C4). Biochemistry 33, 13 540-13 546. Cherny, D.I., Malkov, V.A., Volodin, A.A., Frank-Kamenetskii, M.D., 1993a. Electron-microscopy visualization of oligonucleotide binding to duplex DNA via triplex formation. J. Mol. Biol. 230, 379-383. Chemy, D.Y., Belotserkovskii, B.P., Frank-Kamenetskii, M.D., Egholm, M., Buchardt, O., Berg, R.H., Nielsen, P.E., 1993b. DNA unwinding upon strand-displacement binding of a thymine-substituted polyamide to double-stranded DNA. Proc. Natl. Acad. Sci. USA 90, 1667-1670. Clegg, R., 1992. Fluorescence resonance energy transfer and nucleic acids. Methods Enzymol. 211, 353-388. Demaret, J.P., Gueron, M., 1993. Composite cylinder models of DNA: application to the electrostatics of the B-Z transition, Biophys. J. 65, 1700-1713. Demidov, V.V., Yavnilovich, M.V., Belotserkovskii, B.P., Frank-Kamenetskii, M.D., Nielsen, P.E., 1995. Kinetics and mechanism of polyamide (“peptide”) nucleic acid binding to duplex DNA. Proc. Natl. Acad. Sci. USA 92, 2637-2641. 56 M. D. Frank-Kammetskii /Pllysics Reports 288 i 1997) 13&X) Demidov, V.V., Frank-Kamenetskii, M.D., Nielsen, E.P., 1996. Complexes of duplex DNA with homopyrimidine pcptide nucleic acid (PNA): new principle of biomolecular recognition, In: R.H. Sarma, M.H. Sarma (Eds.), Biomolecular Structure and Dynamics, vol. 2. Adenine Press, New York, pp. 1299134. Demidov, V.V., Yavnilovich, M.V., Frank-Kamenetskii, M.D., 1997. Kinetic analysis of specificity of duplex DNA targeting by homopyrimidine PNAs. Biophys. J. 72, N6. Depew, R.E., Wang, J.C., 1975. Conformational fluctuations of DNA helix. Proc. Nat]. Acad. 
Sci. USA. 72, 421554219. Deruchi, T., Tsurusaki, K., 1993a. A new algorithm for numerical calculation of link invariants. Phys. Lett. A 174, 29-37. Deruchi, T., Tsurusaki, K.. 1993b. Topology of closed random polygons. J. Phys. Sot. Japan 62, 1411-1414. Deruchi, T., Tsurusaki, K., 1994. A statistical study of random knotting using the Vassiliev invariants. J. Knot Theory and Its Ramifications 3, 321-353. Dickerson, R.E., 1992. DNA structure from A to Z. Methods Enzymol. 211, 67-l 11. Doktycz, M.J., Goldstein, R.F., Paner, T.M., Gallo, F.J., Benight, A.S., 1992. Studies of DNA dumbbells. I. Melting curves of 17 DNA dumbbells with different duplex stem sequences linked by T4 endloops: evaluation of the nearest-neighbor stacking interactions in DNA. Biopolymers 32, 8499864. Doktycz, M.J., Morris, M.D., Dormady, S.J., Beattie, K.L., Jacobson, K.B., 1995. Optical melting of 128 octamer DNA duplexes. Effects of base pair location and nearest neighbors on thermal stability. J. Biol. Chem. 270, 8439-8445. Dubochet, J., Adrian, M., Dustin, I., Furrer, P., Stasiak, A., 1992. Criolelectron microscopy of DNA molecules in solution. Methods Enzymol. 2 11, 50775 18. Eaton, B.E., Gold, L., Zichi. D.A., 1995. Let’s get specific: the relationship between specificity and affinity. Chemistry & Biology 2, 633-638. Egholm, M., Buchardt, O., Christensen, L., Behrens, C., Freier, SM., Driver, D.A., Berg, R.H., Kim, S.K., Norden, B., Nielsen, P.E., 1993. PNA hybridizes to complementary oligonucleotides obeying the Watson-Crick hydrogen-bonding rules. Nature 365, 5666568. Fenley, M.O., Manning, G.S., Olson, W.K., 1990. A numerical counterion condensation analysis of the B-Z transition of DNA. Biopolymers 30, 120551213. Fixman, M., 1979. The Poisson-Boltzmann equation and its application to polyelectrolytes. J. Chem. Phys. 70. 49955005. Fixman, M., 1982. The flexibility of polyelectrolyte molecules. J. Chem. Phys. 76, 6346-6353. Fixman, M., Freire, J.J., 1977. Theory of DNA melting curves. 
Biopolymers 16, 2693-2704. Frank-Kamenetskii, M.D., 1981. Fluctuations in DNA. Comments Mol. Cell. Biophys. I, 105-l 14. Frank-Kamenetskii, M.D., 1985. Fluctuational motility of DNA. In: E. Clementi et al. (Eds.), Structure & Motion: Membranes, Nucleic Acids & Proteins. Adenine Press, New York, pp. 417-432. Frank-Kamenetskii, M.D., 1987. How the double helix breathes. Nature 328, 17-18. Frank-Kamenetskii, M.D., 1993. Unraveling DNA. VCH, New York. Frank-Kamenetskii, M.D., 1997. Unraveling DNA. The Most Important Molecule of Life. Addison Wesley. Reading. Frank-Kamenetskii, M.D., Frank-Kamenetskii, A.D.. 1969. Theory of helix-coil transition for the case of double stranded DNA. Molek. Biol. 3, 375-382. Frank-Kamenetskii, M.D., Lazurkin, Y.S., 1974. Conformational changes in DNA molecules. Annu. Rev. Biophys. Bioeng. 3, 127-150. Frank-Kamenetskii, M.D., Lukashin, A.V., Vologodskii, A.V., 1975. Statistical mechanics and topology of polymer chains. Nature 258, 3988399. Frank-Kamenetskii, M.D., Lukashin, A.V., Anshelevich, V.V., Vologodskii, A.V., 1985a. Topsional and bending rigidity of the double helix from data on small DNA rings. J. Biomol. Struct. Dyn. 2, 1005-1012. Frank-Kamenetskii, M.D., Lukashin, A.V., Anshelevich, V.V., 1985b. Application of polyeiectrolyte theory to the study of the B-Z transition in DNA. J. Biomol. Struct. Dyn. 3, 35542. Frank-Kamenetskii, M.D., Anshelevich, V.V., Lukashin, A.V., 1987. Polyelectrolyte model of DNA. Sov. Phys. Usp. 30, 3 177330. Frank-Kamenetskii, M.D., Mirkin, S.M., 1995. Triplex DNA structures. Ann. Rev. Biochem. 64, 65-95. Frank-Kamenetskii, M.D., Vologodskii, A.V., 1981. Topological aspects of polymer physics: theory and its biophysical applications. Sov. Phys. Usp. 24, 679-696. Frank-Kamenetskii, M.D., Vologodskii, A.V., 1984. Thermodynamics of the B-Z transition in superhelical DNA. Nature 307, 481-482. Frisch, H.L.. 1993. Macromolecular topology. Metastable isomers from pseudo interpenetratrng polymer networks. New J. 
Chem. 17, 697-701. Fuller, F.B., 1971. The writhing number of a space curve Proc. Natl. Acad. Sci. USA 68. 815-819. Gehring, K., Leroy, J.L.. Gueron, M., 1993. A tetrameric DNA structure with protonated cytosine-cytosine base pairs. Nature 363, 561-565. Gotoh, 0.. Tagashira. Y., 1981. Stabilities of nearest-neighbor doublets in double-helical DNA determined by fitting calculated melting profiles to observed profiles. Biopolymers 20, 103331042. Grosberg. AI., Khokhlov. A.R., 1994. Statistical Physics of Macromolecules. AIP Press, New York. Gucron, M.. Demaret, J.P., 1992. A simple explanation of the electrostatics of the B-to-Z transition of DNA. Proc. Natl. Acad. Sci. USA 89. 5740-5743. Gucron, M.. Kochoyan, M., Leroy, J.L., 1987. A single mode of DNA base-pair opening drives imino proton exchange. Nature 328, 89-92. Gueron, M., Leroy. J.L., 1995. Studies of base pair kinetics by NMR measurement of proton exchange. Methods Enzymol. 261. 3833413. Gueron, M., Wcisbuch, G., 1980. Polyelectrolyte theory. I. Counterion accumulation, site-binding, and their insensitivity to polyelectrolytc shape in solutions containing finite salt concentrations. Biopolymers 19, 3533382. Horowitz, D.S., Wang, J.C., 1984. Torsional rigidity of DNA and length dependence of the free energy of DNA supercoiling. J. Mol. Biol. 173, 75-91. Ivanov, V.I., Krylov, D.Y., 1992. A-DNA in solution as studied by diverse approaches. Methods Enzymol. 21 I, I I I-127. Ivanov. V.I., Karapctian, A.T., Minyat, E.E., Sagi, J.. 1993. Z-form of DNA. Non-monotonous change in stability with increase of ionic strength, Molek. Biol. (Moscow) 27. 1150&l 156. Johnson, W.C., 1990. In: Saenger, W. (Ed.), Landolt-Bornstein Series. Group VII: Biophysics. Nucleic Acids, Subvol. IC. Springer. Heidelberg. pp. I-59. Kalambet, Y.A.. Borovik. A.S., Lyamichev. V.I., Lyubchenko, Y.L., 1985. Electron microscopy of the melting of sequenced DNA. Biopolymers 24, 3599377. Kang. CH., Zhang, X., Ratliff, R., Moyzis. R., Rich, A., 1992. 
Crystal structure of four-stranded Oxytricha telomeric DNA. Nature 356, 126-131. Klenin, K.V., Vologodskii, A.V., Anshelevich. V.V., Dykhne, A.M., Frank-Kamenetskii, M.D., 1988. Effect of excluded volume on topological properties of circular DNA. J. Biomol. Struct. Dyn. 5. 1173-l 185. Klenin, K.V., Vologodskii, A.V., Anshelevich. V.V., Klishko. V.Y.. Dykhne, A.M., Frank-Kamenetskii, M.D., 1989. Variance of writhe for wormlike DNA rings with excluded volume. J. Biomol. Struct. Dyn. 6, 707-714. Klcnin, K.V., Vologodskii. A.V.. Anshelevich, V.V., Dykhnc, A.M.. Frank-Kamenetskii, M.D., 1991. Computer stimulation of DNA supercoiling. J. Mol. Biol. 217, 4133419. Kozyavkin, S.A., Lyubchenko, Y.L., 1984. The nonequilibrium character of DNA melting: effects of heating rate on the fine structure of melting curves. Nucl. Acids Res. 12, 4339-4349. Kozyavkin, S.A.. Naritsin, D.B.. Lyubchenko. Y.L., 1986. The kinetics of DNA helix-coil subtransitions. J. Biomol. Struct. Dyn. 3, 6899704. Le Bret, M., 1982. Electrostatic contribution to the persistence length of a polyelectrolyte. J. Chem. Phys. 76, 6243-6255. Lc Doan. T.. Perrouault, L., Praseuth, D., Habhoub, N., Decout, J.L., Thuong, N.T., Lhomme, J.. Helene, C., 1987. Sequence-specific recognition. photocrosslinking and cleavage of the DNA double helix by an oligo-[alpha]-thymidylate covalently linked to an azidoproflavine derivative. Nucleic Acids Res. 15, 774997760. Lifshitz, I.M., 1973. On the statistical thermodynamics of fusion of long heteropolymer chains. Sov. Phys.-JETP 65, 1100~1110. Liu, L., Depew. R.E., Wang, J.C., 1976. Knotted single-stranded DNA rings: a novel topological isomer of circular single-stranded DNA formed by treatment with Esclwichiu co/i CL)protein. J. Mol. Biol. 106, 4399452. Liu, L.F., L.iu. CC.. Alberts, B.M., 1980. Type II topoisomerases: enzymes that can unknot a topologically knotted DNA molecule via a reversible double-strand break. Cell 19, 697-707. Lomakin, A., Frank-Kamenetskii, M.D., 1997. 
A theoretical analysis of specificity of nucleic acid interactions with ohgonucleotides and peptide nucleic acids (PNAs). In preparation. Lukashin, A.V., Vologodskii, A.V., Frank-Kamenetskii. M.D., 1976. Comparison of different theoretical descriptions of helix-coil transition in DNA. Biopolymers 15, 1841-1844. 58 M.D. Frank-Kamenrtskii I Physics Reports 288 (1997) 13-60 Lukashin, A.V.. Vologodskii, A.V., Frank-Kamenetskii, M.D., Lyubchenko, Y.L., 1976. Fluctuational opening of the double helix as revealed by theoretical and experimental study of DNA interaction with formaldehyde. J. Mol. Biol. 108, 665-682. Lukashin, A.V., Beglov, D.B., Frank-Kamenetskii, M.D., 1991a. Comparison of different approaches for calculation of polyelectrolyte free energy. J. Biomol. Struct. Dyn. 8, 1113-I 118. Lukashin, A.V., Beglov, D.B., Frank-Kamenetskii, M.D., 199lb. Allowance for spatial dispersion of dielectric permittivity in polyelectrolyte model of DNA. J. Biomol. Struct. Dyn. 9, 517-523. Lyamichev, V.I., Mirkin, SM., Frank-Kamenetskii, M.D., 1985. A pH-dependent structural transition in the homopurinehomopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 3, 327-338. Lyamichev, V.I., Mirkin, SM., Frank-Kamenetskii, M.D., 1986. Structures of homopurine-homopyrimidine tract in superhelical DNA. J. Biomol. Struct. Dyn. 3, 6677669. Lyamichev, V.I., Mirkin, S.M., Frank-Kamenetskii, M.D., Cantor, CR., 1988. A stable complex between homopyrimidine oligomers and the homologous regions of duplex DNA. Nucl. Acids Res. 16, 2165-2178. Lyamichev, V.I., Frank-Kamenetskii, M.D.. Soyfer. V.N.. 1990. UV-protection of homopurine-homopyrimidine regions in DNA by pyrimidine oligonucleotides due to triplex formation. Nature 344, 5688570. Lyamichev, V.I.. Voloshin, O.N., Frank-Kamenetskii, M.D., Soyfer, V.N., 1991. Photofootprinting of DNA triplexes, Nucl. Acids Res. 19, 163331638. Lyubchenko, Y.L., Vologodskii, A.V., Frank-Kamenetskii, M.D., 1978. 
Direct comparison of theoretical and experimental melting profiles for (9 Xl74 DNA. Nature 271, 28-31. Mandal, C., Kallenbach, N.R., Englander, SW., 1979. Base-pair opening and closing reactions in the double helix. J. Mol. Biol. 135, 391-41 I. Manning, G.S., 1969. Limiting laws and counterion condensation in polyelectrolyte solutions. 1. Colligative properties. J. Chem. Phys. 51, 924-933. Manning, G.S., 1972. Polyelectrolytes. Ann. Rev. Phys. Chem. 23, 117-140. Marko, J.F., Siggia, E.D., 1994. Fluctuations and supercoiling of DNA. Science 265, 506-508. Marko, J.F., Siggia, E.D., 1995. Statistical mechanics of supercoiled DNA. Phys. Rev. E 52, 2912-2938. Marmur. J., Doty, P., 1962. Determination of the base composition of deoxyribonucleic acid from its thermal denaturation temperature. J. Mol. Biol. 5, 109-118. Mirkin, S.M., Lyamichev, V.I., Drushlyak, K.N., Dobrynin, V.N., Filippov, S.A., Frank-Kamenetskii, M.D., 1987. DNA H form requires a homopurinc-homopyrimidine mirror repeat. Nature 330, 4955497. Mirkin, S.M., Frank-Kamenetskii, M.D., 1994. H-DNA and related structures. Ann. Rev. Biophys. Biomol. Struct. 23, 541-576. Misra, V.K., Honig, B., 1996. The electrostatic contribution to the B to Z transition of DNA. Biochemistry 35, I I 15-I 124. Moser, H.E., Dervan, P.B., 1987. Sequence-specific cleavage of double helical DNA by triple helix formation. Science 238, 6455650. Nielsen, P.E.. Egholm, M., Berg, R.H., Buchardt, 0.. 1991. Sequence-selective recognition of DNA by strand displacement with a thymine-substituted polyamide. Science 254, 1497-l 500. Peliti, L. (Ed.), 1990. Biologically Inspired Physics. Plenum Press, New York. Poland, D., 1974. Recursion relation generation of probability profiles for specific-sequence macromolecules with longrange correlations. Biopolymers 13, 1859-l 87 I. Pulleyblank, D.E., Shure, D.E., Tang, D., Vinograd, J., Vosberg, H.-P., 1975. 
Action of nicking-closing enzyme on supercoiled and nonsupercoiled closed circular DNA: formation of a Boltzmann distribution of topological isomers. Proc. Natl. Acad. Sci. USA 72, 4280-4284. Rajagopal, P., Feigon, J., 1989. Triple-strand formation in the homopurine: homopyrimidine DNA oligonucleotides d(G-A)4 and d(T-C)?. Nature 339, 637-640. Ramanathan, G.V., Woodbury, C.P., 1982. Statistical mechanics of electrolytes and polyelectrolytes, II. Counterion condensation on a line charge. J. Chem. Phys. 77, 413334140. Rippe, K., Jovin, T.M., 1992. Parallel-stranded Duplex DNA. Methods Enzymol. 21 I, 1999220. Rybenkov, V.V., Cozzarelli, N.R., Vologodskii, A.V., 1993. Probability of DNA knotting and the effective diameter of the DNA double helix. Proc. Natl. Acad. Sci. USA 90, 5307753 11. Sali, A., Shakhnovich, E., Karplus, M., 1994. How does a protein fold? Nature 369, 2488251. M.D. Funk-Kunwnrtskiil Physics Reports 288 (1997) 13-60 59 SantaLucia, J., Allawi, H.T., Seneviratne, P.A.. 1996. Improved nearest-neighbor parameters for predicting DNA duplex stability. Biochemistry 35, 3555-3562. Schwartz, D.C., Cantor, CR., 1984. Separation of yeast chromosome-sized DNAs by pulsed field gradient gel elcctrophoresis. Cell 37, 67-75. Selingcr, J.V., Selinger, R.L.B., 1996. Theory of chiral order in random copolymers. Phys. Rev. Lett. 76, 58-61. Shao. Z., Yang, J., Somlyo, A.P., 1995. Biological atomic force microscopy: from microns to nanometers and beyond. Annu. Rev. Cell. Develop. Biol. 11, 241-265. Shaw, S.Y., Wang, J.C., 1993. Knotting of a DNA chain during ring closure, Science 260, 533-536. Shugalii, A.V., Frank-Kamenerskii, M.D., Lazurkin. Y.S., 1969. A viscosimetric study of the helix-coil transition in phage T2 DNA, Molek. Biol. 3, 133-145. Sinden, R.R.. 1994. DNA Structure and Function, Academic Press, San Diego. Smith, S.B., Finzi, L., Bustamante, C., 1992. 
Direct mechanical measurements of the elasticity of single DNA molecules by using magnetic beads, Science 258, 1 122-I 126. Smith, S.B., Cui, Y., Bustamante, C., 1996. Overstretching B-DNA: the elastic response of individual double-stranded and single-stranded DNA molecules, Science 27 I, 795-789. Soyfer, V.N., Potaman, V.V., 1996. Triple-helical Nucleic Acids, Springer, New York. Spenglcr, S.J., Stasiak, A.. Cozzarelli, N.R., 1985. The stereostructure of knots and catenanes produced by phage lambda integrative recombination: implications for mechanism and DNA structure Cell 42, 325-334. Stigter. D., 1977. Interactions of highly charged colloidal cylinders with applications to double-stranded DNA, Biopolymers 16, 1435-1448. Strick, T.R., Allemand, J.F., Bensimon, D., Bensimon, A., Croquette, V., 1996. The elasticity of a single supercoiled DNA molecule Science 271, 1835-1837. Taylor, W.H., Hagerman, P.J.. 1990. Application of the method of phage T4 DNA ligase-catalyzed ring-closure to the study of DNA structure. II. NaCl-dependence of DNA flexibility and helical repeat. J. Mol. Biol. 212, 363-376. Tomac, S., Sarkar, M., Ratilainen, T.. Wittung, P., Nielsen, P.E.. Norden, B., Graslund, A., 1996. Ionic effects on the stability and conformation of peptide nucleic acid complexes. J. Am. Chem. Sot. 24, 5544-5552. Tricot, M., 1994. Comparsion of experimental and theoretical persistence length of some polyelectrolytes at various ionic strengths. Macromolecules 17. 1698-1704. Vedenov, AA., Dykhne, A.M., Frank-Kamenetskii, A-D., Frank-Kamenetskii, M.D., 1967. A contribution to the theory of helix-coil transition in DNA. Molek. Biol. I, 313-319. Vedenov, A.A., Dykhne, A.M.. Frank-Kamenetskii. M.D., 1971. The helix-coil transition in DNA. Soviet Physics ~ Usp. 14, 715 -736. Veselkov, A.G., Demidov, V.V., Frank-Kamenetskii, M.D., Nielsen, P.E., 1996a. PNA as a rare genome-cutter. Nature 379, 214. Vesclkov, A.G., Demidov. V.V., Nielsen, P.E., Frank-Kamenetskii, M.D., 1996b. 
A new class of genome rare cutters. Nucleic Acids Res. 24, 2483-2488. Vologodskii, A.V., Lukashin, A.V., Frank-Kamenetskii, M.D., Anshelevich, V.V., 1974. The knot problem in statistical mechanics of polymer chains. Sov. Phys. JETP 39, 1059-1063. Vologodskii, A.V., Anshelevich, V.V., Lukashin, A.V., Frank-Kamenetskii, M.D., 1979a. Statistical mechanics of supercoils and the torsional stiffness of the DNA double helix. Nature 280, 294-298. Vologodskii, A.V., Lukashin, A.V., Anshelevich, V.V., Frank-Kamenetskii, M.D., 1979b. Fluctuations in superhelical DNA. Nucleic Acids Res. 6, 967-982. Vologodskii, A.V., Amirikyan, B.R., Lyubchenko, Y.L., Frank-Kamenetskii, M.D., 1984. Allowance for heterogeneous stacking in the DNA helix-coil transition theory. J. Biomol. Struct. Dyn. 2, 131-148. Vologodskii, A.V., Frank-Kamenetskii, M.D., 1982. Theoretical study of cruciform states in superhelical DNA. FEBS Lett. 143, 257-260. Vologodskii, A.V., Frank-Kamenetskii, M.D., 1984. Left-handed Z form in superhelical DNA: a theoretical study. J. Biomol. Struct. Dyn. 1, 1325-1333. Vologodskii, A.V., Frank-Kamenetskii, M.D., 1992. Modeling DNA supercoiling. Methods Enzymol. 211, 467-480. Vologodskii, A.V., Levene, S.D., Klenin, K.V., Frank-Kamenetskii, M.D., Cozzarelli, N.R., 1992. Conformational and thermodynamic properties of supercoiled DNA. J. Mol. Biol. 227, 1224-1243. von Hippel, P.H., Berg, O.G., 1986. On the specificity of DNA-protein interactions. Proc. Natl. Acad. Sci. USA 83, 1608-1612. Wada, A., Suyama, A., 1986. Local stability of DNA and RNA secondary structure and its relation to biological functions. Prog. Biophys. Mol. Biol. 47, 113-157. Wada, A., Yabuki, S., Husimi, Y., 1980. Fine structure in the thermal denaturation of DNA: high temperature-resolution spectrophotometric studies. CRC Crit. Rev. Biochem. 9, 87-144. Wang, J.C., 1996. DNA topoisomerase. Annu. Rev. Biochem. 65, 635-692.
Wang, A.H.-J., Quigley, G.J., Kolpak, F.K., Crawford, J.L., van Boom, J.H., van der Marel, G., Rich, A., 1979. Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature 282, 680-685. Wartell, R.M., Benight, A.S., 1985. Thermal denaturation of DNA molecules: a comparison of theory with experiment. Phys. Rep. 126, 67-107. Watson, J.D., Crick, F.H.C., 1953. Molecular structure of nucleic acids. Nature 171, 737-738. White, J.H., 1969. Self-linking and the Gauss integral in higher dimensions. Am. J. Math. 91, 693-728. Wittung, P., Nielsen, P.E., Buchardt, O., Egholm, M., Norden, B., 1994. DNA-like double helix formed by peptide nucleic acid. Nature 368, 561-563. Zimm, B.H., Le Bret, M., 1983. Counter-ion condensation and system dimensionality. J. Biomol. Struct. Dyn. 1, 461-471.