Chemistry/biochemistry of nucleic acids

Chemical structure:
  nomenclature: A, G, C, T, N, R, Y, S, W, M, K, B, D, H, V, (U)
  heterocyclic bases, nucleosides, nucleotides
  phosphate group: α, β, γ position
  1 kb is about 0.33 µm long
  1 µm DNA is about 3 kb
  3,000,000 kb is about 1 m DNA
  symmetry:
    direct repeat
    inverted repeat
    palindrome
    'true palindrome'
  atomic modifications (the radioactive isotopes shown here are mostly β-emitters)
    isotopic labeling, stable or radioactive
      31P => 32P, 33P
      12C => 13C (stable), 14C
      14N => 15N (stable)
      H => T (3H)
    other changes
      P=O => P=S
      P-O- => P-S- (nb. remember tautomerism!)
      => S may be replaced by the radioactive 35S

Quantitation and characterization
  - UV absorption over the wavelength range 200-350 nm
  - characteristic spectrum for each of the nucleotides
  - composite spectrum has a maximum around 260 nm (homopolymers have different spectra)
      A260 = ε x c x b   (c = concentration, b = path length in cuvette, ε = molar extinction coefficient)
      => molar concentration of nucleotide: c = A260 / (ε x b)

Some chemical reactions
  internucleotide bonds of RNA are hydrolysed at alkaline pH
    => 2',3'-cyclic phosphate intermediate
    => total hydrolysis to 2'- and 3'-phosphate groups
  at acidic pH: N-glycosidic bonds in DNA are hydrolysed, A > G >> Y
    can further lead to strand scission (Burton depurination with diphenylamine)
  de-amination by sodium nitrite (NaNO2) at around pH 4.5 (via diazonium intermediates)
    adenosine => inosine (base = hypoxanthine)
    cytidine => uridine
    guanosine => xanthosine (base = xanthine)
    relative rates of 1:2:6 for G:A:C, respectively
  de-amination of C by bisulfite (NaHSO3)
  alkylation at N-7 of purines by e.g. dimethylsulfate
    => labile glycosidic bond => depurination
    also: alkylation of the phosphate moiety (acidic target!)
    => phosphotriester
    => subsequently, in strong alkali: either loss of the alkyl group or strand cleavage
       (in strongly acidic conditions: regeneration of the phosphodiester)
  RNA: end-group = cis-diol: oxidized by NaIO4 to a di-aldehyde
    => selective modification of the 3' terminus of RNA

Conformational features
  furanose puckering
  base-pairing
  stacking interactions: stronger between purines, weaker between pyrimidines
  hyperchromicity versus hypochromicity

Common DNA and RNA conformations
  A, B, Z, ...
  duplex helical structure and cruciform structures
  internal loop, bulge loop, hairpin loop (stem-loop structure)
  linear versus circular forms
  superhelical (supercoiled) DNA
    L = T + W   (L = linking number, T = twist, W = writhe)
    ΔL = ΔT + ΔW
    ΔT + ΔW = 0   (closed circle without strand breakage: L is constant)
  intercalation, e.g. with ethidium bromide: modifies helical (and hence also superhelical) topology
    - intercalation of one ethidium+ moiety unwinds the helix by 26°
    - Van der Waals contacts with the base-pairs above and below
    - increased fluorescence compared to the dye in solution
        UV (254 nm): absorbed by the bases and transmitted to the dye
        UV (302 & 366 nm): absorbed by the dye itself
        => fraction of the energy re-emitted at 590 nm
    - binding is reversible, but dissociation is very slow
    - removal of EtdBr: chromatography over a cation-exchange resin, or extraction with isopropanol or n-butanol
  other examples: actinomycin D, terpyridine, daunomycin, acridines, etc.
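Because the linking number L = T + W of a closed circle stays constant, the 26° unwinding per bound ethidium translates directly into a change of writhe. The sketch below is only an illustration of that bookkeeping. It assumes that all unwinding imposed by the dye is compensated by writhe, and the function names and the example plasmid are hypothetical.

```python
# Unwinding angle per intercalated ethidium, as stated in the notes.
UNWINDING_DEG_PER_DYE = 26.0


def twist_change(n_dye, unwinding_deg=UNWINDING_DEG_PER_DYE):
    """Change in twist T (in turns) caused by n_dye intercalated molecules."""
    return -n_dye * unwinding_deg / 360.0


def writhe_after_intercalation(writhe0, n_dye, unwinding_deg=UNWINDING_DEG_PER_DYE):
    """Writhe W of a closed circle after binding n_dye dye molecules.

    Assumption: L is constant, so delta_W = -delta_T.
    """
    return writhe0 - twist_change(n_dye, unwinding_deg)


def dyes_to_relax(writhe0, unwinding_deg=UNWINDING_DEG_PER_DYE):
    """Number of bound dye molecules that brings W to zero (needs writhe0 <= 0)."""
    if writhe0 > 0:
        raise ValueError("intercalation cannot relax positive supercoils")
    return abs(writhe0) * 360.0 / unwinding_deg


if __name__ == "__main__":
    w0 = -13.0  # hypothetical plasmid carrying 13 negative supercoils
    print("dyes needed to relax:", dyes_to_relax(w0))
    print("writhe with 250 dyes bound:", writhe_after_intercalation(w0, 250))
```

In this simplified picture a negatively supercoiled circle is first relaxed by the dye and then driven into positive supercoiling as more dye binds.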
Denaturation:
  - the double helix can be denatured in different ways: heat, pH (acid & alkali), chemicals such as urea (poorly), formamide, dimethylsulfoxide, etc., and proteins (single-strand binding proteins)
  - dissociation by heating is also named melting
  - progress of denaturation can be monitored by UV absorption spectroscopy
  - stability of the helix:
      - hydrogen bonds
      - stacking (π-π interactions) between the planar rings
      - cavitation energy (the parts of the molecule that cannot form H-bonds to water tend to form a single cavity => minimum cavitation energy)
  - hyperchromicity effect: disruption of stacking => 30 to 40 % increase of UV (260 nm) absorption
  - Tm: melting temperature
      - position in the melting profile where 50% is single-stranded
      - pseudo-monomolecular reaction: strands are not physically separated (but A-T rich zones 'melt' first) (dynamic equilibrium)
        => 'denaturation is concentration-independent'
      - linear relationship between Tm and %G+C of a duplex
      - physical separation requires temperatures far above Tm
  - fast chilling (ice-water!): nucleic acid remains single-stranded
  - slow cooling is necessary to enable the base-pairs (and stacking) to rebuild
    => 'renaturation requires energy!'

Renaturation:
  - renaturation is NOT simply the reversal of denaturation
  - collision of complementary strands is required
  - nucleic acid strands are negatively charged at the phosphate moiety => -1 per nucleotide: repulsion; shielding is required to allow strands to approach one another (use of Na+ or K+ salts)
  - four parameters in renaturation kinetics
      1) concentration of cations
      2) incubation temperature (usually 20 to 25 °C below Tm)
      3) DNA concentration (related to complexity of the DNA)
      4) size of the fragments
  - procedure
      - purified DNA is sheared to a uniform size (300 bp) (controlled shearing can yield sizes from 100 to 100,000 bp)
      - denaturation by brief heating at 100 °C (physical separation of the strands), then quick chilling (ice-water)
      - bring samples into buffer (e.g.
        0.15 M of monovalent salt, around pH 7)
      - follow the extent of renaturation as a function of time
          => hypochromicity (instrumentation at high temperatures!)
          => sensitivity to nuclease S1
               - TCA precipitation of ds-DNA
               - fragmented ss-DNA is not precipitated
               - filtering the precipitate
          => use of hydroxyapatite (binds ds) or cellulose nitrate (binds ss)
  - kinetics
      - bimolecular, second-order reaction
      - concentration of (single-stranded) DNA = C; initial concentration = C0
          -dC/dt = k C C'   or   = k C^2 (if original, denatured DNA is used)
          (minus sign because this is the rate of disappearance of single strands)
        hence:
          -dC/C^2 = k dt
        integration from the initial values C = C0 and t = 0:
          1/C - 1/C0 = k t   and   C/C0 = 1/(1 + k C0 t)
        C/C0 is the fraction which is still single-stranded (hence a value between 1 and 0)
        when half of the DNA is renatured, C/C0 is 1/2 and t = t½, so the formula becomes
          C0 t½ = 1/k   or   k = 1/(C0 t½)
        Thus the value of k can be derived experimentally from the reassociation curve. This value depends on cation concentration, temperature, fragment size, etc.
      - C0t came to be pronounced 'Cot' (hence Cot curve), and is also written as Cot.
      - Similarly, reassociation kinetics with RNA is analysed as Rot curves.
  - Genomes may contain
      - unique sequences (single copy)
      - moderately repeated sequences
      - highly repetitive DNA
  - Cot analysis allows characterisation of sequence complexity in terms of different subclasses of sequences, dependent on degree of repetitivity, AND fractionation of the different sets.
    remember: on a log scale, an ideal single-component curve runs from about 10% to about 90% reassociated between log(Cot½) - 1 and log(Cot½) + 1, i.e. over two decades of Cot

Further notes:
  - renaturation between different molecules, RNA or DNA, is possible.
    Reassociation of an RNA strand to a (complementary) DNA strand was named "hybridization" (forming the hybrid duplex), but this name was soon used for any combination (DNA-RNA, RNA-RNA or DNA-DNA)
  - analysis of reassociation kinetics is usually done in solution
  - renaturation is also possible in a two-phase system with one strand immobilised (target) and the other strand in solution. The strand in solution is labelled radioactively (or tagged with chemical groups that can be detected specifically) and is named the 'probe' (this naming does not hold for DNA chip technologies!)
  - qualitative and quantitative comparison of nucleic acid sequences is feasible.
    => the kinetics of renaturation may be quite different, but Tm values still apply.

Tm has the form y = a + bx, where x is the G+C percentage and y is Tm
  - the slope b depends on the kind of duplex
  - the constant a depends on, among others, salt conditions
  - Tm can be used to define the stringency of hybridisation
      - stringency expresses conditions that favour/disfavour formation of H-bonds
      - solution of high stringency: disfavours H-bonds
      - solution of low stringency: favours H-bonds
      - at high stringency: a high degree of base complementarity is necessary to keep the association between probe and template
      - at low stringency: association is obtained, even at a low degree of base complementarity
  - Tm as initially described by Marmur (for G+C between 30 and 75%):
      Tm = 69.3 + 0.41 [%GC]   in 0.15 M NaCl + 0.015 M Na-citrate, pH 7.0
      Tm = 53.9 + 0.41 [%GC]   in 0.015 M NaCl + 0.015 M Na-citrate, pH 7.0
      (slope unchanged, but the value of the constant a decreases)
  - contribution of ionic concentration to the value of a, as adapted by Wetmur:
      Tm = 81.5 + 16.6 log[M+] + 0.41 [%GC]   if [M+] < 0.4 M
    and then refined to
      Tm = 81.5 + 16.6 log{[M+]/(1+0.7[M+])} + 0.41 [%GC]   for [M+] up to 1 M NaCl
  - specific problems are created when not all positions in the DNA strands are complementary to the opposite strand (named mismatch positions)
    => mismatches destabilize the duplex: in first approximation 1%
    mismatch gives a decrease in Tm of 1 °C.
  - denaturing agents in the buffer also decrease Tm
    => formamide causes a decrease of 0.63 °C per percent
  - for shorter strands, the size of the fragment starts influencing Tm
    => a term of (around) -500/n may be added

Thus, for DNA-DNA:
    Tm (°C) = 81.5 + 16.6 log{[M+]/(1+0.7[M+])} + 0.41 [%GC] - F - 500/n - P
    where F is 0.63 x % formamide, P is % mismatch, and n is the size (in base pairs) of the double-stranded segment
for RNA-RNA, the formula becomes
    Tm (°C) = 78 + 16.6 log{[M+]/(1+0.7[M+])} + 0.7 [%GC] - F - 500/n - P
    where F is 0.35 x % formamide
for RNA-DNA hybrids, the formula becomes
    Tm (°C) = 67 + 16.6 log{[M+]/(1+0.7[M+])} + 0.8 [%GC] - F - 500/n - P
    where F is 0.5 x % formamide
=> the slopes are different, as is the influence of formamide and of factors other than ionic conditions.

Exercise: when is an RNA-DNA duplex more stable than the corresponding DNA-DNA duplex?

With oligonucleotides, these formulas are not applicable, because Tm is now a measure of a bimolecular reaction (denaturation dissociates the oligonucleotide from the template). Tm is calculated from thermodynamic values, based on nearest neighbors:
    Tm (°C) = T° ΔH° / (ΔH° - ΔG° + R T° ln[C]) - 269.3 + [salt formula as above]
    (ΔH° and ΔG° are summed up from a number of separate contributions)
    (an oligonucleotide of N nucleotides consists of N-1 nearest neighbors)
    (each mismatch reduces this number by 2)

For a long period of time, the 'rule of Wallace' was very often applied:
    Td = (2 x nAT) + (4 x nGC)
but this "rule" is strictly limited to sizes around 17 nucleotides, and hence has been abused very often. This formula should now be considered obsolete.
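The three long-duplex formulas above differ only in the constant, the %GC slope and the formamide coefficient, so they can be evaluated with one small function. The following is a minimal sketch, not a validated tool: the function and parameter names are hypothetical, the coefficients are copied from the formulas in these notes, and the logarithm is taken as base 10.

```python
import math

# duplex type -> (constant a, slope for %GC, formamide coefficient), as given in the notes
DUPLEX_PARAMS = {
    "DNA-DNA": (81.5, 0.41, 0.63),
    "RNA-RNA": (78.0, 0.70, 0.35),
    "RNA-DNA": (67.0, 0.80, 0.50),
}


def salt_term(cation_molar):
    """16.6 log{[M+]/(1+0.7[M+])}, with [M+] in mol/l."""
    if cation_molar <= 0:
        raise ValueError("cation concentration must be positive")
    return 16.6 * math.log10(cation_molar / (1.0 + 0.7 * cation_molar))


def tm_long_duplex(duplex, gc_percent, cation_molar, length,
                   formamide_percent=0.0, mismatch_percent=0.0):
    """Tm in degrees C for a long duplex (not for oligonucleotides)."""
    a, slope, formamide_coeff = DUPLEX_PARAMS[duplex]
    return (a
            + salt_term(cation_molar)
            + slope * gc_percent
            - formamide_coeff * formamide_percent   # F
            - 500.0 / length                        # size correction
            - mismatch_percent)                     # P: 1 degree per % mismatch


def wallace_td(n_at, n_gc):
    """'Rule of Wallace'; shown only for comparison, the notes call it obsolete."""
    return 2 * n_at + 4 * n_gc


if __name__ == "__main__":
    # Compare duplex types over a range of %GC (1 M salt, 500 bp, no formamide).
    for gc in (30, 40, 50, 60):
        dna = tm_long_duplex("DNA-DNA", gc, 1.0, 500)
        hybrid = tm_long_duplex("RNA-DNA", gc, 1.0, 500)
        print(gc, round(dna, 1), round(hybrid, 1))
```

Running the comparison loop over several %GC values, with and without formamide, is one way to explore the exercise posed above.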