Function-Information Relationship in Nucleic Acids
Andrej Luptak
aluptak@uci.edu
UNIVERSITY of CALIFORNIA ‧ IRVINE

Information flow in biological systems

In vitro selection

How many solutions are there to a biochemical problem?

In vitro selected RNAs
Aptamers
  Organic dyes, amino acids, nucleotides, metabolites
  Aminoglycosides, peptides, proteins, liposomes
  Cells, tissues, single-walled nanotubes
  Transition state analogs
Ribozymes
  Phosphoryl (incl. polymerase), acyl and alkyl transfer
  Isomerisation, Diels-Alder, nucleotide synthesis, Michael
  Metal insertion into mesoporphyrin
  Metal-metal bond formation (palladium nanoparticles)

Informational complexity and functional activity
  How many solutions are there to a biochemical problem?
  How does one measure complexity?
  How does one measure structural complexity?
  And what does this have to do with evolution, biosensors and the origin of life?
Hazen et al. PNAS 2007, 104

How many solutions are there to a biochemical problem?
Isolation of high-affinity GTP aptamers from partially structured RNA libraries
Jonathan H. Davis and Jack W. Szostak, PNAS 2002, vol. 99, no. 18

How many solutions are there to a biochemical problem?
Informational Complexity and Functional Activity of RNA Structures
James M. Carothers, Stephanie C. Oestreich, Jonathan H. Davis, and Jack W. Szostak, J. Am. Chem. Soc. 2004, 126, 5130

How does one measure structural complexity?

How does one measure informational complexity?
Informational Complexity and Functional Activity of RNA Structures, James M.
Carothers, Stephanie C. Oestreich, Jonathan H. Davis, and Jack W. Szostak, J. Am. Chem. Soc. 2004, 126, 5130 & RNA 2006, 12, 4

How does one measure informational complexity?
Shannon uncertainty: $H = -\sum_{i \in \{A,U,G,C\}} P_i \log_2 P_i$
Information content = Max information - Shannon uncertainty
Max information using 4 bases = $\log_2 4$ = 2 bits

Informational complexity and functional activity
Invariant A: P(A)=0.997, P(C)=0.001, P(G)=0.001, P(U)=0.001
  H = -(-0.997*0.00433 - 3*0.001*9.966) = 0.00432 + 0.0299 = 0.0342
  IC = 2 - 0.0342 = 1.9658
Invariant A or G: P(A)=0.498, P(C)=0.002, P(G)=0.498, P(U)=0.002
  H = -(-2*0.498*1.006 - 2*0.002*8.965) = 1.002 + 0.036 = 1.038
  IC = 2 - 1.038 = 0.962
One position in a base-pair: IC = 1 bit (a base-pair is 2 bits)
One position in a regular or wobble pair: IC = 0.5 bit (1 bit per loose base-pair)

Another RNA aptamer example: adenosine aptamer

Class II ligase ribozyme
Pitt & Ferré-D’Amaré, J. Am. Chem. Soc., 2009, 131 (10), pp 3532–3540

Class II ligase ribozyme
Rapid Construction of Empirical RNA Fitness Landscapes
Jason N. Pitt and Adrian R. Ferré-D’Amaré, Science 2010, Vol. 330, no. 6002, pp.
376-379
  Evolution is an adaptive walk through a hypothetical fitness landscape
  A fitness landscape shows the relationship between genotypes and the fitness of each corresponding phenotype
  An empirical fitness landscape is determined for a catalytic RNA by combining next-generation sequencing, computational analysis, and “serial depletion,” an in vitro selection protocol
  Abundance in serially depleted pools correlates with biochemical activity
  MS = a4-11 master sequence of the ligase ribozyme

Class II ligase ribozyme
Changes in population structure during serial depletion (in vitro selection)

Class II ligase ribozyme
Correlation of genotype frequency and experimental rate constants
Histogram of correlation coefficients of kobs (n = 135 point mutants) with randomly reassorted mutation frequencies

Class II ligase ribozyme
Information content per position of the class II ligase ribozyme

In vitro selection of ribozymes
Optimized for single-turnover enzymes

In vitro selected ribozymes
Diels-Alderase: ribozyme, protein enzyme
Serganov et al.
Nature Structural & Molecular Biology 2005, vol. 12, pp 218-224
  Ligase (Bartel & Szostak, Science 1993)
  RNA polymerase (Johnston & Bartel, Science 2001)
  Polynucleotide kinase (Lorsch & Szostak, Nature 1994)
  Diels-Alderase (Agresti & Griffiths, PNAS 2005)
All of these multiple-turnover ribozymes were converted from single-turnover isolates

Informational complexity and functional activity: Peptides
Information content = Max information - Shannon uncertainty
Shannon uncertainty: $H = -\sum_{i \in \{\mathrm{Ala},\ldots,\mathrm{Trp}\}} P_i \log_2 P_i$
Max information using 20 amino acids = $\log_2 20$ = 4.3219 bits, or 1.301 dits (base 10)

Almost invariant glycine: P(Gly)=0.9981, P(Ala)=P(Arg)=P(Asn)=...=P(Val)=0.0001
  H = -(-0.9981*0.002744 - 19*0.0001*13.29) = 0.002739 + 0.02525 = 0.0280
  IC = 4.3219 - 0.0280 = 4.2939

Informational complexity and functional activity: Peptides
For k equally likely amino acids at a position, H = log2(k):

  # possible AAs   Shannon uncertainty (bits)   Information content (bits)
   1               0.0000                       4.3219
   2               1.0000                       3.3219
   3               1.5850                       2.7369
   4               2.0000                       2.3219
   5               2.3219                       2.0000
   6               2.5850                       1.7369
   7               2.8074                       1.5145
   8               3.0000                       1.3219
   9               3.1699                       1.1520
  10               3.3219                       1.0000
  11               3.4594                       0.8625
  12               3.5850                       0.7369
  13               3.7004                       0.6215
  14               3.8074                       0.5145
  15               3.9069                       0.4150
  16               4.0000                       0.3219
  17               4.0875                       0.2344
  18               4.1699                       0.1520
  19               4.2479                       0.0740
  20               4.3219                       0.0000

Peptide functions to consider:
  What’s the information content of a His-tag?
  What’s the information content of an HPQ streptavidin tag?
  What about two HPQ tags?
  A cystine bridge?
  What’s the information content of a hydrophobic position? And charged?
  What about a salt bridge?
  Small domains: zinc finger

Structure of the model peptide and of the residues incorporated at the guest position
Comparison of the enthalpy of helix formation Δhα obtained from different peptides
Copyright © 2005, The National Academy of Sciences
Richardson J. M.
et al. PNAS 2005;102:1413-1418

Informational complexity and functional activity: Peptide secondary structure

Beta-sheet formation propensity (from Minor & Kim, Nature 1994)
  High: Thr, Ile, Tyr, Phe, Val, Met, Ser
  Medium: Trp, Cys, Leu, Arg
  Low: Lys, Gln
  Negative propensity (sheet breakers): Gly, Pro
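
The per-position calculation used throughout these slides (information content = max information - Shannon uncertainty) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`shannon_uncertainty`, `information_content`, `equiprobable_table`) are hypothetical and not taken from the cited papers, and the probabilities are the ones quoted on the slides. The last function assumes that all allowed residues at a position are equally likely, which is the assumption behind the "# possible AAs" table.

```python
import math


def shannon_uncertainty(probs):
    """Shannon uncertainty H = -sum(p * log2(p)), in bits.

    Terms with p == 0 are skipped (they contribute nothing in the limit).
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_content(probs, alphabet_size):
    """Information content = log2(alphabet_size) - Shannon uncertainty, in bits."""
    return math.log2(alphabet_size) - shannon_uncertainty(probs)


def equiprobable_table(alphabet_size=20):
    """Rows of (k, H, IC) for k equally likely residues out of alphabet_size.

    Assumes a uniform distribution over the k allowed residues, so H = log2(k).
    """
    max_info = math.log2(alphabet_size)
    return [(k, math.log2(k), max_info - math.log2(k))
            for k in range(1, alphabet_size + 1)]


# Nucleotide positions (4-letter alphabet, max information = 2 bits).
invariant_a = [0.997, 0.001, 0.001, 0.001]   # H is about 0.034 bits
a_or_g = [0.498, 0.002, 0.498, 0.002]        # IC is about 0.962 bits

# Peptide position (20-letter alphabet): almost invariant glycine.
almost_invariant_gly = [0.9981] + [0.0001] * 19

if __name__ == "__main__":
    for name, probs, size in [
        ("invariant A", invariant_a, 4),
        ("A or G", a_or_g, 4),
        ("almost invariant Gly", almost_invariant_gly, 20),
    ]:
        h = shannon_uncertainty(probs)
        ic = information_content(probs, size)
        print(f"{name}: H = {h:.4f} bits, IC = {ic:.4f} bits")

    for k, h, ic in equiprobable_table(20):
        print(f"{k:2d}  {h:.4f}  {ic:.4f}")
```

Summing `information_content` over all positions of an aligned set of selected sequences would give a total for the motif under this independent-position model; covariation such as base-pairing needs the separate per-pair accounting described on the slides.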