Structure and Formation of β-Sheet Fibers

by

Davide M. Marini

Laurea, Industrial Engineering

Politecnico di Milano, Italy, 1995

Submitted to the Department of Mechanical Engineering in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Mechanical Engineering at the

Massachusetts Institute of Technology

September 2003

© Massachusetts Institute of Technology, 2003. All rights reserved.

Signature of Author: Department of Mechanical Engineering, August 25, 2003

Certified by: Roger D. Kamm, Professor of Mechanical Engineering and Biological Engineering, Thesis Supervisor

Accepted by: Ain A. Sonin, Professor of Mechanical Engineering, Chairman, Committee for Graduate Studies

Structure and Formation of β-Sheet Fibers

by

Davide M. Marini

Submitted to the Department of Mechanical Engineering on August 25, 2003, in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Mechanical Engineering

Abstract

The spontaneous organization of protein monomers into fibers, bundles and networks is central to biology. Inspired by the repetitive patterns found in the amino acid sequence of natural fibrous proteins, short peptides have been designed to self-assemble in aqueous solution into gelatinous matrices. Such hydrogels appear under the electron microscope as networks of finely interwoven fibers and have demonstrated great potential for tissue engineering applications. This thesis addresses the experimental and theoretical characterization of this self-assembly process at the molecular scale.

Structural characterization of single fibers from these hydrogels was performed to elucidate their molecular architecture and self-assembly mechanism. In the case of the molecule mainly studied in this research (KFE8), atomic force microscopy and quick-freeze/deep-etch revealed that mature fibers are formed through intermediate steps in which the fiber structure (a left-handed helical ribbon of diameter ~7 nm and helical pitch ~19 nm) differs markedly from its final form (tubular). These results, in conjunction with molecular dynamics simulations and circular dichroism, suggest that such intermediates consist of two helical β-sheet layers that sandwich hydrophobic side chains between them. Small variations in amino acid side chains were also found to have substantial effects on fiber structure and self-assembly.

The kinetics of matrix self-assembly were elucidated by means of an analytical treatment and numerical simulations. A Smoluchowski-type mean-field description of fiber nucleation and growth was developed to relate average fiber length to the fundamental rate constants of the process: in the limit of the elongation rate constant being much larger than the nucleation rate constant, the average fiber length was found to be proportional to the square root of their ratio.

A Brownian dynamics model was also developed, based on explicit description of particle motion, allowing simulation of fiber self-assembly in conditions where diffusion becomes a limiting factor. The average fiber length computed from simulations was shorter than the one predicted analytically. In order to explain this discrepancy, a scaling argument was developed that takes into account the inhomogeneities introduced in the system by the process of self-assembly.

Thesis Supervisor: Roger D. Kamm

Title: Professor of Mechanical Engineering and Biological Engineering

Acknowledgements

The years I have spent at MIT changed my life forever. The passion, intelligence and humbleness of the friends I made here opened my mind to a deeper sense of wonder for the world and for human nature. At times I felt almost consumed by the pace of learning one can experience here; nevertheless, as expressed by my favorite Italian poet, drowning in this sea of knowledge was sweet. I am indebted to many people for this experience.

I am profoundly grateful to my advisor, Roger Kamm, for his guidance, patience and support during these years, and for allowing me the freedom to pursue new research directions. This thesis would not have been possible without his inquisitive and open-minded attitude. Judging by the amount of time he dedicated to helping me grow both professionally and personally, I would conclude that I was his only student, though I am sure all his other students feel the same.

I am greatly indebted to Shuguang Zhang for his generosity, insight and optimism. His constant support and our endless conversations brightened my days in the lab; his positive attitude was very precious when experiments did not work as expected. I also wish to gratefully acknowledge Dr. Joel Schnur, of the Naval Research Labs, for agreeing to be on my committee. I am especially grateful for his enthusiasm, honest criticism and patience. He has been a fatherly figure for me: "un generale dal cuore d'oro". I wish to thank Prof. L. Mahadevan for his invaluable insight and suggestions during the course of this research and for helping me understand the importance of asking the right questions.

Mark Bathe has been essentially a brother for me during all these years. From the PhD qualifying exams to my defense, Mark was a constant, essential source of support. Our endless conversations and his blunt attitude helped me greatly to clarify my thinking. I thank him deeply for being so supportive and generous. I also wish to gratefully acknowledge his father, Prof. Klaus-Jürgen Bathe, and mother, Dr. Zorka Bathe, for adopting me as a new son in their family: being invited for dinner so often was extremely helpful, especially when I felt homesick.

Wonmuk Hwang helped me in this thesis to a degree that cannot be overstated: his contribution to the analytical and numerical sections was essential. Michael Caplan taught me the importance of dividing a problem into simpler components and how to operate in a chemistry lab. I also thank him for our innumerable conversations and his willingness to help me at any time.

I am very grateful to Prof. Patrick Doyle for his essential help in the development of the Brownian dynamics code. I am indebted to Kimberly Hamad-Schifferli for all the times I walked in her office with a question and promptly found help. I thank Prof. Jonathan King for instilling in me the passion for the protein folding problem and Prof. David Gossard for his generous help in discussing career decisions with me. I am also grateful to Prof. George Benedek for our discussions on self-assembly and amyloid formation and to Prof. Haiyan Gong for her help with the electron microscope. I thank Elizabeth Shaw for her patience when teaching me how to use the atomic force microscope.

My friends and family also contributed immensely to my happiness during these years and I wish to acknowledge them here. I thank Amalia Branca for understanding my dreams and for inspiring me to follow them. Andrew and Cecile Schiermeier gave me the strength and self-confidence I needed to take a plunge in the world of research and pursue a PhD. I thank my advisor at Northwestern University, Prof. Sandro Mussa-Ivaldi, and his postdoctoral student Vittorio Sanguineti for encouraging me to apply to graduate school. I am indebted to Patrizia Canziani for her support of my choice to leave investment banking and I thank Bertrand des Pallieres and Andrea Vella of J. P. Morgan for understanding this choice.

I deeply thank Gaetano Bertoldi, Michele Zanolin and Federico Frigerio, whose friendship helped me overcome the difficulties of being away from my beloved Italy. I am very grateful to John and Dawn Demerly for the time they dedicated to making me feel at home. I wish to thank Leslie Regan for her incredible efficiency and kindness during all my years at MIT. I thank Jeffrey Ruberty, Darryl Overby, Barbara Ressler, Jeremy Teichman, Thomas Heldt, Pirouz Kavehpour, Ryan and Catherine Jones, Franz Heukamp, Roberto Girelli, Valeria Valli, Olga Vayena, Waleed Farahat, Mayssam Ali, Michael Sachinis, Francesca Gasparini, Roberto Accorsi, Andrea Kraay, Walter Lironi, Giovanni Bonfanti, Alessandro Araldi, Birgit Schoeberl, Nikola Kojic, Steve Santoso, Carlos Semino, John Kisiday, Melody Swartz, Sile O'Modhrain, Constance Parvey, Neda Vukmirovic, Andrea Gabrielle, Lourdes Nuñez de Prado, Monica Boselli, Nicola Tegoni, Peter Mack, Gina Kim, Jennifer Blundo, Ahmad Khalil, Helene Karcher, Susanna Baker, Tulika Khemani, Veronica Asnaghi, Walter Rantner, Andrea and Luigi Adamo, Anna Custo, Gaia Colombo, Kerri and Joshua Marmol, Jan Lammerding, Borja Larrain, Francisco Marty and Shannon Fanning for so many great discussions about life. I am indebted to Fr. William Brown for his moral support and to Fr. Tadeusz Pacholczyk for answering all my questions about philosophy of science. The energetic and inspiring conversations I had with Ngon Dao and Stephanie Popp on entrepreneurship, ethics and good food changed my model of the world.

Finally, I wish to thank my family for giving me unconditional love and support. I will be forever grateful to my sister Amneris for teaching me to be unafraid of asking questions and for inspiring my love of nature. I thank my brother Valeriano for encouraging me to always aim higher. I dedicate this thesis to my parents, Evaristo Marini and Giulia Rinaldi, for their unswerving dedication to prepare a better future for their children.

CONTENTS

CHAPTER 1 INTRODUCTION
  1.1 Self-assembly
  1.2 Self-assembling peptides
    1.2.1 Microstructure of self-assembled hydrogels
    1.2.2 Tissue engineering applications
    1.2.3 Gelation mechanism
    1.2.4 Molecular structure of the fibers
    1.2.5 Rational design of peptide matrices
  1.3 Thesis outline
  References for Chapter 1

CHAPTER 2 CHARACTERIZATION OF β-SHEET FIBERS FORMED BY SELF-ASSEMBLING PEPTIDES
  2.1 Introduction
  2.2 Experimental methods
    Peptide synthesis and sample preparation
    Atomic force microscopy (AFM)
    Quick-freeze / deep-etch (QFDE)
    Circular dichroism (CD)
    Image analysis
    Other methods
  2.3 Results
    2.3.1 Structure of self-assembled fibers in aqueous solution
    2.3.2 Fiber length distribution
    2.3.3 Secondary structure formation
    2.3.4 Self-assembly at low concentration
    2.3.5 Effect of temperature
    2.3.6 Self-assembly at neutral pH
    2.3.7 Influence of synthesis method
    2.3.8 Effect of solvent composition
    2.3.9 Deposition onto graphite
    2.3.10 Effect of side chain variation
    2.3.11 Variation in molecular pattern
  2.4 Discussion
    2.4.1 Structures formed by KFE8
    2.4.2 Molecular dynamics simulations for KFE8
    2.4.3 Transition from ribbons to tubules
    2.4.4 Concentration and temperature dependence
    2.4.5 Molecular structure variation
    2.4.6 Network formation
    2.4.7 Considerations from other areas of research
  2.5 Suggested directions for future research
  2.6 Conclusion
  References for Chapter 2

CHAPTER 3 A MEAN-FIELD DESCRIPTION OF FIBER SELF-ASSEMBLY
  3.1 Introduction
  3.2 Smoluchowski coagulation theory
  3.3 Derivation of average fiber length
  3.4 Discussion
  3.5 Conclusion
  References for Chapter 3

CHAPTER 4 SIMULATION OF FIBER SELF-ASSEMBLY
  4.1 Introduction
  4.2 Simulation model
  4.3 Simulation methods
    β-sheet nucleation
    Brownian dynamics
    Interaction potential
    Dimensional analysis
    Integration
    Initial and boundary conditions
    Testing the algorithm
    System size
  4.4 Results and discussion
    Kinetics of fiber growth
    Average fiber length
    Time to equilibrium
    Comparison with mean-field predictions
  4.5 Suggested directions for future research
  4.6 Conclusion
  References for Chapter 4

CHAPTER 5 CONCLUSIONS

APPENDIX THE SIMULATION CODE
  Header file (header.h)
  Main code (main.cpp)
  Initialization file (init.cpp)
  Force computation file (forces.cpp)
  Random number generator (rng.cpp)
  Visualization file (disp.vmd)

CHAPTER 1 INTRODUCTION

Living organisms rely on sophisticated materials, such as bone, cartilage and the various types of extracellular matrix tissue, to provide a scaffold for resident cells and to maintain physical integrity. Such materials are characterized by several levels of structural organization, from single molecules upwards, that can result in an extraordinary range of mechanical properties (Figure 1-1). The capacity of certain proteins to self-assemble into fiber networks is essential for building these complex biomaterials.

[Figure 1-1 image labels: collagen triple helix; hydroxyapatite crystals; bone-marrow stem cell; extracellular bone matrix. Scale bars: 10 nm, 500 nm, 100 µm.]

Figure 1-1. An example of a natural material whose properties depend on its nanoscale structure. Bones are characterized by many scales of structural organization: collagen triple helices are assembled through precise alignment and folding of three polypeptide chains; such triple helices in turn self-assemble into bundles which act as templates for the crystallization of hydroxyapatite (from reference 1).

This thesis focuses on a class of short artificial proteins (peptides) designed to self-assemble in aqueous solution into networks of nanoscale fibers. These self-assembling peptides are interesting for at least three reasons:

(1) The fibrous matrix generated by these peptides can be used as a scaffold for cell attachment and has shown great potential in tissue engineering applications.

(2) The ability of these molecules to build a network of fibers is remarkably similar to that of more complex natural proteins, such as actin and tubulin. Given their relatively simple molecular structure (they are usually 8 to 12 amino acids long), it is not inconceivable that the mechanism of their self-organization be understood and harnessed for biological engineering purposes.

(3) The fibers formed by these peptides have the same basic architecture (β-sheet) as the ones characterizing Alzheimer's disease, Parkinson's disease and other neurodegenerative disorders. Elucidating the mechanism by which certain peptides form β-sheet fibers will advance our understanding of these conditions.

The goals of this thesis are:

(a) to elucidate the structure of the fibers self-assembled from these peptides and the mechanism of their formation;

(b) to develop predictive tools for the rational design of peptide-based biomaterials;

(c) to employ these tools in a particular example of peptide self-assembly.

1.1 Self-assembly

Self-assembly can be defined as the spontaneous organization of individual components into an ordered structure without human intervention [2]. The nucleation and growth of crystals can be considered a simple example of self-assembly [3]. Much more sophisticated instances are observed in biology: from protein folding [4,5] to embryo development [6,7], living organisms rely extensively on self-assembly to generate complex, multicomponent structures. Life itself emerged on earth possibly through a process of self-assembly [8]. This process is also a compelling route to the fabrication of small structures that could not be produced by traditional manufacturing; it is therefore important to understand the principles governing self-assembly and the strategies used by living organisms to harness its potential.

1.2 Self-assembling peptides

The hierarchical organization of protein monomers into long filaments, bundles and networks is a self-assembly process of central importance in nature. Inspired by the repetitive patterns found in the sequence of natural fibrous proteins (such as collagen [9] and spider silk [10]), short peptides have been designed based on specific patterns of alternating hydrophobic and hydrophilic amino acids to self-assemble in aqueous solution into fibrous matrices [11-17].

An example of a self-assembling peptide is given in Figure 1-2: this molecule (named KFE8) is designed with a double repetition of the amino acid sequence FKFE, where F is phenylalanine (hydrophobic), K is lysine and E is glutamic acid (both charged at pH 7). In this particular conformation (called a β-strand) all hydrophobic side chains lie on one side, opposite to the hydrophilic ones, making the molecule amphiphilic.
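
The amphiphilic pattern described above can be checked with a few lines of Python. This is only an illustrative sketch and not part of the original work: the function and dictionary names are hypothetical, the residue classes are those stated in the text, and the unit charges assigned to K (+1) and E (-1) at pH 7 are standard textbook values. The idealized rule that successive side chains of an extended β-strand point to opposite sides of the backbone is also assumed.

```python
# Residue classes as stated in the text: F is hydrophobic, K and E are charged at pH 7.
RESIDUE_CLASS = {"F": "hydrophobic", "K": "charged", "E": "charged"}

# Assumed side-chain charges at pH 7 (textbook values, not taken from the thesis).
SIDE_CHAIN_CHARGE = {"F": 0, "K": +1, "E": -1}


def build_peptide(repeat: str, n_repeats: int) -> str:
    """Concatenate n_repeats copies of a sequence motif (e.g. FKFE)."""
    return repeat * n_repeats


def alternates(sequence: str) -> bool:
    """True if hydrophobic and charged residues strictly alternate."""
    classes = [RESIDUE_CLASS[r] for r in sequence]
    return all(a != b for a, b in zip(classes, classes[1:]))


def faces(sequence: str) -> tuple[str, str]:
    """Split residues into the two faces of an idealized extended strand,
    assuming successive side chains point to opposite sides."""
    return sequence[0::2], sequence[1::2]


def net_side_chain_charge(sequence: str) -> int:
    """Sum of the assumed side-chain charges at pH 7."""
    return sum(SIDE_CHAIN_CHARGE[r] for r in sequence)


kfe8 = build_peptide("FKFE", 2)  # double repeat of FKFE
face_a, face_b = faces(kfe8)
print("sequence:", kfe8)
print("alternating pattern:", alternates(kfe8))
print("face A:", face_a, "| face B:", face_b)
print("net side-chain charge at pH 7:", net_side_chain_charge(kfe8))
```

Under these assumptions one face carries only phenylalanine and the other only lysine and glutamic acid, which is the amphiphilic arrangement the text describes.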

Figure 1-2. Molecular model of a self-assembling peptide (KFE8) designed with the amino acid sequence FKFE-FKFE (F: phenylalanine, K: lysine, E: glutamic acid). Carbon is light blue, nitrogen is blue, oxygen is red and hydrogen is white. F is hydrophobic (phenyl rings), K and E are both charged at pH 7. The molecule is portrayed in β-strand conformation.

The molecule KFE8 belongs to a class of peptides, called self-complementary, originally discovered by Shuguang Zhang [11]. Previous studies had shown that polypeptide chains characterized by alternating hydrophobic and hydrophilic amino acids tend to form β-sheet structures [18]. By exploiting this tendency, and by drawing inspiration from a sequence observed in a DNA-binding protein, Zhang realized that a self-complementary pattern in the charged side chains (+ + -, ++ ++ -, etc.) promotes the formation of highly stable matrices (hydrogels). By design of appropriate amino acid sequences, it is therefore possible to induce peptides to self-assemble into macroscopic structures.

1.2.1 Microstructure of self-assembled hydrogels

When dissolved in water, even at concentrations as low as 1% wt, self-assembling peptides form gelatinous structures. These gels appear under the electron microscope as networks of finely interwoven, nanometer-scale fibers (Figure 1-3). The typical elastic modulus of these membranes is of order 1,000 Pa [13] and the matrix pore size is approximately 100 nm.

Figure 1-3. Transmission electron micrograph of the hydrogel formed by the self-assembling peptide KFE8. Scale bar is 200 nm (courtesy of M. Caplan, PhD Thesis 2001).

1.2.2 Tissue engineering applications

One of the most promising applications of these self-assembled hydrogels is in tissue engineering. Zhang and coworkers demonstrated that these matrices support attachment and growth of a wide variety of mammalian cells. Neuronal cells, for example, have been shown to grow axons and to form active synapses within the matrix [14]. Chondrocytes can grow within the hydrogel and develop an extracellular matrix similar to cartilage [16].

The reason why these gels show such favorable interaction with cells is not completely understood. The fragility of the matrix (held together by non-covalent bonds) is believed to be one of the reasons: cells may easily displace the fibers and produce their own extracellular matrix. These hydrogels can therefore be used as three-dimensional scaffolds to guide cell growth.

Peptide hydrogels possess many advantages over traditional biomaterials (such as collagen-derived matrices or polyethylene oxide). (a) The fiber diameter and the pore size of the matrix are much smaller than the typical dimensions of cells, which therefore perceive the matrix as a true three-dimensional environment. (b) Unlike collagen-based fibers, which are animal-derived, these materials are produced via synthetic chemistry: the risk of harboring pathogens is thus eliminated. (c) The molecular building blocks can in principle be designed to generate fiber networks of pre-specified properties, tailored to interact specifically with various cell lines.

1.2.3 Gelation mechanism

Several factors are known to promote self-assembly, although the mechanism by which these peptides coalesce to form a network is not clearly understood. It is known, for example, that the process occurs instantaneously upon addition of a sufficient concentration of electrolytes in solution. Other factors promoting fiber stability are: hydrogen bonds between peptide backbones, ionic bonds between oppositely charged side chains, hydrophobic interactions and possibly coordination bonds mediated by ions in solution [14].

Caplan et al. explained the pH dependence of the self-assembling behavior of the molecule KFE12 (a triple repeat of the sequence FKFE) in terms of the Derjaguin-Landau-Verwey-Overbeek (DLVO) theory [15]. They hypothesized that self-assembly of this molecule is promoted by the hydrophobic effect, but hindered by electrostatic repulsions (the molecule is positively charged at low pH). Molecules should then remain unassembled if the electrostatic repulsion dominates over the hydrophobic attraction. When charges on the molecules are screened through the addition of negative ions, the hydrophobic effect dominates and molecules self-assemble. A similar reasoning predicts that self-assembly should occur when the solution pH is such that molecules carry zero net charge. All these predictions were confirmed and data agreed quantitatively with the theory.
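
The charge argument above can be illustrated with a rough Henderson-Hasselbalch estimate of the net charge of KFE12 as a function of pH. This is a sketch under stated assumptions, not the calculation of reference 15: the side-chain pKa values (about 10.5 for lysine and 4.25 for glutamic acid) are generic textbook figures rather than values measured for this peptide, the termini are assumed to be uncharged, the ionizable groups are treated as independent, and the function name is hypothetical.

```python
# Assumed generic side-chain pKa values (textbook figures, not measured for KFE12).
PKA_LYS = 10.5
PKA_GLU = 4.25

KFE12 = "FKFE" * 3  # triple repeat of FKFE, as described in the text


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net side-chain charge of a peptide at a given pH.

    Assumes independent ionizable groups and uncharged termini.
    """
    n_lys = sequence.count("K")
    n_glu = sequence.count("E")
    positive = n_lys / (1.0 + 10.0 ** (ph - PKA_LYS))  # protonated lysines
    negative = n_glu / (1.0 + 10.0 ** (PKA_GLU - ph))  # deprotonated glutamates
    return positive - negative


for ph in (2.0, 4.0, 7.0, 10.0, 12.0):
    print(f"pH {ph:4.1f}: estimated net charge {net_charge(KFE12, ph):+.2f}")
```

With these assumed values the estimate is strongly positive at low pH, in line with the text, and changes sign midway between the two pKa values. The actual pH of zero net charge depends on the true pKa values in the assembled environment, which this sketch does not attempt to model.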

1.2.4 Molecular structure of the fibers

A model was proposed in the original publication [11] to explain how molecules are arranged within the fibers. In this model (Figure 1-4) a series of ionic bonds and hydrophobic forces holds the fiber together along its axis. Molecules may be staggered, conferring stability to the fiber.

Figure 1-4. Original model of molecular interaction in the peptide fibers (adapted from reference 11). The amino acid sequence in these molecules follows the pattern H+H+H-H-, an alternation between hydrophobic and charged side chains. When in β-strand conformation, such molecules can be represented as amphiphilic bricks. Green areas represent hydrophobic side chains. The fiber axis runs horizontally in the picture. Several layers of this arrangement may stack perpendicular to the page, held together by backbone hydrogen bonds.

1.2.5 Rational design of peptide matrices

The properties of the matrix self-assembled from these peptides depend on a wide array of factors, some of which are related to solution conditions, while others are intrinsic to the structure of the molecules. For example, the presence of a sufficient concentration of electrolytes in solution is known to increase the rate of matrix formation [11-16]. The degree of hydrophobicity of the side chains has also been found to promote gelation [19,20]. Moreover, the number of pattern repeats in the molecule can change the stiffness of the formed matrix by orders of magnitude [19]. All these results were obtained by measuring the properties of the matrix in the bulk.

In general, understanding how molecular interactions give rise to macroscopic properties is a fundamental step in the design of novel materials [21]. The case of these peptides is particularly interesting, due to their potential in biological engineering applications. If properties of these materials could be designed at the level of single fibers, it is not inconceivable that biomaterials with unprecedented characteristics could be produced. For example, one can imagine fibers that degrade depending on the stage of development of embedded cells; or tubular fibers that release specific cell-signaling molecules upon sensing a change in the environment. The potential of β-sheet peptides in the electronics industry has also been demonstrated: by exploiting the ability of these molecules to self-organize into nanometer-scale fibers, it was possible to cast nanometer-scale metallic wires [22]. In principle, through knowledge of the self-assembly mechanism and by design of the building blocks, it should be possible to design materials that respond to their environment at the molecular scale.

1.3 Thesis outline

The principal motivation for this research is the development and demonstration of predictive tools for the design of self-assembled peptide matrices for use in tissue engineering.

The following is an outline of the approach taken to achieve this goal.

The macroscopic properties of these networks must depend on the mechanical properties of the single fibers and on fiber-fiber interactions. In order to infer design principles for these biomaterials, it is important to elucidate both effects. In this thesis we chose to elucidate how single fibers are assembled from their building blocks for two main reasons. (1) Compared to our understanding of the factors promoting self-assembly, our understanding of how molecules are arranged in the fibers is much less advanced. Yet, the details of molecular packing are likely to determine fiber structure and properties. (2) Elucidating how biomolecules interact with one another to build more complex structures may have far-reaching consequences. For example, self-assembly may be used as a compelling alternative to traditional fabrication in the production of nanoscale structures. Moreover, the development of biomaterials from single molecules upwards may allow the creation of fibers that direct the growth of cells or interact intelligently with them.

The first step towards elucidating how fibers are assembled is their structural characterization. The resolution of electron micrographs of the gel obtained by traditional TEM (Figure 1-3) is not adequate to resolve molecular packing; moreover, the aggressive procedures used for sample preparation may alter the delicate structure of the fibers. Chapter 2 describes experiments aimed at elucidating fiber architecture at the nanometer scale by means of techniques chosen to minimize sample processing. These results led to the proposal of a more refined molecular packing model of the fibers formed by the peptide KFE8.


The second fundamental step in understanding fiber self-assembly is elucidating the kinetics of the process. All previous experiments on the gel were done under conditions of complete gelation: the ionic strength of the solution was increased to obtain a fully formed hydrogel and various tests were then performed. The kinetics of the process had not been studied, although they are likely to influence the final structure of the scaffold. We believe important design principles can be inferred by investigating the progression of self-assembly over time. For this reason the second part of this research was dedicated to investigating the kinetics of self-assembly. The first step in this direction was the development of a Smoluchowski-type mean-field approximation of the process, which is presented in Chapter 3. Based on simplifying assumptions, an analytical relation is presented that expresses average fiber length as a function of two rate constants assumed to be fundamental in the process.

The experiments described in this thesis also showed that precise control of self-assembly conditions is very difficult to achieve: slight differences in the synthesis process of the molecule can result in quite different behavior (probably due to residual compounds from synthesis of the molecules). Yet, precise control of self-assembly conditions is essential to perform kinetics measurements. These considerations led toward the avenue of computer simulation.

Simulations represent a scientific methodology where complete control over experimental conditions is possible by definition; the result of a simulation can be considered an experimental observation performed on a highly idealized system. Once validated, simulations can be considered predictive tools [23]. In Chapter 4 a computer simulation model is developed, aimed at predicting the behavior of self-assembling peptides in conditions that a mean-field theory cannot capture. These simulations made it possible to characterize the evolution of the fiber length distribution over time, from diffusing monomers up to the formation of a fiber network. Chapter 5 contains an overview of the work and concluding remarks.


References for Chapter 1

[1] Taton, T. A. Boning up on biology. Nature 412, 491 (2001).

[2] Whitesides, G. M. and Boncheva, M. Beyond molecules: self-assembly of mesoscopic and macroscopic components. Proc. Natl. Acad. Sci. USA 99, 4769 (2002).

[3] La Mer, V. K. Nucleation in phase transitions. Industrial and Engineering Chemistry 44, 1270 (1952).

[4] Anfinsen, C. B. Principles that govern the folding of protein chains. Science 181, 223 (1973).

[5] Baker, D. Surprising simplicity of protein folding. Nature 405, 39 (2002).

[6] Pearson, H. Your destiny, from day one. Nature 418, 14 (2002).

[7] Stern, C. D. Fluid flow and broken symmetry. Nature 418, 29 (2002).

[8] Szostak, J. W., Bartel, D. P. and Luisi, P. L. Synthesizing life. Nature 409, 387 (2001).

[9] Rich, A. and Crick, F. The molecular structure of collagen. J. Mol. Biol. 3, 483 (1961).

[10] Lazaris, A., Arcidiacono, S., Huang, Y., Zhou, J. F., Duguay, F., Chretien, N., Welsh, E. A., Soares, J. V. and Karatzas, C. N. Spider silk fibers spun from soluble recombinant silk produced in mammalian cells. Science 295, 472 (2002).

[11] Zhang, S., Holmes, T., Lockshin, C. and Rich, A. Spontaneous assembly of a self-complementary oligopeptide to form a stable microscopic membrane. Proc. Natl. Acad. Sci. USA 90, 3334 (1993).

[12] Zhang, S., Lockshin, C., Cook, R. and Rich, A. Unusually stable β-sheet formation in an ionic self-complementary oligopeptide. Biopolymers 34, 663 (1994).

[13] Leon, E. J., Verma, N., Zhang, S., Lauffenburger, D. A. and Kamm, R. D. Mechanical properties of a self-assembling oligopeptide matrix. Journal of Biomaterials Science, Polymer Edition 9, 297 (1998).

[14] Holmes, T. C., de Lacalle, S., Su, X., Liu, G., Rich, A. and Zhang, S. Extensive neurite outgrowth and active synapse formation on self-assembling peptide scaffolds. Proc. Natl. Acad. Sci. USA 97, 6728 (2000).

[15] Caplan, M. R., Moore, P. N., Zhang, S., Kamm, R. D. and Lauffenburger, D. A. Self-assembly of a β-sheet protein is governed by relief of electrostatic repulsion relative to van der Waals attraction. Biomacromolecules 1, 627 (2000).

[16] Kisiday, J., Jin, M., Kurz, B., Hung, H., Semino, C., Zhang, S. and Grodzinsky, A. J. Self-assembling peptide hydrogel fosters chondrocyte extracellular matrix production and cell division: implications for cartilage tissue repair. Proc. Natl. Acad. Sci. USA 99, 9996 (2002).

[17] Marini, D. M., Hwang, W., Lauffenburger, D. A., Zhang, S. and Kamm, R. D. Left-handed helical ribbon intermediates in the self-assembly of a β-sheet peptide. Nano Letters 2, 295 (2002).

[18] Peggion, E., Cosani, A., Terbojevich, M. and Borin, G. Conformational studies on polypeptides. The effect of sodium perchlorate on the conformation of poly-L-lysine and of random copolymers of L-lysine and L-phenylalanine in aqueous solution. Biopolymers 11, 633 (1972).

[19] Caplan, M. R. PhD Thesis, Massachusetts Institute of Technology (2001).

[20] Caplan, M. R., Schwartzfarb, E. M., Zhang, S., Kamm, R. D. and Lauffenburger, D. A. Control of self-assembling oligopeptide matrix formation through systematic variation of amino acid sequence. Biomaterials 23, 219 (2002).

[21] Petka, W. A., Harden, J. L., McGrath, K. P., Wirtz, D. and Tirrell, D. A. Reversible hydrogels from self-assembling artificial proteins. Science 281, 389 (1998).

[22] Reches, M. and Gazit, E. Casting metal nanowires within discrete self-assembled peptide nanotubes. Science 300, 625-627 (2003).

[23] Auer, S. and Frenkel, D. Prediction of absolute crystal-nucleation rate in hard-sphere colloids. Nature 409, 1020 (2001).


CHAPTER 2

CHARACTERIZATION OF β-SHEET FIBERS FORMED BY SELF-ASSEMBLING PEPTIDES

2.1 Introduction

Control of peptide self-assembly would allow for the design of biomaterials with unprecedented properties. One can imagine, for example, scaffolds that degrade upon sensing the production of extracellular matrix by host cells. Hollow fibers could also release signaling factors specifically to those cells that have reached a certain stage in their cycle and guide their development. Moreover, self-assembled structures can be used as templates for casting nanoscale structures [1], for directing the growth of inorganic crystals [2] and for producing self-repairing biomaterials [3]. Designing such materials requires an understanding of their molecular architecture and self-assembly mechanism.

In general, determining the supramolecular structure of peptide fibers is difficult because they do not form crystals: x-ray diffraction patterns reveal only rough features of their molecular architecture [4]. Solution-phase NMR is also unsuitable, due to the large size of these supramolecular aggregates (see methods). The structure of these peptide membranes has so far been studied using traditional TEM, which requires chemical cross-linking of the molecules, staining with heavy metals, degradation of peptides with acid and drying in ethanol. This procedure made it possible to characterize the microstructure of the matrix and to estimate pore size and fiber diameter (Figure 1-2) [5]. Nevertheless, one cannot be certain that this method leaves the molecular architecture unchanged. Moreover, the resolution of the resulting micrographs is not high enough to guide the formulation of a model for the molecular packing of these fibers.

The experiments reported in this chapter were aimed at characterizing the structure of single fibers in the matrix. Experimental techniques were chosen to minimize sample processing, and the resolution achieved provided the foundation for a detailed molecular packing model for such fibers. These experiments also made it possible, for the first time, to characterize the kinetics of self-assembly and revealed that β-sheet fiber formation is a complex process, in which different structures coexist and mature fibers are formed through intermediate steps.

2.2 Experimental methods

The main objective of this study was the characterization of the architecture and self-assembly mechanism of fibers generated by β-sheet peptides. In order to achieve this goal it was essential to (1) isolate single fibers for observation and (2) slow down the rate of their formation. This was possible by operating at concentrations circa 10 times lower than what is usual for tissue engineering applications (where the concentration used is approximately 10 mM [6-8]) and by choosing conditions that decrease the rate of self-assembly. Several factors are known to increase the speed of the process, such as solution pH or ionic strength [6]; in particular, self-assembly is extremely rapid when molecules carry almost zero net charge [7]. These findings were used to design our experimental conditions: peptides were dissolved in ultra-filtered water to minimize the presence of electrolytes; moreover, the pH of the typical sample was approximately 3 (due to residual trifluoroacetic acid from synthesis). In these conditions molecules carry a slightly positive charge and self-assembly is slowed down dramatically (hours instead of seconds) [9].


The hydrogel matrix generated by β-sheet peptides is very weak: only van der Waals forces, hydrogen bonds and hydrophobic interactions hold it together [7]. In order to characterize single fibers in the matrix it is essential to minimize sample processing, which may disrupt fiber structure. This necessity dictated the choice of quick-freeze/deep-etch (QFDE) and atomic force microscopy (AFM) as the main investigation tools. In QFDE samples are brought to a temperature of -185 °C very quickly to avoid the formation of ice crystals: molecules are essentially vitrified and their native state in solution is preserved [8]; shadowing with platinum allows observation under TEM. AFM also requires very little sample preparation and the resolution achieved can be even higher than TEM (up to 1 Å, depending on the imaging conditions and the type of sample); in the case of biomolecules a resolution of 0.5-1 nm is not unrealistic [10].

The kinetics of self-assembly were also monitored by circular dichroism (CD). This technique is based on the observation that biological molecules absorb left- and right-hand circularly polarized light differently and as a function of wavelength. CD spectra yield important information about secondary structure of polypeptide chains: in particular about their backbone conformation and, to some extent, their mutual orientation [11,12]. This technique was chosen because data can be collected in real time and non-invasively.

The peptide molecules studied in this research belong to the class of self-complementary oligopeptides [13] and are characterized by the repeating pattern hydrophobic-hydrophilic in their amino acid sequence. Such peptides are also designed with a specific pattern in the charged side chains (+ - + -, ++ -- ++ --, etc.) which promotes the formation of highly stable matrices.

The molecule KFE8 (sequence FKFEFKFE, charge pattern + - + - at neutral pH) was chosen as the starting point for this investigation, as TEM micrographs [5] and hydrogel rheological measurements [7] were already available. After exploration of the structures formed by KFE8 in various conditions, experiments were performed to test the effects of two main variations on its sequence: a change in side chain (phenylalanine was substituted with tryptophan) and a change in modulo (the pattern + - + - in the charged side chains was expanded to ++ -- ++ --); in the latter case, the sequence becomes FKFKFEFEFKFKFEFE and the molecule is named KFE8-II (modulo II).

Peptide synthesis and sample preparation

The peptides KFE8, KFE8-II and KWE8, of sequence [COCH3]-FKFEFKFE-[CONH2], [COCH3]-FKFKFEFEFKFKFEFE-[CONH2] and [COCH3]-WKWEWKWE-[CONH2] respectively (F is phenylalanine, W is tryptophan, K is lysine and E is glutamic acid), were custom-synthesized by Research Genetics, Inc. (Huntsville, AL, USA) and the lyophilized powder was stored at 4 °C. Peptide solutions were prepared by adding ultra-filtered water to roughly 1 mg of peptide powder to bring the concentration to 1 mg/mL (0.86 mM). The mixture was then vortexed at high speed for about 1 min, after which the peptide was in solution; samples stored at room temperature showed no visible precipitate even after months. The pH of the KFE8 samples at concentration 1 mg/mL was approximately 3.3.
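The conversion between 1 mg/mL and 0.86 mM quoted above can be checked with a short calculation. The sketch below is illustrative only: it uses standard average residue masses (not values taken from this thesis) and assumes that the [COCH3] and [CONH2] groups denote N-terminal acetylation and C-terminal amidation. The function names are hypothetical.

```python
# Average residue masses in g/mol (standard textbook values, an assumption
# of this sketch; they are not reported in the thesis).
RESIDUE_MASS = {"F": 147.1766, "K": 128.1741, "E": 129.1155, "W": 186.2132}
WATER = 18.0153        # added once for the free termini of a linear chain
ACETYL_CAP = 42.0367   # N-terminal acetylation: one H replaced by COCH3
AMIDE_CAP = -0.9847    # C-terminal amidation: OH replaced by NH2


def capped_peptide_mass(sequence: str) -> float:
    """Average molar mass (g/mol) of an acetylated, amidated peptide."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + ACETYL_CAP + AMIDE_CAP


def mg_per_ml_to_mM(conc_mg_per_ml: float, molar_mass: float) -> float:
    """Convert a mass concentration (mg/mL, i.e. g/L) to millimolar."""
    return conc_mg_per_ml / molar_mass * 1000.0


kfe8_mass = capped_peptide_mass("FKFEFKFE")
kfe8_mM = mg_per_ml_to_mM(1.0, kfe8_mass)
print(f"KFE8: {kfe8_mass:.1f} g/mol, 1 mg/mL = {kfe8_mM:.2f} mM")
```

Under these assumptions the molar mass of KFE8 comes out near 1162 g/mol, consistent with the 0.86 mM figure. The same helper applies to KFE8-II and KWE8, whose molar concentrations at 1 mg/mL differ because their masses differ.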

Aliquots of 4-8 µL were removed from the peptide solution at various times after preparation and deposited onto a mica surface that had been freshly cleaved (by means of an adhesive tape) immediately before use. Each aliquot was left on the mica substrate for 10-30 s to allow peptide deposition. To remove loosely bound peptides and any debris, the surface was then rinsed with 50-100 µL of ultra-filtered water. In some cases the rinsing step was performed with water adjusted to pH ~ 3.3 (the same as the typical peptide solution) with HCl. The mica surface with the adsorbed peptide was then left to dry naturally in air (usually for 2 min) and imaged immediately afterwards.

Some experiments were performed using substrates other than mica. In the case of highly oriented pyrolytic graphite (HOPG), the uppermost stratum was removed several times by means of an adhesive tape until a smooth surface was observed. The procedure for sample deposition, rinsing, drying and imaging was the same as in the case of mica. Silicon substrates were also used. In this case the surface of a silicon crystal (as used in the semiconductor industry for microchip fabrication) was washed in a bath of methanol / HCl at 50% vol. The surface was then rinsed with ultra-filtered water and dried in air before sample deposition, which followed the same steps as in the case of mica and graphite.

In order to investigate self-assembly at low temperatures, experiments were also performed at 4 °C. In this case samples were prepared in a cold room and all the equipment (peptide powder, pipettes, vortexer, ultra-filtered water, mica, etc.) was allowed to equilibrate with the environment before use.

Initial experiments were performed on a peptide synthesized in our lab, using a standard peptide synthesizer (Ranin Instrument Co., Inc., Woburn, MA). This sample was later found to be impure (about of the molecules were missing an E group). Some results (self-assembly in solvents other than water) are presented for this sample, in which case it will be explicitly stated.

All other results are from peptides synthesized commercially, where sample purity was always verified by mass spectrometry.

Atomic force microscopy (AFM)

Micrographs were obtained by scanning the mica surface in air using an atomic force microscope (Multimode AFM, Digital Instruments, Santa Barbara, CA, USA) operating in Tapping Mode. The traditional contact mode of AFM was not suitable for these delicate samples, as it would disrupt their structure completely. When imaging soft biopolymers with AFM at high resolution, it is important to minimize the tip tapping force. Soft silicon cantilevers were chosen (FESP model) with spring constant of 1-5 N/m and tip radius of curvature of 5-10 nm. AFM scans were taken at 512 x 512 pixels resolution and produced topographic images of the samples, in which the brightness of features increases as a function of height. The driving frequency of the cantilever was set to the inflexion point of the frequency-amplitude curve to the left of its peak value. Typical scanning parameters were as follows: tapping frequency ~70 kHz, RMS amplitude before engage 1-1.2 V, integral and proportional gains 0.2-0.6 and 0.3-1 respectively, setpoint 0.8-1 V, scanning speed 1-2 Hz. To increase height resolution, after ensuring that the scanning area was fairly flat, the Z-limit of the instrument was decreased in steps: from 440 V to 220 V, then to 110 V and finally to 55 V. Images were recorded after scanning stability was reached.

Quick-freeze / deep-etch (QFDE)

Aliquots of ~5 µL were taken from the peptide solution, deposited over a 3 mm gold sample holder and quick-frozen in liquid propane (approximately -185 °C) by means of a plunge-freeze device (TFD 010, BAL-TEC, Balzers, Principality of Liechtenstein). The frozen droplets were stored in liquid nitrogen and transferred onto the stage (-180 °C) of a freeze-fracture system (CFE-60, Cressington Scientific Instruments, Cranberry, PA). The stage was warmed to -100 °C and the sample surfaces were etched for 30 min by placing a cold metal block (-180 °C) directly above the samples. The exposed structures were rotary-replicated with a layer of ~1.5 nm of platinum and strengthened with a layer of ~20 nm of carbon. The samples were then treated in a 5% sodium hypochlorite bath, which removed the replica coatings. Replicas were picked up on microscope grids and viewed under a Philips-300 transmission electron microscope (Eindhoven, The Netherlands).

Circular dichroism (CD)

The peptide solution was injected immediately after preparation into a quartz cuvette with a path length of 0.1 cm. CD spectra were recorded (AVIV CD spectrophotometer model 202, Aviv Instruments, Lakewood, NJ) at several times between 1 and 200 min after preparation, in the wavelength range 190-240 nm. The wavelength step was 1 nm and the averaging time was 1 s. Secondary structure fractions were deduced from the spectra using the software CDNN 2.1 [14].

Image analysis

AFM images were analyzed using the software provided with the instrument (Nanoscope III software, version 4.42) to measure various dimensions of the fibers. The diameter of helical ribbons was computed from their pitch and pitch angle, instead of directly from image measurement, to avoid overestimation effects due to the finite tip size. A different strategy was used in the case of bands of several adjacent fibers (Figure 2-13): to minimize width overestimation, the fiber diameter was computed by subtracting the AFM tip diameter from the width of the band and dividing by the number of fibers. Fiber length distributions were obtained by measuring only fibers whose length was entirely contained in the image.
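The two diameter estimates described above can be written out explicitly. The following is a minimal sketch, not the Nanoscope software's procedure: it assumes a ribbon wound on a cylinder and a pitch angle measured between the ribbon and the plane perpendicular to the fiber axis, so that tan(angle) = pitch / (pi * diameter). The thesis does not state its angle convention, and the function names and the band dimensions in the example are hypothetical.

```python
import math


def helix_pitch_angle(pitch_nm: float, diameter_nm: float) -> float:
    """Pitch angle (degrees) of a helix wound on a cylinder.

    Assumed convention: the angle is measured between the ribbon and the
    plane perpendicular to the fiber axis, so tan(angle) = pitch / (pi * D).
    """
    return math.degrees(math.atan(pitch_nm / (math.pi * diameter_nm)))


def helix_diameter(pitch_nm: float, pitch_angle_deg: float) -> float:
    """Invert the relation above: D = pitch / (pi * tan(angle))."""
    return pitch_nm / (math.pi * math.tan(math.radians(pitch_angle_deg)))


def fiber_diameter_from_band(band_width_nm: float, tip_diameter_nm: float,
                             n_fibers: int) -> float:
    """Single-fiber diameter from a band of adjacent fibers.

    The tip diameter is subtracted once from the measured band width,
    and the remainder is shared equally among the fibers.
    """
    if n_fibers < 1:
        raise ValueError("n_fibers must be at least 1")
    return (band_width_nm - tip_diameter_nm) / n_fibers


# Pitch (19.1 nm) and diameter (7.1 nm) are the values reported for KFE8
# helical ribbons; the angle is derived here, not quoted from the thesis.
angle = helix_pitch_angle(19.1, 7.1)
print(f"implied pitch angle: {angle:.1f} degrees")
print(f"diameter recovered from pitch and angle: {helix_diameter(19.1, angle):.1f} nm")

# Hypothetical band: 5 fibers, 50 nm apparent width, 10 nm tip diameter.
print(f"band estimate: {fiber_diameter_from_band(50.0, 10.0, 5):.1f} nm")
```

Because both pitch and pitch angle are measured along the fiber axis, neither is broadened by the tip, which is why this route avoids the width overestimation that affects direct measurements.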


Other methods

Various other investigation methods were attempted in order to elucidate the molecular structure of these fibers: they deserve mention here, even though they did not yield good results.

Nuclear magnetic resonance (NMR) was performed on a sample of KFE8 dissolved in D2O at 1 mg/mL (solubility was much lower than in aqueous solutions). The NMR bands appeared to be broadened, even immediately after dissolution. The likely explanation is that aggregates larger than ~50 molecules form soon after dissolution, preventing the technique from working.

X-ray diffraction was performed on the KFE8 peptide at various times after dissolution (courtesy of Dr. Mark Spector of the Naval Research Laboratory, Washington, DC), but a diffraction signal was not observed. The main difficulty with this technique is obtaining fiber alignment.

Peptide fibers were deposited onto silicon wafers where trenches of about 100 nm diameter had been etched, in order to probe their bending stiffness with the tip of the AFM. Unfortunately, obtaining suspended fibers did not seem to be possible: AFM images showed fibers crossing the trenches, but the suspended part of the fiber was not present. Most likely, they were broken during imaging due to their fragility.

2.3 Results

2.3.1 Structure of self-assembled fibers in aqueous solution

AFM images collected a few minutes after dissolution of the KFE8 peptide in water revealed structures appearing as left-handed helical ribbons of pitch 19.1 ± 1.2 nm (Figure 2-1).


Figure 2-1. AFM scan of the fibers formed by the peptide KFE8 in 1 mg/mL aqueous solution at room temperature. The sample was allowed 8 min of self-assembling time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. Inset: TEM micrograph from a sample prepared in the same conditions and observed using the QFDE technique (the helix diameter is ~ 7 nm).

Samples prepared in the same conditions and observed via QFDE revealed the presence of helical ribbons with the same structure, dimensions and chirality (Figure 2-1, inset) as the ones observed via AFM. Such structures were also observed after deposition of the solution onto graphite and silicon (data not shown). The diameter of these helices was 7.1 ± 1.1 nm (computed from pitch and pitch angle, assuming cylindrical geometry).

The average length of helical ribbons remained approximately constant at ~90 nm until they disappeared after approximately 2 h.


Figure 2-2. AFM scan of the fibers formed by the peptide KFE8 in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 35 min of self-assembling time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

A second type of fibrillar structure, appearing as a continuous cylinder (whose presence could also be detected in the early stages of self-assembly: Figure 2-1) appeared with increasing frequency at later stages (Figure 2-2) and further assembled into bands of parallel filaments (Figure 2-3).

The diameter of this second structure was more difficult to estimate, as width measurements from AFM topographic images suffer from overestimation due to finite tip size. An approximate value for the diameter of these tubular structures is 8 nm.
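The size of the tip-induced overestimation can be illustrated with a simple geometric model. The sketch below is not the analysis used in this thesis: it assumes a rigid cylinder of radius r lying on a flat substrate and an idealized spherical tip apex of radius R, for which the apparent width at the substrate level is W = 4 * sqrt(R * r). The function names are hypothetical, and real tips and soft fibers deviate from this idealization.

```python
import math


def apparent_width_nm(tip_radius_nm: float, cylinder_radius_nm: float) -> float:
    """Apparent width of a cylinder imaged by a spherical tip.

    The tip touches substrate and cylinder at the same time when the
    horizontal offset x satisfies x**2 + (R - r)**2 = (R + r)**2,
    i.e. x = 2 * sqrt(R * r); the apparent width is 2 * x.
    """
    return 4.0 * math.sqrt(tip_radius_nm * cylinder_radius_nm)


def cylinder_radius_nm(width_nm: float, tip_radius_nm: float) -> float:
    """Invert the model: r = W**2 / (16 * R)."""
    return width_nm ** 2 / (16.0 * tip_radius_nm)


# Tip radii of 5-10 nm are those quoted for the cantilevers used here;
# a radius of 4 nm corresponds to the ~8 nm tubular fibers.
for tip_radius in (5.0, 10.0):
    width = apparent_width_nm(tip_radius, 4.0)
    print(f"tip radius {tip_radius:.0f} nm -> apparent width {width:.1f} nm")
```

Within this idealized model an 8 nm fiber would appear roughly two to three times wider than it is, which illustrates why direct width measurements were not used as diameter estimates.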


Figure 2-3. AFM scan of the fibers formed by the peptide KFE8 at room temperature in aqueous solution at 1 mg/mL. The sample was allowed 2 h of self-assembling time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

The background of the above AFM images did not appear to be a bare mica surface. Instead, a rather uniform layer of much smaller structures (possibly monomers or oligomers) was observed, upon which fibers were deposited. The presence of this background layer is more noticeable in images captured at later stages of self-assembly, when the number of free monomers in solution is reduced (Figure 2-5, for example).


Figure 2-4. AFM scan of the fibers formed by the peptide KFE8 in aqueous solution at 1 mg/mL at room temperature. The sample was allowed 4 h of self-assembling time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

At long times after preparation of the peptide solution, AFM images revealed the presence of finely interwoven bands of parallel fibers (Figure 2-4); this fibrous matrix was also observed days after sample preparation (Figure 2-5).

The scans shown in Figures 2-5 and 2-6 were obtained from samples for which the rinsing step was performed with water adjusted with HCl to pH 3.3 (the same as the typical peptide solution). The structure of single fibers did not seem to change, although the matrix appeared less dense.


Figure 2-5. AFM scan of the fibers formed by the peptide KFE8 in aqueous solution at 1 mg/mL at room temperature. The sample was observed, via deposition on a mica substrate, 30 h after preparation. The mica surface was rinsed, after sample deposition, with ultra-filtered water at pH 3.3 (adjusted with HCl). The scale bar is 100 nm.

In general, peptide fibers produced after very long assembly times did not usually distribute uniformly onto the substrate: some areas were found to be densely populated with fibers, while others remained almost clear.


Figure 2-6. AFM scan of the fibers formed by the peptide KFE8 in aqueous solution at 1 mg/mL at room temperature. The sample was observed via deposition on mica 30 h after preparation. Circa 30 s after deposition the mica surface was rinsed with ultra-filtered water at pH 3.3 (adjusted with HCl). Scale bar is 200 nm.

In order to isolate single fibers for better observation, the peptide solution was diluted 10 times immediately before deposition on mica. Rinsing of the mica surface after deposition of the aliquot was not necessary in this case, as AFM scanning resulted in clear images. Fibers appeared as tightly wound helical ribbons, of pitch approximately 9 nm (Figure 2-7). Notice, in the upper left of Figure 2-7, how a broken filament revealed a small tape-like structure between the two fragments.


Figure 2-7. AFM scan of the fibers formed by the peptide KFE8 in aqueous solution at 1 mg/mL at room temperature. To isolate single fibers the sample was diluted 10 times immediately before deposition on mica. No rinsing of the surface was necessary in this case. The sample was allowed to self-assemble for 30 h after preparation. Scale bar is 100 nm.

2.3.2 Fiber length distribution

The AFM images appearing in Figures 2-1, 2-2, 2-3 and 2-6 were analyzed to evaluate the fiber length distribution of the two kinds of structures observed: helical ribbons and tubular fibers. Two main features are apparent from this analysis: (a) helical ribbons disappear gradually over time, while maintaining approximately the same average length (Figure 2-8) and (b) tubular fibers seem to remain short for some time (10-20 min), after which they grow quickly until reaching an equilibrium length of the order of microns (Figure 2-9).
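The counting rule behind these distributions (only fibers lying entirely within the image are measured) can be sketched as follows. This is an illustrative reimplementation, not the Nanoscope analysis: fibers are assumed to be traced by hand as polylines of (x, y) points in nm, and the function names, image size, bin width and example coordinates are all hypothetical.

```python
import math
from collections import Counter


def contour_length(points):
    """Length (nm) of a fiber traced as a polyline of (x, y) points."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def fully_inside(points, image_size_nm):
    """True if no traced point touches or crosses the image border."""
    return all(0.0 < x < image_size_nm and 0.0 < y < image_size_nm
               for x, y in points)


def length_histogram(fibers, image_size_nm, bin_width_nm):
    """Count fibers per length bin, skipping fibers cut by the border.

    Keys are the lower edges of the bins, in nm.
    """
    counts = Counter()
    for points in fibers:
        if not fully_inside(points, image_size_nm):
            continue  # truncated fiber: its true length is unknown
        bin_start = int(contour_length(points) // bin_width_nm) * bin_width_nm
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical traces on a 1000 nm x 1000 nm scan.
fibers = [
    [(100, 100), (148, 164)],              # 80 nm, fully inside
    [(0, 50), (90, 50)],                   # touches the left edge: skipped
    [(200, 200), (200, 290), (240, 320)],  # 140 nm, fully inside
]
histogram = length_histogram(fibers, image_size_nm=1000, bin_width_nm=50)
print(histogram)
```

Excluding truncated fibers keeps every measured length exact, at the cost of under-sampling long fibers, which are more likely to cross the image border; this bias becomes relevant once fibers approach the scan size.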


[Figure 2-8 plot: histograms for helical ribbons; vertical axis: number of filaments.]

Figure 2-8. Length distribution of helical ribbons formed by KFE8 in aqueous solution. Histograms obtained from image analysis of fibers deposited onto mica.

At long times both helical ribbons and tubular fibers appear to have a bimodal distribution: this result could be due to breakage of filaments during transfer of the solution onto mica or during rinsing. For this reason, and due to non-homogeneous fiber deposition on the substrate, a reliable filament length distribution is difficult to obtain using AFM.


[Figure 2-9 plot: histograms for tubular fibers; axes: number of filaments, Length [nm], Time [min].]

Figure 2-9. Length distribution of tubular fibers formed by KFE8 in aqueous solution. Histograms obtained from image analysis of fibers deposited onto mica.

2.3.3 Secondary structure formation

Self-assembly of KFE8 in aqueous solution was also monitored over time with CD. One minute after dissolving the peptide in water, the CD spectrum was characterized by a typical β-sheet profile. Spectra collected over time were compared with those from proteins of known structure using the software CDNN 2.1 [14]. Such comparison indicated a steady increase in antiparallel β-sheet structures and a concurrent decrease in the presence of random coils (Figure 2-10).


[Figure 2-10 plot: anti-parallel β-sheet and random coil fractions versus Time [min], from 0 to 200 min.]

Figure 2-10. CD spectra from self-assembly of KFE8 in aqueous solution at room temperature were recorded in the wavelength range 190-240 nm. Secondary structure fractions were deduced by comparing spectra with those from known proteins using the software CDNN 2.1 [14].

These data should be interpreted with care, as they result from comparing CD spectra from self-assembling short peptides (inter-molecular folding: different peptide molecules come together) to spectra of known proteins (intra-molecular folding: a single polypeptide chain folds onto itself to assume its native conformation). Moreover, such comparison is performed by looking for similarities in the shape of the CD spectrum, which yields a qualitative deduction.

With this proviso, such findings agree with what would be expected from a solution of free monomers that slowly assemble to form structured aggregates (anti-parallel β-sheets).


2.3.4 Self-assembly at low concentration

In order to test the effect of peptide concentration on self-assembled structures and kinetics, samples of KFE8 were prepared in aqueous solution at 0.01 mg/mL. At this concentration the pH is approximately 5 and the molecules carry almost zero net charge (the pK of glutamic acid is around 4). These results are therefore not directly comparable with the ones presented in the previous section: a valid comparison would involve pH correction, but that would also entail addition of electrolytes (see "Suggested Directions for Future Research").
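The dependence of net charge on pH invoked here can be illustrated with a Henderson-Hasselbalch estimate. The sketch below is a simplification, not a calculation from this thesis: it treats the two lysine and two glutamic acid side chains of KFE8 as independent ionizable groups, takes the pK of glutamic acid as 4 (as stated above) and assumes a typical textbook pK of 10.5 for lysine. The capped termini are assumed to carry no charge, and the function name is hypothetical.

```python
def net_charge(ph: float, n_lys: int = 2, n_glu: int = 2,
               pk_lys: float = 10.5, pk_glu: float = 4.0) -> float:
    """Mean net charge of a capped peptide from independent side chains.

    pk_lys is an assumed textbook value; pk_glu follows the value of
    about 4 quoted in the text. Shifts in pK caused by neighboring
    charges or by aggregation are ignored.
    """
    positive = n_lys / (1.0 + 10.0 ** (ph - pk_lys))   # protonated lysines
    negative = n_glu / (1.0 + 10.0 ** (pk_glu - ph))   # deprotonated glutamates
    return positive - negative


for ph in (3.3, 5.0, 7.0):
    print(f"pH {ph}: estimated net charge {net_charge(ph):+.2f}")
```

Within this model the peptide is clearly positive at pH 3.3 and close to neutral at pH 5 and above, which matches the qualitative argument in the text; the absolute values should not be read as measurements.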

Figure 2-11. AFM scan of the structures formed by the peptide KFE8 in a 0.01 mg/mL aqueous solution at room temperature. The sample was allowed 8 h of self-assembly time, after which it was deposited on a mica substrate; no rinsing was necessary. The brightness of features increases as a function of height and the scale bar is 100 nm. Small tapes are approximately 0.7 nm high.


Figure 2-12. AFM scan of the structures formed by the peptide KFE8 in a 0.01 mg/mL aqueous solution at room temperature. The sample was allowed 8 h of self-assembly time, after which it was deposited on a mica substrate; no rinsing was necessary. The brightness of features increases as a function of height and the scale bar is 100 nm. Small tapes are approximately 0.7 nm high.

Figures 2-11 and 2-12 report the results of these experiments. The two most striking features are: (1) fibers are much thinner than at concentration 1 mg/mL (the height of these structures is compatible with tapes a single molecule in thickness) and (2) their distribution on the substrate seems to be non-random. Such tapes appear to be oriented on the substrate preferentially along three main directions, at 120° from one another (Figure 2-12): this is possibly a signature of the underlying atomic lattice of mica (see discussion).


Figure 2-13. AFM scan of the structures formed by the peptide KFE8 in a 0.01 mg/mL aqueous solution at room temperature. The sample was allowed 12 days of self-assembly time, after which it was deposited on a mica substrate; no rinsing was necessary. The brightness of features increases as a function of height and the scale bar is 100 nm. Small tubules are approximately 1.6 nm high and 3 nm wide.

At long self-assembly times a collection of many short tapes was observed, approximately twice the height of those observed at initial times (Figure 2-13: also in this image the orientation of fibers seems non-random).

2.3.5 Effect of temperature

Self-assembly of KFE8 in aqueous solution at 4 °C was monitored over time using the usual procedure, except that sample preparation was performed in a cold room after allowing thermal equilibration of all the equipment (peptide powder, pipettes, vortexer, water, mica, etc.).


At early self-assembly times (Figure 2-14), helical ribbons were observed, with similar geometry and dimensions as the ones formed at room temperature.


Figure 2-14. AFM scan of the fibers formed by the peptide KFE8 in a 1 mg/mL aqueous solution at 4 °C. The sample was allowed 10 min of self-assembling time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

After the sample was allowed to self-assemble for 28 h (Figure 2-15) a combination of helical ribbons and tubular fibers was observed, in similar proportion as seen at room temperature after approximately 1 h of incubation (Figure 2-2).

Figure 2-15. AFM scan of the fibers formed by the peptide KFE8 in a 1 mg/mL aqueous solution at 4 °C. The sample was allowed 28 h of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

AFM images obtained at low temperature were not as clear as the ones captured at room temperature: amorphous structures appeared alongside structured fibers. This was possibly due to the lower solubility of the peptide at low temperature. In conclusion, self-assembly of KFE8 at 4 °C seemed to generate structures essentially identical to the ones formed at room temperature, the only difference being the time scale of their formation. A very approximate estimate of this time scale difference can be inferred by comparing the times required for the samples at room temperature and at 4 °C, respectively, to reach a similar relative composition of helical ribbons and tubules. By this rough criterion, we estimate that self-assembly at 4 °C is about 30 times slower than at room temperature.
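The rough criterion described above amounts to a ratio of the incubation times at which the two samples show a similar mix of helical ribbons and tubules. The following is a minimal sketch of that arithmetic, using only the approximate times quoted in the text (28 h at 4 °C, about 1 h at room temperature); the variable names are illustrative and the result is an order-of-magnitude figure, not a fitted rate.

```python
# Rough slowdown estimate: ratio of the incubation times at which the samples
# show a similar mix of helical ribbons and tubular fibers.
# Both times are the approximate values quoted in the text, not fitted data.
time_cold_h = 28.0  # incubation at 4 °C (Figure 2-15)
time_room_h = 1.0   # incubation at room temperature (Figure 2-2), approximate

slowdown = time_cold_h / time_room_h
print(f"Self-assembly at 4 °C is roughly {slowdown:.0f} times slower")
```

Given the uncertainty in both times, this ratio is consistent with the figure of about 30 quoted above.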

2.3.6 Self-assembly at neutral pH

Figure 2-16. AFM scan of fibers formed by the peptide KFE8 dissolved (at 1 mg/mL concentration) in water adjusted to pH 10.3 with NaOH (so that mixing with peptide powder would yield a solution at pH 7.5) at room temperature. The sample was allowed 30 min of self-assembly time, after which it was deposited onto graphite (deposition on mica resulted in large fiber clumps that prevented imaging). The surface was then rinsed with 40 µL of ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

At neutral pH molecules carry zero net charge. In such conditions self-assembly is expected to be extremely rapid [7]. Imaging these structures was challenging for two main reasons: (1) obtaining a peptide solution at pH 7 required extremely careful control of solvent pH (an exact value of 10.30 was necessary) and (2) deposition of the samples on a substrate resulted in fiber tangles and other amorphous aggregates scattered on the surface. Moreover, observation was possible only on graphite. As Figure 2-16 shows, at neutral pH mature fibers form very rapidly compared to self-assembly at pH 3 (Figure 2-2).

2.3.7 Influence of synthesis method

The results described above were all obtained from peptides synthesized commercially; their purity was always verified by mass spectrometry. In the early stages of this research, on the other hand, peptides were synthesized in our lab. In particular, the first observation of helical ribbons was performed on one such sample. Interestingly, even though this sample was later found to be impure (approximately of the molecules were missing an E group), it gave rise to the same structures (geometry and dimensions) as the ones observed from the molecule synthesized commercially; for this reason, results from these experiments are reported in this section.

Helical ribbons were observed in the early stages of self-assembly (Figure 2-17) and tubular fibers were more prevalent at later stages. Interestingly, and unlike the case of samples synthesized commercially, helical ribbons seemed to reach lengths on the order of microns before disappearing (Figure 2-18).

Figure 2-17. Helical ribbons formed by the peptide KFE8 synthesized in our lab (see methods). This image was obtained by deposition of the sample (1 mg/mL aqueous solution at room temperature) on mica 1 h after preparation, followed by rinsing and drying in air. Scale bar is 100 nm.

Notice also the difference in time scale of self-assembly for this sample (by comparing, for example, Figures 2-18 and 2-3). In general, a great variability in assembly time scales was observed in peptides synthesized using different methods (e.g. from different companies). The cause of this difference is most likely the presence of residual compounds from different synthesis processes. Since the rate of self-assembly is sensitive to the concentration of electrolytes in solution, different amounts of residual ionic compounds possibly explain such differences.

Figure 2-18. Structures formed by the peptide KFE8 synthesized in our lab (see methods) in a 1 mg/mL aqueous solution at room temperature. The image was obtained by deposition of the sample on mica 4 days after preparation, followed by rinsing and drying in air. Scale bar is 100 nm. The pitch of helical ribbons is approximately 20 nm.

Tubular fibers were also observed using the phase imaging mode of AFM (sensitive to local variations in sample stiffness), which allowed better appreciation of their nature (Figure 2-19). The structures that appeared to be tubular fibers under height imaging AFM appeared as tightly wound coils (of pitch approximately 10 nm) when observed using phase imaging. Notice also that the left-handedness of such coils is preserved. The presence of a background layer is also clear from this image: in particular, areas of bare mica can be distinguished at long times, when the number of free monomers in solution is reduced.

Figure 2-19. Fibers formed by the peptide KFE8 synthesized in our lab (see methods) in aqueous solution at 1mg/mL at room temperature. The image was obtained by deposition of the sample on mica, followed by rinsing and drying in air. Scanning was performed in the phase imaging mode of the instrument. The sample was allowed to self-assemble for 12 days before deposition. Scale bar is 100 nm. The pitch of tightly wound coils is ~ 10 nm.

Experiments at low concentration were also performed on this sample and revealed the presence of thin tapes, along with more complex structures (Figure 2-20). The small tapes appeared similar to the ones captured in Figures 2-12 and 2-13 and seemed to be distributed, once again, along three preferential directions on the surface.

Figure 2-20. Structures formed by the peptide KFE8 synthesized in our lab (see methods) in a 0.01 mg/mL aqueous solution at room temperature. The image was obtained by deposition of the sample on mica 8 days after preparation, followed by rinsing and drying in air. Scale bar is 100 nm. The small tapes are approximately 1.1 nm high, while the large tubules are approximately 5 nm high.

At 0.1 mg/mL (and pH ~ 4) larger tubular structures were observed, along with small tapes appearing in the background (Figure 2-21).

Figure 2-21. Structures formed by the peptide KFE8 synthesized in our lab (see methods) in a 0.1 mg/mL aqueous solution at room temperature. The image was obtained by deposition of the sample on mica 8 days after preparation, followed by rinsing and drying in air. Scale bar is 100 nm. The large tubules are approximately 5 nm high.

2.3.8 Effect of solvent composition

The unique properties of water are essential in determining the structure of polypeptide chains in aqueous environment [15]. It is therefore reasonable to expect the structures formed by self-assembling peptides to be influenced by the nature of the solvent.

The goal of the experiments presented in this section was to establish the extent to which solvent composition affects the formed structures. In particular, ethanol and methanol are thought to interfere with the hydrophobic effect (the main driving force for protein folding [16]) by disrupting the network of hydrogen bonds present in water.

Figure 2-22. Fibers formed by the peptide KFE8 synthesized in our lab (see methods) at 1 mg/mL at room temperature in a solution of 50% water/ethanol. The image was obtained by deposition of the sample on mica, followed by rinsing (with the same solvent) and drying in air. The sample was allowed to self-assemble for approximately 5 h before deposition. Scale bar is 100 nm.

Self-assembly of KFE8 (molecule synthesized in the lab) was monitored in solutions of water/methanol and water/ethanol (pure methanol or ethanol did not dissolve this peptide). Self-assembly in mixtures of water with 10% of either solvent did not change the formed structures significantly (data not shown). Self-assembly in 50% mixtures, on the other hand, gave rise to structures that were different from the ones assembled in aqueous solution. The main effect appeared to be a tendency of the helical ribbons to be structurally less well-defined (Figure 2-22).

Figure 2-23. Fibers formed by the peptide KFE8 synthesized in our lab (see methods) at 1 mg/mL at room temperature in a solution of 50% water/ethanol. The image was obtained by deposition of the sample on mica, followed by rinsing (with the same solvent) and drying in air. The sample was allowed to self-assemble for approximately 5 h before deposition. Scale bar is 100 nm.

This is especially apparent in Figure 2-23, where helical structures do not seem to have constant pitch along their axis (see for example the filament spanning the image vertically, on the right). Notice also in this image how two small tape-like structures arise from the end of a tubular filament. Similar considerations apply to the fibrillar structures in Figure 2-24, formed in a mixture of 50% water/methanol. It is interesting to note in this image how a helical ribbon structure is disrupted and continues as a small tape (upper left).

Figure 2-24. Fibers formed by the peptide KFE8 synthesized in our lab (see methods) at 1 mg/mL at room temperature in a solution of 50% water/methanol. The image was obtained by deposition of the sample on mica, followed by rinsing (with the same solvent) and drying in air. The sample was allowed to self-assemble for 4 h before deposition. Scale bar is 100 nm.

2.3.9 Deposition onto graphite

Observation of the fibers formed by KFE8 after deposition on graphite revealed essentially the same structures as observed on mica. Nevertheless, fewer fibers generally adhered to the graphite surface. Unlike mica, which is negatively charged when in contact with water (therefore attracting the positively charged fibers), graphite is neutral and hydrophobic. It is therefore likely that only van der Waals forces act between fibers and substrate in this case.

Figure 2-25. AFM scan of the fibers formed by the peptide KFE8 in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 80 min of self-assembly time, after which it was deposited onto graphite. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. Filament diameter is approximately 7.1 nm.

Fibers deposited onto graphite after long self-assembly times showed a marked tendency to align into bands of parallel filaments (Figure 2-25). This situation allowed more accurate measurement of fiber diameter, directly from the AFM image: by measuring the width of an entire band and dividing by the number of filaments, one can minimize the overestimation due to finite tip size that arises when measuring isolated fibers. The fiber diameter calculated in this way was approximately 7.1 nm.
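The band-averaging idea above can be illustrated with a short sketch. It assumes a deliberately simple model in which filaments in a band touch one another, so that the apparent widening caused by the finite tip appears only once per band (at its two outer edges) rather than once per filament. The true diameter and tip broadening used below are hypothetical values chosen for illustration, not measurements from this work.

```python
def diameter_from_band(band_width_nm: float, n_filaments: int) -> float:
    """Mean filament diameter from the width of a band of parallel filaments."""
    if n_filaments < 1:
        raise ValueError("need at least one filament")
    return band_width_nm / n_filaments


# Hypothetical values, for illustration only (not measured data).
TRUE_DIAMETER_NM = 7.0   # assumed true filament diameter
TIP_BROADENING_NM = 4.0  # assumed total widening added at the two outer edges


def apparent_width(n_filaments: int) -> float:
    """Apparent AFM width of a band: the tip broadening is counted once per band."""
    return n_filaments * TRUE_DIAMETER_NM + TIP_BROADENING_NM


single = diameter_from_band(apparent_width(1), 1)   # isolated fiber
band = diameter_from_band(apparent_width(10), 10)   # band of ten filaments
print(f"isolated fiber: {single:.1f} nm, band average: {band:.1f} nm")
```

Under this model the overestimate per filament falls as the broadening divided by the number of filaments in the band, which is why wider bands give a more reliable diameter.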

2.3.10 Effect of side chain variation

Peptide self-assembly involves different molecules (short polypeptide chains) coming together to form ordered aggregates. This process is similar to protein folding, where different segments of the same polypeptide chain come together to form a compact structure. Since the dominant force driving protein folding is the hydrophobic effect [16], it is very likely that this effect is also of central importance for the self-assembly of our peptides. To investigate this phenomenon, the phenylalanine side chains in KFE8 were replaced with tryptophan (Figure 2-26), which should result in a larger hydrophobic effect (2.9 kcal/mol, compared to 2.3 kcal/mol for phenylalanine [17]). The new molecule was named KWE8.

Figure 2-26. Molecular model of the amino acids phenylalanine (F) and tryptophan (W). Carbon is light blue, Nitrogen is blue, Oxygen is red and Hydrogen is white.

Figure 2-27. AFM scan of the fibers formed by the peptide KWE8 in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 30 min of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

Figures 2-27 and 2-28 show the structures formed by KWE8 in the early stages of self-assembly: this molecule did not form helical ribbons; instead, small tapes approximately 8 nm wide and 2 nm thick seemed to be the only structures present. The ends of those tapes also showed a tendency to fold back into circular structures.

Figure 2-28. AFM scan of the fibers formed by the peptide KWE8 in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 30 min of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. The small tapes are approximately 8 nm wide and 2 nm high.

It is interesting to compare Figures 2-28 and 2-3: after approximately the same self-assembly time, KFE8 forms a small population of very long fibers, while KWE8 forms a large number of short fibers.

Figure 2-29. AFM scan of the fibers formed by the peptide KWE8 in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 3 days of self-assembly time, after which it was diluted 10 times before deposition on mica. This sample did not require rinsing and was dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. The small tapes are approximately 8 nm wide and 1 nm high.

After days in solution at room temperature, some of the tapes reached lengths on the order of microns; such long tapes also showed a tendency to align into bands of parallel filaments. The small circular structures did not disappear at long times (Figures 2-29 and 2-30).

Figure 2-30. AFM scan of the fibers formed by the peptide KWE8 in a 1 mg/mL aqueous solution. The sample was allowed 10 days of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water adjusted to pH 3 and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. The small tapes are approximately 8 nm wide and 2.5 nm high.

2.3.11 Variation in molecular pattern

The pattern of charged side chains in the sequence of self-assembling peptides is thought to promote the formation of a stable matrix [13,18]. The effect of a simple variation in this pattern was investigated using a molecule (KFE8-II) in which each charge was repeated twice with respect to KFE8 (++--++-- instead of +-+-); such variation also implies doubling of chain length. Figure 2-31 shows the striking difference in self-assembling behavior of this molecule with respect to the others studied: KFE8-II does not form tapes or ribbons; instead, small circular structures are formed.
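The charge pattern of a sequence can be read off directly from its lysine (K) and glutamate (E) residues. The short sketch below does this for the KFE8-II sequence given in the caption of Figure 2-31. The KFE8 sequence used for comparison (FKFEFKFE) is an assumption here, since it is not written out in this section, and the signs are nominal side-chain charges that ignore the actual protonation state at a given pH.

```python
# Nominal charge sign of each ionizable side chain; phenylalanine (F) is
# uncharged and is skipped. Protonation state at a given pH is ignored.
CHARGE_SIGN = {"K": "+", "E": "-"}


def charge_pattern(sequence: str) -> str:
    """Return the string of nominal charge signs along a peptide sequence."""
    return "".join(CHARGE_SIGN[res] for res in sequence if res in CHARGE_SIGN)


kfe8_ii = "FKFKFEFEFKFKFEFE"  # sequence from the caption of Figure 2-31
kfe8 = "FKFEFKFE"             # assumed KFE8 sequence (not stated in this section)

print("KFE8   :", charge_pattern(kfe8))
print("KFE8-II:", charge_pattern(kfe8_ii))
print("length ratio:", len(kfe8_ii) / len(kfe8))
```

With these inputs, each charge of the shorter pattern appears twice in a row in the longer one, and the chain length doubles, as described above.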

Figure 2-31. AFM scan of the fibers formed by the peptide of sequence FKFKFEFEFKFKFEFE in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 10 min of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm. The diameter of the small rings is approximately 15 nm and their height is approximately 0.5 nm.

Such circular structures appeared to align into filament-like arrays at later stages (Figure 2-32). It is interesting to note that similar circular structures have been observed in the aggregation of γ-crystallin [19] (the proteins whose aggregation in the eye leads to cataract) and α-synuclein [20] (associated with Parkinson's disease).

Figure 2-32. AFM scan of the fibers formed by the peptide of sequence FKFKFEFEFKFKFEFE in a 1 mg/mL aqueous solution at room temperature. The sample was allowed 1 h of self-assembly time, after which it was deposited on a mica substrate. The surface was then rinsed with ultra-filtered water and dried in air. The brightness of features increases as a function of height and the scale bar is 100 nm.

2.4 Discussion

The fundamental goal of this experimental section was the elucidation of the molecular basis for peptide self-assembly. The experiments described above were performed with the intention of finding a relation between molecular structure and self-assembling behavior. The main finding was that short polypeptide chains can self-organize into diverse and coexisting structures, depending on time and assembly conditions. Moreover, a small change in the molecules resulted in markedly different supramolecular structures.

2.4.1 Structures formed by KFE8

The most interesting result from the investigation of the KFE8 molecule is the complexity of the path followed to produce mature fibers. The prevalence of helical ribbons in the early stages of self-assembly and their subsequent disappearance suggest they are intermediates in the formation of tubular fibers. The techniques used to characterize them did not require harsh treatments: other than sample deposition onto a substrate, AFM requires almost no preparation and QFDE is supposed to preserve the native structure of biomolecules in solution. Since identical structures were observed with both methods, we believe that such helical ribbons are formed in solution and that peptide-substrate interactions do not drastically alter their structure. If helical ribbons are indeed the precursors of mature fibers, they are likely to influence the outcome of the whole self-assembly process: for this reason much work was dedicated to elucidating their molecular structure.

Our CD measurements indicated that KFE8 molecules organize in solution into β-sheets. A polypeptide chain is said to be in β-strand conformation [21] when each side chain is rotated approximately 180° from the previous one (see for example the KFE8 molecule in Figure 1-2). β-strands can then form hydrogen bonds sideways with other polypeptide chains in the same conformation, giving rise to a structure called a β-sheet. A β-sheet is therefore a tape-like structure held together by hydrogen bonds, in which the side chains point outwards from the two sides of the tape. For reasons that are not completely understood, whenever a polypeptide chain assumes a β-strand conformation, it also assumes a right-handed twist along its backbone [22]. Each molecule in the β-sheet tape can therefore be pictured as a little brick characterized by a right-handed twist along its main axis. Pairing of the bricks to form a β-sheet would therefore result in an overall left-handed twist of this structure (Figure 2-33, left).

Figure 2-33. Model for the packing of molecules in a helical ribbon. Each KFE8 peptide is represented as an amphiphilic brick (orange denotes hydrophilic side chains and green denotes hydrophobic ones). Each brick possesses a right-handed twist along its main axis, which leads to an overall left-handed twist of the formed β-sheet tape. Bending of the twisted tape would result in a left-handed helical ribbon.

A left-handed helical ribbon might be produced by bending of the β-sheet tape, driven for example by the hydrophobic effect associated with the phenyl side chains. These considerations formed the basis for the first model of molecular arrangement within these helical ribbons. This simple model was the starting point for later refinements, as described in the following section.

2.4.2 Molecular dynamics simulations for KFE8

In order to investigate the detailed molecular architecture of the helical ribbons formed by KFE8 we pursued the avenue of molecular dynamics simulations. All possible molecular packing alternatives compatible with the dimensions of these ribbons were explored and the most stable was selected as the most likely molecular structure. The results shown in this section were obtained by postdoctoral associate Wonmuk Hwang and are included here to give the reader a complete picture.

Preliminary simulations showed that the parallel β-sheet conformation is less stable than the antiparallel one, in agreement with our CD results. Thus we focused on antiparallel β-sheets which, due to the asymmetric distribution of backbone hydrogens and oxygens of KFE8, limit the number of hydrogen bonding patterns with adjacent molecules to two on each side [23]. Therefore, a total of four different antiparallel β-sheet conformations is possible. Using all combinations of these four types of sheets, we constructed single and double helical ribbons of 20 nm pitch and 7 nm diameter. Molecular dynamics simulations were then performed on these structures to test their stability in water by checking the distortion of their geometry and the preservation of β-sheet content. Single-sheet helices were found to be unstable, collapsing immediately, because the hydrophobic side of the β-sheet was still exposed to the solvent. Double helical β-sheets, with the hydrophobic side chains sandwiched between the two layers (Figure 2-34), were found to be the most stable structures. Interestingly, this model implies that the number of molecules per helical turn in the inner sheet is less than in the outer one: a case of symmetry breaking. Another interesting consideration concerns the phenyl side chains, which are sandwiched in between the two layers. Quantum chemistry calculations [24] showed that phenyl rings in close proximity assume specific mutual orientations (sandwich, sheared sandwich or T) to minimize their interaction energy. Such interaction is characterized by approximately half the energy of a hydrogen bond. Even though this effect could not be captured by the simulations performed so far, we believe it might be important in the formation and stability of these helical ribbons [25,26].
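The difference in the number of molecules per turn between the two layers follows from simple helix geometry: at the same pitch, a helix of smaller radius has a shorter path per turn. The sketch below is a back-of-the-envelope illustration only, not the molecular dynamics result of [23]. It uses the 20 nm pitch and 7 nm diameter quoted above, and it assumes (these values are not from the text) that the 7 nm diameter refers to the outer layer, that the two layers are separated radially by about 1 nm, and that adjacent β-strands are spaced about 0.48 nm apart along the tape.

```python
import math


def strands_per_turn(radius_nm: float, pitch_nm: float,
                     strand_spacing_nm: float = 0.48) -> float:
    """Approximate number of strands along one turn of a helical tape.

    The path length of one helical turn is sqrt((2*pi*r)^2 + pitch^2);
    dividing by the inter-strand spacing gives a strand count.
    The default spacing of 0.48 nm is an assumed typical value.
    """
    path_nm = math.hypot(2.0 * math.pi * radius_nm, pitch_nm)
    return path_nm / strand_spacing_nm


PITCH_NM = 20.0         # helical pitch quoted in the text
OUTER_RADIUS_NM = 3.5   # half of the 7 nm diameter (assumed to be the outer layer)
LAYER_GAP_NM = 1.0      # assumed radial separation between the two sheets

inner = strands_per_turn(OUTER_RADIUS_NM - LAYER_GAP_NM, PITCH_NM)
outer = strands_per_turn(OUTER_RADIUS_NM, PITCH_NM)
print(f"inner layer: ~{inner:.0f} strands per turn")
print(f"outer layer: ~{outer:.0f} strands per turn")
```

Whatever the exact values of the assumed parameters, the inner layer accommodates fewer strands per turn than the outer one as long as both share the same pitch, which is the asymmetry noted above.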

These simulations allowed us to propose a very detailed model of molecular packing for helical ribbons. Nevertheless, this model does not yield any information about the kinetics of fiber formation. In particular, the two layers cannot possibly form independently and join later: they have to grow from a few monomers. We will expand on the kinetics of fiber formation in the next chapter.

Figure 2-34. Proposed molecular structure of a helical ribbon, as obtained from molecular dynamics simulations [23]. According to this model, each helical ribbon is comprised of two left-handed helical layers: an inner helix and an outer helix. Both helical structures are β-sheet tapes. Panel labels: inner helix, outer helix, double helix, AFM image, axis view.

2.4.3 Transition from ribbons to tubules

Helical ribbons disappear during self-assembly of KFE8 and the final stages are characterized only by tubular fibers. The nature of this transition is not yet clear, but a few possible explanations can be offered. The first possibility is that helical ribbons are intermediates in the sense that they are first created and then dismantled, while tubular fibers are a more stable state of molecular packing. A second possibility is that helical ribbons collapse to a more tightly wound state, resulting in tubular fibers. This hypothesis is corroborated by the decreased helical pitch of tubular fibers observed in Figure 2-19. Interestingly, a similar phenomenon has been observed at a larger length scale in lipids [27] and chlorophenols [28]. A third possible explanation is that a helical geometry may facilitate the addition of monomers in the gap between its coils, resulting in a supercoiled geometry. The observed tighter pitch in the images mentioned above would then be due to the presence of two coils in the same structure.

Images from self-assembly in ethanol, which disrupts the structure of the fibers, showed two small tape-like structures emerging from each tubular fiber, possibly in agreement with this hypothesis. The height of such small tapes could not be determined with sufficient precision to determine whether they are single- or double-layers. Much work remains to be done to elucidate this phenomenon. As a final comment, it is interesting to note how helical ribbons produced by the KFE8 molecule synthesized commercially reach a maximum length of approximately 100 nm, while the ones from the sample synthesized by the author can be much longer; at present, we do not have a convincing explanation for this phenomenon.

2.4.4 Concentration and temperature dependence

KFE8 molecules appear to behave very differently depending on concentration. Linear polymers are still formed at 0.01 mg/mL, but they are characterized by much simpler geometry (Figures 2-12 and 2-13) with respect to the structures formed at higher concentration. The orientation of these fibers on the substrate does not appear to be random in this case: three directions, at 120° from one another, seem to be preferred. Moreover, the dimensions of the fibers are compatible with tapes a single molecule in thickness. Similar peptides have been shown to form β-sheet tapes a single molecule in thickness [29], but this would be surprising for KFE8, as one side of such a β-sheet tape would be completely hydrophobic and could not form in an aqueous environment. For these reasons, and unlike our conclusion regarding helical ribbons, we cannot exclude the role of the substrate in the formation of such tapes. In particular, the mica substrate might help hold the molecules in a fixed conformation, thus favoring the subsequent addition of other monomers to form β-sheets. The atomic lattice of mica may also favor their growth along preferred directions. The growth of β-sheet tapes on a two-dimensional substrate has also been observed for the amyloid β-peptide [30].

Self-assembly slows down at low temperature. Most likely, this is due to the temperature dependence of the hydrophobic effect: this is a very complex phenomenon, but it can be described qualitatively as follows. The energetic cost of dissolving a non-polar molecule in water at room temperature is due to ordering of water molecules around the solute (an entropic cost) [16]. The tendency of water molecules to form a strong hydrogen-bond network forces non-polar solutes, which cannot participate, to be secluded and packed together, resulting in an apparent attraction force. When the temperature decreases, water molecules explore a lower number of conformations, resulting in a decreased ability to seclude non-polar solutes. The force driving hydrophobic moieties together is therefore reduced and self-assembly is slower.

2.4.5 Molecular structure variation

The hydrophobic effect is the main driving force for protein folding (and appears to be also responsible for the uniqueness of the structure encoded by a given amino acid sequence) [16]. As explained earlier, the same effect is likely to be responsible for bringing peptides together to form the fibers investigated here. The extent to which hydrophobic side chains influence supramolecular structure was investigated by substituting phenylalanine with tryptophan in KFE8, which resulted in very different fibrillar structures: in particular, KWE8 did not form helical ribbons. We do not have an explanation for this phenomenon, except that perhaps specific interactions between phenyl rings (discussed earlier) would be very different for tryptophan. The kinetics of fiber formation also appeared to be quite different: if one compares Figures 2-2 and 2-28, acquired after approximately the same assembly time in the same conditions, it is apparent that KWE8 forms a collection of many short fibers, while KFE8 forms fewer and longer fibers. We will return to this result in the next chapter, after analyzing the kinetics of fiber formation in a quantitative framework.

Changing the molecular pattern of KFE8 (Figure 2-31) resulted in dramatically different structures: KFE8-II formed rings instead of linear fibers. We have not conducted a rigorous analysis of these structures and cannot offer a possible molecular model. Nonetheless, one may note that it is increasingly difficult to form a β-sheet when strands are longer than 7-8 amino acids [31], so fiber formation is hindered.

2.4.6 Network formation

The growth of β-sheet fibers in solution leads to their entanglement and to the formation of a network; understanding this process is essential for the design of biomaterials. AFM is unfortunately not the most suitable tool for investigating this aspect of the process: images can be captured only after deposition of a three-dimensional structure onto a two-dimensional substrate, which may result in fiber breakage and network disruption. Nevertheless, an interesting aspect of network formation could be characterized: fibers (from both KFE8 and KWE8) tend to align into bands of parallel filaments. Such bands, and not isolated filaments, appear to be the network's building blocks. It is interesting to note that such fibers are positively charged: recent theoretical results on band formation from like-charged rods [32] might help interpret this finding.

2.4.7 Considerations from other areas of research

One of the most interesting features of β-sheet peptides is their ability to self-organize: starting from a homogeneous solution, these molecules assemble into long fibers that form a network. This behavior is not restricted to the class of peptides investigated in this work: fiber formation is centrally important to biology, both in physiological processes (e.g. actin and tubulin polymerization) and pathological ones (e.g. Alzheimer's disease). The mechanism by which certain amino acid sequences favor β-sheet fibrillar aggregation is not clearly understood.

Recent findings show that even typically globular, soluble proteins can be induced to form fibers under appropriate conditions [33]. This phenomenon was often overlooked by researchers in molecular biology, who were simply frustrated by the aggregation of their samples and avoided such conditions. It was only in the 1960s that this phenomenon started to be investigated per se [34-36]. In recent years it was also recognized that amyloid deposits are a hallmark of many diseases other than Alzheimer's (e.g. Creutzfeldt-Jakob disease, Parkinson's disease and type II diabetes).

In the case of Alzheimer's disease a large body of evidence [37-40] points to a 42 amino acid peptide (called amyloid β-peptide, or Aβ1-42) as the pathogenic agent. This peptide self-assembles into toxic fibers that accumulate in the brain. The reason why such fibers are toxic is not completely clear. According to Ingram [40], amyloid β-sheet fibers bind to membrane channels (neurotransmitter-gated ion channels called AMPA) that regulate calcium influx in neurons. The Aβ1-42 peptides, only in aggregated form, are able to open these channels permanently, leading to a continuous influx of calcium that disrupts cellular machinery and eventually leads to neuronal death. A second hypothesis has been advanced by Lashuel et al. [20], who found that the early stages of peptide aggregation are characterized by pore-shaped structures which may fatally puncture the membrane of neurons. Therefore, it is possible that the amyloid plaques found in the brain of Alzheimer's patients may be a stable and less toxic state of the Aβ peptide.

From a medical perspective it is important to characterize the physical and chemical conditions promoting fiber formation. Regardless of whether the final fibers are the toxic agent or the inert species, knowledge of the process will help in devising therapies that minimize dangerous conditions.

From an engineering perspective, the ability of molecules to form well-organized structures at the nanometer scale represents a compelling route to the fabrication of nanostructured materials and devices [41]. For the peptides investigated in this work, applications would most likely arise at the interface between biology and engineering. For example, Reches and Gazit [1] have recently harnessed the self-organizing capacity of certain β-sheet peptides to produce metal nanowires. Stupp et al. showed that amphiphilic peptides can direct the growth of crystals and organize the formation of bone-like tissue [42-45]. In order to fully exploit the potential of these molecules it is important to characterize the structures produced by self-assembly and to relate them to properties of the building blocks and to self-assembly conditions.

The determinants and mechanism of β-sheet formation are not clearly understood. The prevailing school of thought describes this phenomenon as being driven by hydrogen bonding between the atoms in the backbone of a protein [46,47]; any polypeptide chain would then form β-sheet fibers unless it is kept in a carefully controlled environment that prevents its unfolding from the globular state. Other researchers assign a key role to side chains containing phenyl rings, such as phenylalanine and tryptophan [48]; according to this view, the stacking of aromatic rings plays a key role in the process of intermolecular recognition and self-assembly leading to the formation of fibers. In this regard, it is interesting to note that the hydrophobic side chains of the peptide KFE8 investigated in this research are all phenyl rings.

Despite the lack of a detailed understanding of β-sheet fiber formation, a recurrent theme found in the literature is the concept of nucleation and growth on a folded template [49-62] (Figure 2-35). According to this model, once the first nucleus of a β-sheet is formed, it acts as a template upon which other monomers can easily add. Therefore, the rate-limiting step of the process would be the nucleation of the first β-sheet segment. Subsequent growth of the fiber by monomer addition would be much faster. Fiber formation is therefore characterized in this model by two kinetic parameters: the nucleation rate and the growth rate. The former is central to this process: if nucleation of β-sheet seeds is a rare event, compared to monomer addition, the final configuration will comprise few, very long fibers; otherwise, if the two rates are comparable, a large population of short fibers would be produced. As a consequence, terms like "promotion" or "prevention" of fiber formation are misleading, as fibers could be very few in number but very long: the relevant concept is fiber length distribution [51]. This qualitative model will form the basis of a more quantitative understanding of β-sheet formation, which will be developed in the next chapter.

Figure 2-35. Possible mechanism of amyloid fibril formation. A partially unfolded form of a protein self-associates through β-sheet domains (zig-zag pattern) to initiate fibril formation. This intermediate provides the template for further deposition of protein and for the development of the stable, mainly β-sheet, core structure of the fibril. The undefined regions in the final structure represent the possibility that not all of the primary chain is involved in the cross-β structure (from reference 41).

2.5 Suggested directions for future research

The most important result from the experiments reported above is probably the characterization of the intermediate structures (helical ribbons) in the self-assembly of KFE8. The fundamental step that led to this observation was the decision to investigate a confined aspect of the whole process (single fiber formation) in the simplest possible conditions: unlike previous experiments, where ionic strength was always used to induce matrix formation, solutions were prepared here without addition of electrolytes. Using the same reasoning, i.e. confining one's attention to a subset of the whole problem and investigating in a clean setting, some directions for future research can be suggested.

The structure of fibers self-assembled from β-sheet peptides is likely to be very complex: in the case of KFE8 we have seen that they might result from collapsing helical ribbons or supercoiling of tapes. On the other hand, we have also seen that at low concentration (Figures 2-11, 2-12, 2-20) the structure of such fibers seems to be much simpler. In particular, it seemed possible to produce tapes a single molecule in thickness. Even though such tapes might be the result of interaction with the substrate (the mica lattice may act as a catalyst in their nucleation), this setting is nevertheless a very clean setup to study the nucleation and growth of molecular tapes. Due to the simplicity of these structures, their molecular architecture would also be easier to deduce. One could investigate, for example, the effect of contact time (of the peptide solution on mica) on tape length distribution. Repeated deposition of peptide solution onto the same mica substrate, each followed by AFM imaging and analysis of fiber length distribution, would elucidate whether tapes are indeed nucleated and grown on the substrate, rather than in solution.

This setting would allow one to study the effect of molecular features, such as specific side chains or overall length, on nucleation and elongation rates. For example: what features cause tapes to nucleate? Is it possible to stop their growth? What is the influence of the electrolyte concentration on nucleation rate? One could also study the effect of nucleation sites (pre-deposited impurities) on the location of such tapes. The effect of an imposed electric field during self-assembly would also be very interesting to investigate. Controlling the nucleation, growth and spatial distribution of molecular tapes on a surface would be of enormous interest to the microelectronics industry, especially because metal nanocrystals can be bound to peptide molecules [63]. Such a study would also take advantage of results already obtained from self-assembly of simpler molecules in two dimensions [64].

Network formation is another very important aspect of this research. Understanding how fibers intertwine to form the bulk material is of central importance for developing tissue engineering applications. A different set of investigation techniques is required for this purpose.

Attaching fluorescent moieties to peptides, for example, will allow visualization of the growing network. Another possibility is to introduce Brownian particles in the self-assembling solution and use them as probes (by monitoring their thermal fluctuations) of the geometrical and mechanical properties of such networks.

2.6 Conclusion

Unlike colloids, which coalesce into amorphous (fractal) aggregates upon addition of electrolytes [65], self-complementary peptides tend to form linear structures. As demonstrated above, this self-assembly process can be complex (as in the case of KFE8), but generally leads to the formation of fibers held together by non-covalent bonds. Fiber formation is central to a wide variety of biological processes, which usually involve complex protein monomers (e.g. assembly of microtubules from tubulin). Self-assembling peptides, on the other hand, are very short polypeptide chains and yet can form similar fibers and networks. Due to their simplicity, elucidating the basis of their behavior should be within the reach of current experimental and theoretical tools. We believe these molecules are excellent model systems for biological self-assembly and that understanding the molecular basis of their behavior is a reasonable objective. The experiments described above were the first step towards this goal.

This chapter presented the main findings of investigations aimed at elucidating the molecular architecture of fibers self-assembled from the molecules KFE8, KFE8-II and KWE8 in various conditions. Results revealed that fiber self-assembly is a complex process, in which different structures coexist at the same time and mature fibers are usually produced through intermediate steps. Such findings were recast in the general framework of β-sheet fiber formation, which is of central importance to a wide variety of pathological processes. β-sheet formation is described in the literature as a nucleation-and-growth process, where the first nucleus acts as a template for further deposition of monomers, resulting in fiber elongation. In order to derive predictive tools for the design of biomaterials, in the next chapter we will recast this qualitative model in a quantitative framework.


References for Chapter 2

[1] Reches, M. and Gazit, E. Casting metal nanowires within discrete self-assembled peptide nanotubes. Science 300, 625-627 (2003).

[2] Whaley, S. R., English, D. S., Hu, E. L., Barbara, P. F. and Belcher, A. M. Selection of peptides with semiconductor binding specificity for directed nanocrystal assembly. Nature 405, 665 (2000).

[3] Nowak, A. P., Breedveld, V., Pakstis, L., Ozbas, B., Pine, D. J., Pochan, D. and Deming, T. J. Rapidly recovering hydrogel scaffolds from self-assembling diblock copolypeptide amphiphiles. Nature 417, 424 (2002).

[4] Balbirnie, M., Grothe, R. and Eisenberg, D. S. An amyloid-forming peptide from the yeast prion Sup35 reveals a dehydrated β-sheet structure for amyloid. Proc. Natl. Acad. Sci. USA 98, 2375 (2001).

[5] Caplan, M. R. PhD Thesis, Massachusetts Institute of Technology (2001).

[6] Holmes, T. C., de Lacalle, S., Su, X., Liu, G., Rich, A. and Zhang, S. Extensive neurite outgrowth and active synapse formation on self-assembling peptide scaffolds. Proc. Natl. Acad. Sci. USA 97, 6728 (2000).

[7] Caplan, M. R., Moore, P. N., Zhang, S., Kamm, R. D. and Lauffenburger, D. A. Self-assembly of a β-sheet protein is governed by relief of electrostatic repulsion relative to van der Waals attraction. Biomacromolecules 1, 627 (2000).

[8] Kisiday, J., Jin, M., Kurz, B., Hung, H., Semino, C., Zhang, S. and Grodzinsky, A. J. Self-assembling peptide hydrogel fosters chondrocyte extracellular matrix production and cell division: implications for cartilage tissue repair. Proc. Natl. Acad. Sci. USA 99, 9996 (2002).

[9] Marini, D. M., Hwang, W., Lauffenburger, D. A., Zhang, S. and Kamm, R. D. Left-handed helical ribbon intermediates in the self-assembly of a β-sheet peptide. Nano Letters 2, 295-299 (2002).

[10] Chamberlain, A. K., MacPhee, C. E., Zurdo, J., Morozova-Roche, L. A., Hill, O., Dobson, C. M. and Davis, J. J. Ultrastructural organization of amyloid fibrils by atomic force microscopy. Biophys. J. 79, 3282 (2000).

[11] Greenfield, N. and Fasman, G. D. Computed circular dichroism spectra for the evaluation of protein conformation. Biochemistry 8, 4108 (1969).

[12] Sarkar, P. K. and Doty, P. The optical rotatory properties of the β-configuration in polypeptides and proteins. Proc. Natl. Acad. Sci. USA 55, 981 (1966).

[13] Zhang, S., Holmes, T., Lockshin, C. and Rich, A. Spontaneous assembly of a self-complementary oligopeptide to form a stable macroscopic membrane. Proc. Natl. Acad. Sci. USA 90, 3334 (1993).

[14] Bohm, G., Muhr, R. and Jaenicke, R. Quantitative analysis of protein far UV circular dichroism spectra by neural networks. Protein Eng. 5, 191-195 (1992).

[15] Robertson, W. H., Diken, E. G. and Johnson, M. A. Snapshots of water at work. Science 301, 320 (2003).

[16] Dill, K. A. Dominant forces in protein folding. Biochemistry 29, 7133 (1990).

[17] Karplus, P. A. Hydrophobicity regained. Protein Science 6, 1302 (1997).

[18] Xiong, H., Buckwalter, B. L., Shieh, H.-M. and Hecht, M. H. Periodicity of polar and nonpolar amino acids is the major determinant of secondary structure in self-assembling oligomeric peptides. Proc. Natl. Acad. Sci. USA 92, 6349 (1995).

[19] Kosinski-Collins, M. S. and King, J. A. In vitro unfolding, refolding and polymerization of human γD crystallin, a protein involved in cataract formation. Protein Science 12, 480 (2003).

[20] Lashuel, H. A., Hartley, D., Petre, B. M., Walz, T. and Lansbury, P. T. Jr. Amyloid pores from pathogenic mutations. Nature 418, 291 (2002).

[21] Pauling, L. and Corey, R. B. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37, 729 (1951).

[22] Shamovsky, I. L., Ross, G. M. and Riopelle, R. J. Theoretical studies on the origin of β-sheet twisting. J. Phys. Chem. B 104, 11296 (2000).

[23] Hwang, W., Marini, D. M., Zhang, S. and Kamm, R. D. Supramolecular structure of helical ribbons self-assembled from a β-sheet peptide. J. Chem. Phys. 118, 389 (2003).

[24] Final project for the course "Computational quantum mechanics of molecules and extended systems" (Prof. B. Trout), taken by the author at MIT. Results were in line with similar research found in the literature.

[25] Aravinda, S., Shamala, N., Das, C., Sriranjini, A., Karle, I. L. and Balaram, P. Aromatic-aromatic interactions in crystal structures of helical peptide scaffolds containing projecting phenylalanine residues. J. Am. Chem. Soc. 125, 5308 (2003).

[26] Chelli, R., Gervasio, F. L., Procacci, P. and Schettino, V. Stacking and T-shape competition in aromatic-aromatic amino acid interactions. J. Am. Chem. Soc. 124, 6133 (2002).

[27] Schnur, J. M., Ratna, B. R., Selinger, J. V., Singh, A., Jyothi, G. and Easwaran, K. R. K. Diacetylenic lipid tubules: experimental evidence for a chiral molecular architecture. Science 264, 945-947 (1994).

[28] Rogalska, E., Rogalski, M., Gulik-Krzywicki, T., Gulik, A. and Chipot, C. Self-assembly of chlorophenols in water. Proc. Natl. Acad. Sci. USA 96, 6577-6580 (1999).

[29] Aggeli, A., Bell, M., Boden, N., Keen, J. N., Knowles, P. F., McLeish, T. C. B., Pitkeathly, M. and Radford, S. E. Responsive gels formed by the spontaneous self-assembly of peptides into polymeric β-sheet tapes. Nature 386, 259 (1997).

[30] Kowalewski, T. and Holtzman, D. M. In situ atomic force microscopy study of Alzheimer's β-amyloid peptide on different substrates: new insights into mechanism of β-sheet formation. Proc. Natl. Acad. Sci. USA 96, 3688 (1999).

[31] Stanger, H. E., Syud, F. A., Espinosa, J. F., Giriat, I., Muir, T. and Gellman, S. H. Length-dependent stability and strand length limits in antiparallel β-sheet secondary structure. Proc. Natl. Acad. Sci. USA 98, 12015 (2001).

[32] Ha, B. Y. and Liu, A. J. Counterion-mediated attraction between two like-charged rods. Phys. Rev. Lett. 79, 1289 (1997).

[33] Fandrich, M., Fletcher, M. A. and Dobson, C. M. Amyloid fibrils from muscle myoglobin. Nature 410, 165-166 (2001).

[34] Davidson, B. and Fasman, G. D. The conformational transitions of uncharged poly-L-lysine. α-Helix-random coil-β structure. Biochemistry 6, 1616 (1967).

[35] Wooley, S. Y. C. and Holzwarth, G. Intramolecular β-pleated-sheet formation by poly-L-lysine in solution. Biochemistry 9, 3604 (1970).

[36] Hartman, R., Schwaner, R. C. and Hermans, J. Beta poly(L-lysine): a model system for biological self-assembly. J. Mol. Biol. 90, 415 (1974).

[37] Lansbury, P. T. Evolution of amyloid: what normal protein folding may tell us about fibrillogenesis and disease. Proc. Natl. Acad. Sci. USA 96, 3342 (1999).

[38] Koo, E. H., Lansbury, P. T. Jr. and Kelly, J. W. Amyloid diseases: abnormal protein aggregation in neurodegeneration. Proc. Natl. Acad. Sci. USA 96, 9989 (1999).

[39] Antzutkin, O. N., Balbach, J. J., Leapman, R. D., Rizzo, N. W., Reed, J. and Tycko, R. Multiple quantum solid-state NMR indicates a parallel, not antiparallel, organization of β-sheets in Alzheimer's β-amyloid fibrils. Proc. Natl. Acad. Sci. USA 97, 13045 (2000).

[40] Ingram, V. Alzheimer's disease. American Scientist 91, 312 (2003).

[41] Singh, A., Wong, E. M. and Schnur, J. M. Toward the rational control of nanoscale structures using chiral self-assembly: diacetylenic phosphocholines. Langmuir 19, 1888-1898 (2003).

[42] Hartgerink, J. D., Beniash, E. and Stupp, S. I. Self-assembly and mineralization of peptide-amphiphile nanofibers. Science 294, 1684 (2001).

[43] Stupp, S. I. and Braun, P. V. Molecular manipulation of microstructures: biomaterials, ceramics, and semiconductors. Science 277, 1242 (1997).

[44] Zubarev, E. R., Pralle, M. U., Li, L. and Stupp, S. I. Conversion of supramolecular clusters to macromolecular objects. Science 283, 523 (1999).

[45] Stupp, S. I., LeBonheur, V., Walker, K., Li, L. S., Huggins, K. E., Keser, M. and Amstutz, A. Supramolecular materials: self-organized nanostructures. Science 276, 384 (1997).

[46] Branden, C. and Tooze, J. Introduction to Protein Structure. Garland Publishing, Inc., New York, 1991.

[47] Dobson, C. M. The structural basis of protein folding and its links with human disease. Phil. Trans. R. Soc. Lond. B 356, 133-145 (2001).

[48] Gazit, E. A possible role for π-stacking in the self-assembly of amyloid fibrils. FASEB J. 16, 77-83 (2002).

[49] Serio, T. R., Cashikar, A. G., Kowal, A. S., Sawicki, G. J., Moslehi, J. J., Serpell, L., Arnsdorf, M. F. and Lindquist, S. L. Conversion and the replication of conformational information by a prion determinant. Science 289, 1317 (2000).

[50] Lomakin, A., Chung, D. S., Benedek, G. B., Kirschner, D. A. and Teplow, D. B. On the nucleation and growth of amyloid β-protein fibrils: detection of nuclei and quantitation of rate constants. Proc. Natl. Acad. Sci. USA 93, 1125 (1996).

[51] Lomakin, A., Teplow, D. B., Kirschner, D. A. and Benedek, G. B. Kinetic theory of fibrillogenesis of amyloid β-protein. Proc. Natl. Acad. Sci. USA 94, 7942 (1997).

[52] Chiti, F., Webster, P., Taddei, N., Clark, A., Stefani, M., Ramponi, G. and Dobson, C. M. Designing conditions for in-vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. USA 96, 3590 (1999).

[53] Kelly, J. W. The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Op. Struct. Bio. 8, 101 (1998).

[54] Betts, S. and King, J. A. A green light for protein folding. Nature Biotechnology 17, 637 (1999).

[55] Perutz, M. F., Pope, B. J., Owen, D., Wanker, E. E. and Scherzinger, E. Aggregation of proteins with expanded glutamine and alanine repeats of the glutamine-rich and asparagine-rich domains of Sup35 and of the amyloid β-peptide of amyloid plaques. Proc. Natl. Acad. Sci. USA 99, 5596 (2002).

[56] Serpell, L. C., Blake, C. C. F. and Fraser, P. E. Molecular structure of a fibrillar Alzheimer's Aβ fragment. Biochemistry 39, 13269 (2000).

[57] Harper, J. D., Wong, S. S., Lieber, C. M. and Lansbury, P. T. Jr. Observation of metastable Aβ amyloid protofibrils by atomic force microscopy. Chemistry and Biology 4, 119 (1997).

[58] Serpell, L. C., Berriman, J., Jakes, R., Goedert, M. and Crowther, R. A. Fiber diffraction of synthetic α-synuclein filaments shows amyloid-like cross-β conformation. Proc. Natl. Acad. Sci. USA 97, 4897 (2000).

[59] Ionescu-Zanetti, C., Khurana, R., Gillespie, J. R., Petrick, J. S., Trabachino, L. C., Minert, L. J., Carter, S. A. and Fink, A. L. Monitoring the assembly of Ig light-chain amyloid fibrils by atomic force microscopy. Proc. Natl. Acad. Sci. USA 96, 13175 (1999).

[60] Krejchi, M. T., Cooper, S. J., Deguchi, Y., Atkins, E. D. T., Fournier, M. J., Mason, T. L. and Tirrell, D. A. Crystal structures of chain-folded antiparallel beta-sheet assemblies from sequence-designed periodic polypeptides. Macromolecules 30, 5012 (1997).

[61] Yong, W., Lomakin, A., Kirkitadze, M. D., Teplow, D. B., Chen, S. H. and Benedek, G. B. Structure determination of micelle-like intermediates in amyloid beta-protein fibril assembly by using small angle neutron scattering. Proc. Natl. Acad. Sci. USA 99, 150 (2002).

[62] Kad, N. M., Myers, S. L., Smith, D. P., Smith, D. A., Radford, S. E. and Thomson, N. H. Hierarchical assembly of beta(2)-microglobulin amyloid in vitro revealed by atomic force microscopy. J. Mol. Bio. 330, 785 (2003).

[63] Hamad-Schifferli, K., Schwartz, J. J., Santos, A. T., Zhang, S. and Jacobson, J. M. Remote electronic control of DNA hybridization through inductive coupling to an attached metal nanocrystal antenna. Nature 415, 152 (2002).

[64] Weckesser, J., De Vita, A., Barth, J. V., Cai, C. and Kern, K. Mesoscopic correlation of supramolecular chirality in one-dimensional hydrogen-bonded assemblies. Phys. Rev. Lett. 87 (2001).

[65] Bibette, J., Mason, T. G., Gang, H. and Weitz, D. A. Kinetically induced ordering in gelation of emulsions. Phys. Rev. Lett. 69 (1992).

CHAPTER 3

A MEAN-FIELD DESCRIPTION OF FIBER SELF-ASSEMBLY

3.1 Introduction

The mechanism of β-sheet fiber formation is likely to be complex. As described in the previous chapter, a population of identical peptide molecules can self-organize into a variety of coexisting structures that evolve to form mature fibers. Fiber formation may therefore involve fibril self-association and supercoiling [1-3]. In the case of the Alzheimer's peptide, for example, experiments with radiotracers revealed that at least two stages are involved in fiber formation: a fully reversible phase, in which monomers can loosely associate with the fiber tip, followed by irreversible locking if the time of contact is sufficient [4]. Even the folding mechanism of a short polypeptide chain into a single β-sheet dimer (called a β-hairpin) is not completely understood and has been the subject of intense experimental [5,6] and theoretical [7-9] study. Given the complexity of β-sheet fiber formation, it is important to find principles that can (1) describe its most general features, (2) guide the interpretation of experimental results and (3) aid in the design of peptide-based biomaterials. This chapter was motivated by such considerations.

The concept of nucleation and growth is recurrent in the literature on amyloid [10-19]: according to this model, the first β-sheet nucleus (formed by coalescing monomers) acts as a template upon which subsequent peptide addition is facilitated, resulting in fiber elongation. Here we use this framework to develop a simple, quantitative model that captures the most general features of β-sheet fiber formation. Three main assumptions form the basis of our model:

a) nucleation of a fiber occurs when two monomers within interaction distance arrange themselves irreversibly into a β-sheet dimer. This event is characterized by a nucleation rate constant $K_n$;

b) the formed β-sheet nucleus acts as a template onto which other monomers can irreversibly bind, resulting in fiber elongation. This second phase is characterized by an elongation rate constant $K_e$. It is reasonable to assume that monomer addition to the template is faster than nucleation, therefore in general $K_e > K_n$;

c) fibers grow by monomer addition, not by coalescence.

In this chapter we describe peptide self-assembly using these assumptions and the mean-field approximation developed by Smoluchowski in order to derive a relation between the average fiber length and the fundamental rate constants of the process.

3.2 Smoluchowski coagulation theory

With the assumptions postulated above, the process of β-sheet fiber formation can be placed within the general framework of irreversible homopolymerizations: identical monomers diffusing in a medium coalesce over time to form larger aggregates. Such processes are controlled by two factors: the rate at which particles bind once they are in contact and the rate at which they diffuse throughout the medium. The first quantitative description of such phenomena was developed by Smoluchowski [20], who proposed the following coagulation equations:

$$\frac{dP(l,t)}{dt} = \frac{1}{2}\sum_{i+k=l} K_0(i,k)\,P(i,t)\,P(k,t) - P(l,t)\sum_{k=1}^{\infty} K_0(l,k)\,P(k,t),$$

where $P(l,t)$ is the population (number per unit volume) of particles of size $l$ at time $t$ and $K_0(l,k)$ is the overall rate constant (including both diffusion and reaction) at which $l$-mers bind with $k$-mers. The time rate of change in the number of polymers of length $l$ is due to coalescence of smaller polymers whose weights add to $l$, minus those polymers of length $l$ that coalesce with any other one. The factor 1/2 in front of the first term on the right-hand side accounts for double counting, except in the case of $i = k$, where it accounts for the probability of collision of identical objects of size $i$, which is proportional to:

$$\frac{1}{2}P(i,t)\,[P(i,t) - 1] \approx \frac{1}{2}P(i,t)^2.$$
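As an illustration of how the gain and loss terms of the coagulation equations combine, the following minimal Python sketch evaluates their right-hand side for a finite list of populations. The function name `smoluchowski_rhs`, the truncation at a maximum size, and the constant kernel used in the example are assumptions made for this sketch only; they are not part of the original analysis.

```python
def smoluchowski_rhs(P, kernel):
    """Right-hand side of the Smoluchowski equations, truncated at size len(P).

    P[l-1] is the population of l-mers; kernel(i, k) is the rate constant
    at which i-mers bind k-mers. Aggregates larger than len(P) are dropped,
    which is an approximation introduced by the truncation.
    """
    n = len(P)
    dPdt = [0.0] * n
    for l in range(1, n + 1):
        # Gain: all pairs (i, k) with i + k = l; the 1/2 removes double counting.
        gain = 0.5 * sum(kernel(i, l - i) * P[i - 1] * P[l - i - 1]
                         for i in range(1, l))
        # Loss: l-mers that bind any other aggregate.
        loss = P[l - 1] * sum(kernel(l, k) * P[k - 1] for k in range(1, n + 1))
        dPdt[l - 1] = gain - loss
    return dPdt


# Monomers only at t = 0, with a hypothetical constant kernel equal to 1.
initial = [1.0, 0.0, 0.0, 0.0]
rates = smoluchowski_rhs(initial, lambda i, k: 1.0)
print(rates)
```

With only monomers present, monomers disappear twice as fast as dimers appear, since each dimer consumes two monomers.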

For the vast majority of polymers (including linear polymers) it can be shown that the kinetics of the process described by the Smoluchowski equations are driven into the diffusion-controlled regime, regardless of the magnitude of the reaction or diffusion constants [21]. This happens because the diffusivity of polymers usually vanishes with molecular weight faster than the first inverse power of their gyration radius. In formal terms:

$$K_{\text{Diffusion}}(l,k) \propto (R_l + R_k)(D_l + D_k),$$

where $R$ is the gyration radius and $D$ the diffusion coefficient of particles of size $l$ and $k$. If $D$ falls with molecular weight faster than $1/R$, the diffusion rate constant will become smaller as time progresses and the process will necessarily become diffusion-controlled. The overall reaction constant, including diffusion and reaction, can be found using the Collins-Kimball relation [22]:

$$\frac{1}{K_0} = \frac{1}{K_{\text{Reaction}}} + \frac{1}{K_{\text{Diffusion}}}.$$
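The Collins-Kimball relation combines the two rate constants like resistances in parallel, so the smaller one dominates. A short Python sketch makes the two limiting regimes explicit; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
def overall_rate_constant(k_reaction, k_diffusion):
    """Combine reaction and diffusion rate constants: 1/K0 = 1/K_R + 1/K_D."""
    return 1.0 / (1.0 / k_reaction + 1.0 / k_diffusion)


# Hypothetical values in arbitrary but identical units.
print(overall_rate_constant(1.0, 1000.0))   # close to the reaction constant
print(overall_rate_constant(1000.0, 1.0))   # close to the diffusion constant
```

When diffusion is fast the overall constant approaches the reaction constant (reaction-controlled regime), and vice versa.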

Although valid for most aggregation processes, the above conclusion might not apply to the case of our peptides. Experimental evidence suggests that elongation of β-sheet fibrils occurs by monomer addition to existing fibers [23], not by fiber coalescence. If this is true, since the diffusivity of single monomers remains high during the whole process, we can safely assume that diffusion is not a limiting factor. For example, the time required for a KFE8 peptide to diffuse by 10 nm (the initial intermolecular distance at 1 mg/mL concentration) is approximately $10^{-7}$ s. Even assuming that at later stages the average molecular distance will increase by a factor of 10 (corresponding to a decrease in monomer concentration by a factor of 1,000), the diffusion time required to overcome intermolecular distances would still be $10^{-5}$ s: very small when compared to the observed time scale of fiber self-assembly.
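The scaling behind this estimate can be checked in a few lines. The sketch below assumes that diffusion time grows with the square of the distance and that number density falls with the cube of the mean spacing; the function names are hypothetical and the reference time is the order-of-magnitude value quoted above.

```python
def scaled_diffusion_time(t_ref, distance_factor):
    """Diffusion time after the distance grows by distance_factor (t ~ x**2)."""
    return t_ref * distance_factor ** 2


def concentration_factor(distance_factor):
    """Factor by which number density falls when the spacing grows (c ~ x**-3)."""
    return distance_factor ** 3


t_initial = 1e-7  # s, diffusion time over ~10 nm quoted in the text
print(scaled_diffusion_time(t_initial, 10))  # about 1e-5 s
print(concentration_factor(10))
```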

3.3 Derivation of average fiber length

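Assuming that fiber elongation can result only from monomer addition, the Smoluchowski equations can be simplified as follows:

$$\begin{aligned}
\frac{dP(1,t)}{dt} &= -K_0(1,1)\,P(1,t)^2 - P(1,t)\sum_{k=2}^{\infty} K_0(1,k)\,P(k,t),\\
\frac{dP(2,t)}{dt} &= \frac{1}{2}K_0(1,1)\,P(1,t)^2 - K_0(1,2)\,P(1,t)\,P(2,t),\\
\frac{dP(3,t)}{dt} &= K_0(1,2)\,P(1,t)\,P(2,t) - K_0(1,3)\,P(1,t)\,P(3,t),\\
&\;\;\vdots
\end{aligned}$$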

This is an infinite system of coupled differential equations. Following the postulates given at the beginning of the chapter, let us assume that only two constants are relevant to this process: the rate at which peptide monomers can nucleate a β-sheet, $K_n$, and the rate at which peptides can permanently lock onto the tip of an existing fiber, $K_e$ (which determines the rate of fiber elongation). Therefore $K_n = K_0(1,1)$ and $K_e = K_0(1,l)$ with $l = 2, 3, \ldots$ Furthermore, let us confine our attention to the average fiber length (while it is not necessary to restrict the analysis in this way, it is useful for obtaining a closed-form solution), which can be defined as the ratio of the initial number of monomers to the final number of fibers. The total number of fibers is defined as

$$F(t) = \sum_{k=2}^{\infty} P(k,t).$$

Notice that, starting from the second differential equation, each second term on the right-hand side is equal in magnitude to the first term on the right-hand side of the following equation, so that these terms cancel when the equations are summed. We can therefore write:

$$\frac{dF(t)}{dt} = \frac{1}{2}K_0(1,1)\,P(1,t)^2,$$

and the infinite system of equations can now be written as:

$$\begin{aligned}
\frac{dM}{dt} &= -K_n M^2 - K_e M F,\\
\frac{dF}{dt} &= \frac{1}{2}K_n M^2,
\end{aligned} \tag{0}$$

where we have defined $M = P(1,t)$ (the population of monomers). In the first equation each monomer can disappear because of binding to any other monomer or to an existing fiber. The second equation states that fibers can only be created when two monomers bind: the factor 1/2 comes from the number of possible ways of combining two monomers to form a dimer:

$$\frac{1}{2}M(M-1).$$
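System (0) is easy to integrate numerically, which gives a direct check on any closed-form result derived from it. The sketch below is one possible way to do this, not the procedure used in this work: it assumes dimensionless variables (populations in units of the initial monomer population and time in units of the inverse of the nucleation rate constant times that population, so that only the ratio of elongation to nucleation constants, called `K` in the code, remains) and a fixed-step fourth-order Runge-Kutta scheme. The function name, step size and end time are arbitrary choices.

```python
def mean_fiber_length(K, dt=1e-3, t_end=50.0):
    """Integrate dM/dt = -M**2 - K*M*F, dF/dt = M**2 / 2 with classical RK4.

    Populations are in units of the initial monomer population M0 and time
    in units of 1/(Kn*M0), so that Kn = 1, Ke = K, M(0) = 1 and F(0) = 0.
    Returns M0/F at t_end, i.e. the average number of monomers per fiber.
    """
    def rhs(M, F):
        return -M * M - K * M * F, 0.5 * M * M

    M, F = 1.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1m, k1f = rhs(M, F)
        k2m, k2f = rhs(M + 0.5 * dt * k1m, F + 0.5 * dt * k1f)
        k3m, k3f = rhs(M + 0.5 * dt * k2m, F + 0.5 * dt * k2f)
        k4m, k4f = rhs(M + dt * k3m, F + dt * k3f)
        M += dt * (k1m + 2 * k2m + 2 * k3m + k4m) / 6.0
        F += dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6.0
    return 1.0 / F


for K in (1.0, 10.0, 100.0):
    print(K, mean_fiber_length(K))
```

The printed values grow with the ratio of the two rate constants: rarer nucleation leaves fewer, longer fibers.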

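To extract useful information from these differential equations, let us rewrite the system in this way:

$$\begin{aligned}
\frac{dM}{dt} &= -K_n M^2\left(1 + \frac{K_e}{K_n}\frac{F}{M}\right),\\
\frac{dF}{dt} &= \frac{1}{2}K_n M^2,
\end{aligned}$$

and make the following change of variables:

$$Z = 1 + K\frac{F}{M}, \qquad \text{where } K = \frac{K_e}{K_n}.$$

From these definitions we can write:

$$F = \frac{M}{K}(Z - 1), \qquad \dot{F} = \frac{\dot{M}}{K}(Z - 1) + \frac{M}{K}\dot{Z},$$

where dots indicate time derivatives.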

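Information from both differential equations can now be used to write the following relation (the two equations give $\dot{M} = -K_n M^2 Z$ and $\dot{F} = \tfrac{1}{2}K_n M^2$, hence $\dot{F} = -\dot{M}/(2Z)$):

$$\frac{\dot{M}}{K}(Z - 1) + \frac{M}{K}\dot{Z} + \frac{\dot{M}}{2Z} = 0,$$

or

$$\frac{\dot{M}}{M} + \frac{Z\dot{Z}}{Z(Z-1) + \frac{K}{2}} = 0.$$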

The above algebraic steps involved multiplications or divisions by $Z$ or $M$: such operations are allowed as both of these functions are greater than zero during the evolution of the system. The above differential relation can also be expressed as:

$$\frac{d}{dt}\left[\ln M + g(Z)\right] = 0, \qquad \text{where } g(Z) = \int \frac{Z}{Z(Z-1) + \frac{K}{2}}\,dZ.$$

In other words, the quantity $\ln M + g(Z)$ remains constant throughout the process and can be thought of as a constant of motion. The indefinite integral can be evaluated, yielding

$$g(Z) = \frac{1}{\sqrt{2K-1}}\arctan\left(\frac{2Z-1}{\sqrt{2K-1}}\right) + \frac{1}{2}\log\left(2Z^2 - 2Z + K\right),$$

and the initial conditions $M(t=0) = M_0$ and $Z(t=0) = 1$ can be used to evaluate this constant of motion, yielding the following expression:

$$\ln M + g(Z) = \ln M_0 + g(1), \tag{1}$$

where $M_0$ is the initial number of monomers per unit volume in the system and

$$g(1) = \frac{1}{\sqrt{2K-1}}\arctan\left(\frac{1}{\sqrt{2K-1}}\right) + \frac{1}{2}\log K.$$
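For readers who wish to verify the integral, one way to evaluate $g(Z)$ is sketched below. It assumes $K > 1/2$, so that the square root is real (this holds whenever elongation is faster than nucleation), and it drops an additive constant, which cancels in the difference $g(Z) - g(1)$.

```latex
\begin{aligned}
\frac{Z}{Z^2 - Z + \frac{K}{2}}
  &= \frac{1}{2}\,\frac{2Z - 1}{Z^2 - Z + \frac{K}{2}}
   + \frac{1}{2}\,\frac{1}{\left(Z - \frac{1}{2}\right)^2 + \frac{2K-1}{4}}, \\[4pt]
\int \frac{1}{2}\,\frac{2Z - 1}{Z^2 - Z + \frac{K}{2}}\,dZ
  &= \frac{1}{2}\log\left(2Z^2 - 2Z + K\right) - \frac{1}{2}\log 2, \\[4pt]
\int \frac{1}{2}\,\frac{dZ}{\left(Z - \frac{1}{2}\right)^2 + \frac{2K-1}{4}}
  &= \frac{1}{\sqrt{2K-1}}\arctan\left(\frac{2Z - 1}{\sqrt{2K-1}}\right).
\end{aligned}
```

Adding the two integrals and discarding the constant $-\tfrac{1}{2}\log 2$ reproduces the expression for $g(Z)$ given above.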

We are interested in the average fiber length:

$$\langle L\rangle = \frac{M_0}{F} = \frac{K}{Z-1}\,\frac{M_0}{M};$$

using equation (1), this expression takes the form:

$$\langle L\rangle = \frac{K}{Z-1}\exp\left[g(Z) - g(1)\right].$$

Substituting values, we obtain:

$$\langle L\rangle = \sqrt{K}\,\frac{\sqrt{2Z^2 - 2Z + K}}{Z-1}\,\exp\left[\frac{\arctan\left(\frac{2Z-1}{\sqrt{2K-1}}\right) - \arctan\left(\frac{1}{\sqrt{2K-1}}\right)}{\sqrt{2K-1}}\right]. \tag{2}$$

This is our final expression relating average fiber length to the ratio of the rate constants and to the variable $Z(t)$.

We are interested in the behavior of the system at long times, when the whole process has taken place and all monomers are consumed. The limit for $Z \to \infty$ of expression (2) yields the final average fiber length:

$$\langle L\rangle = \sqrt{2K}\,\exp\left[\frac{\pi - 2\arctan\left(\frac{1}{\sqrt{2K-1}}\right)}{2\sqrt{2K-1}}\right]. \tag{3}$$

Figure 3-1 shows a graph of this function.

[Figure: plot titled "Mean-Field Approximation"; horizontal axis $K_e/K_n$, vertical axis average fiber length.]

Figure 3-2. Average fiber length (expressed as number of monomers per fiber; σ is particle diameter) as a function of elongation and nucleation rate constants. This relation was obtained using a Smoluchowski mean-field approximation of fiber nucleation and growth.

In the limit $K \gg 1$, when fiber nucleation is a much slower process than elongation, this expression becomes:

$$\langle L\rangle \approx \sqrt{2K} = \sqrt{\frac{2K_e}{K_n}}. \tag{4}$$

As expected, when the nucleation rate constant is small, very few fibers are formed and all the monomers add onto a small number of fibers, resulting in a larger average length. Conversely, when nucleation and elongation are characterized by similar rate constants, the system will reach equilibrium with a population of many short fibers. From equation (3), when $K \to 1$ the average fiber length becomes $\langle L\rangle = \sqrt{2}\,\exp(\pi/4) \approx 3.1$.
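Expression (3) and its large-$K$ limit are simple to tabulate. The following Python sketch does so for a few values of the ratio of rate constants; the function name `final_mean_length` and the sample values of `K` are choices made for this illustration, and the formula is only evaluated for $K > 1/2$, where the square root is real.

```python
import math


def final_mean_length(K):
    """Final average fiber length from expression (3); requires K > 1/2."""
    root = math.sqrt(2.0 * K - 1.0)
    exponent = (math.pi - 2.0 * math.atan(1.0 / root)) / (2.0 * root)
    return math.sqrt(2.0 * K) * math.exp(exponent)


for K in (1.0, 10.0, 100.0, 1000.0, 10000.0):
    exact = final_mean_length(K)
    asymptote = math.sqrt(2.0 * K)
    print(f"K = {K:8.0f}  <L> = {exact:8.3f}  sqrt(2K) = {asymptote:8.3f}")
```

The relative difference between the two columns shrinks as `K` grows, which is the sense in which the square-root law is the asymptotic form of expression (3).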

Equation (4) describes a general feature of fiber formation: regardless of the speed at which equilibrium is reached, the final fiber length distribution carries a signature of the characteristic rate constants involved. For example, if nucleation is relatively easy compared to elongation, the final length distribution will be characterized by many short fibers. Conversely, if nucleation is extremely rare, only very few long fibers will be created. Equation (4) can also be used to extract the fundamental rate constants of the process if one can measure average fiber length and fiber growth rate. From the first equation in (0), one can write:

$$K_e = -\frac{\dot{M}}{M F} - K_n\frac{M}{F}. \tag{5}$$

Average fiber length is defined as $\langle L\rangle = (M_0 - M)/F$, from which:

$$\frac{d\langle L\rangle}{dt} = -\frac{\dot{M}}{F} - \frac{\dot{F}}{F^2}\,(M_0 - M). \tag{6}$$

At long times the ratio of monomers to fibers decreases and few new fibers are created. As a consequence, the last term in equation (5) can be neglected (the nucleation rate is also assumed to be small compared to elongation) as well as the last term in equation (6). We can therefore write:

$$\frac{d\langle L\rangle}{dt} \approx M K_e.$$

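From equation (4) one can infer: $K_n = 2 K_e / \langle L \rangle^2$.

Substituting these two last expressions in the second equation of (0), one can finally write:

$$\frac{1}{M}\frac{dF}{dt} = \frac{1}{\langle L \rangle^2}\frac{d\langle L \rangle}{dt} \qquad (7)$$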

From this expression, and from macroscopic measurements of fiber length and growth rate, one can estimate the average number of nucleation events per monomer per unit time.

3.4 Discussion

Using the relations derived above, although based on very simple assumptions, it is possible to elucidate the molecular processes involved in the formation of β-sheet fibers. As an example, consider the case of the KFE8 peptide. The AFM images presented in Chapter 2 reveal the presence of helical ribbons approximately 100 nm long in the early stages of self-assembly.

According to our current model of molecular packing [24], there are approximately 100 molecules per helical turn in such ribbons. Given the 20 nm pitch, there are approximately 500 molecules in each helical ribbon when it is 100 nm long. In the final stages of self-assembly only tubular filaments remain, of length in the order of microns. Assuming that such long fibers are collapsed helical ribbons of pitch 10 nm (thus containing twice as many molecules per unit length), each fiber contains at least 10,000 molecules (assuming it is 1 µm long). The process completes in approximately 10,000 s: we can then roughly estimate a rate of fiber growth of 1 molecule per second per fiber. Substituting these numbers in equation (7) one finds that, on average, each monomer nucleates a fiber at the rate $10^{-8}\ \mathrm{s}^{-1}$. We also know that each monomer diffuses by the average intermolecular distance in approximately $10^{-7}$ s. Comparing the corresponding nucleation time ($10^{8}$ s) with this diffusion time reveals that, on average, a peptide molecule collides $10^{15}$ times with other molecules before nucleating a β-sheet. As a comparison, Flory estimated $10^{13}$ collisions [25] between functional groups in a chemical polymerization reaction. In the case of β-sheets, it is not surprising that this number is higher, as many more degrees of freedom are involved in finding the correct mutual orientation to originate a β-sheet. These considerations are based on the assumption that a β-sheet nucleus is formed by two monomers coming together. This might not always be the case, as aggregates of many monomers (driven together by their hydrophobic side chains) might form first [26] and facilitate the finding of the correct mutual orientation for a β-sheet. Moreover, it is possible that such unstructured oligomers may come together to nucleate a new fiber: in other words, oligomers and not monomers might be acting as building blocks. If this were the case, our model could be extended to include this phenomenon.
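The order-of-magnitude estimate for KFE8 can be retraced step by step. The sketch below assumes that equation (7) reads (1/M) dF/dt = (d<L>/dt) / <L>^2, and uses only the rounded figures quoted in the text; the variable names are illustrative.

```python
# Order-of-magnitude inputs quoted in the text for KFE8.
molecules_per_fiber = 1.0e4   # ~1 micron fiber, ~100 molecules per 10 nm turn
assembly_time_s = 1.0e4       # approximate duration of the process
diffusion_time_s = 1.0e-7     # time to diffuse one intermolecular distance

# Growth rate of a single fiber, in molecules per second.
growth_rate = molecules_per_fiber / assembly_time_s

# Assumed form of equation (7): nucleation events per monomer per unit time
# equal the fiber growth rate divided by the squared average length.
nucleation_rate = growth_rate / molecules_per_fiber ** 2

# Number of diffusive encounters during the mean waiting time for nucleation.
collisions = 1.0 / (nucleation_rate * diffusion_time_s)

print(f"growth rate     ~ {growth_rate:.0e} molecules/s per fiber")
print(f"nucleation rate ~ {nucleation_rate:.0e} 1/s per monomer")
print(f"collisions      ~ {collisions:.0e}")
```

Because every input is only known to within an order of magnitude, the resulting collision count should be read the same way.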

Relation (4) can also be used as a principle for the design of biomaterials. Assume, for example, that matrices composed of long fibers are desired. In order to increase average fiber length one can choose molecules for which $K_n$ is small: choosing molecules for which β-sheet nucleation is difficult should result in fewer, longer fibers. For example, it is known that segments of a polypeptide chain within a β-sheet (β-strands) are, on average, 7-8 amino acids long and that longer segments have a decreased propensity to form β-sheets [27]. As a consequence, peptide molecules longer than 8 amino acids would be expected to have a lower $K_n$ and to form longer fibers. Conversely, if a population of many short fibers is desired, it is necessary to choose molecules for which nucleation is easier. In this regard it is interesting to compare Figures 2-2 and 2-28, acquired after approximately the same assembly time in the same conditions: KWE8 (for which an increased hydrophobic effect should facilitate peptide self-association) formed a collection of many short fibers, while KFE8 formed fewer and longer fibers. This result seems in agreement with what can be predicted from equation (4).

Nevertheless, care must be exercised before drawing this conclusion. In particular, it is the ratio of the two rate constants that matters in relation (4): in order to obtain long fibers, one has to choose molecules for which fiber nucleation is relatively more difficult than elongation. Predicting the dependence of this ratio on molecular shape is difficult. Moreover, (4) is strictly valid only when the system has reached equilibrium. With these caveats in mind, the quantitative model presented here can be used as a first step for interpreting experimental results and for designing new biomaterials. If the choice of molecular shape is limited, equation (4) suggests other strategies for obtaining matrices of pre-specified properties. If long fibers are required, for example, one can choose conditions that disfavor nucleation at first; then, after a few nuclei are formed, change assembly conditions so as to increase $K_e$.

How does this theory compare to other quantitative models describing amyloid fiber formation? Benedek and coworkers developed a kinetic theory of fiber nucleation and growth to interpret their light scattering experiments on the amyloid Aβ peptide [13,14]. This theory is based on the assumption that fibers can only nucleate on seeds, which can consist of either external particles or micelles formed by monomers when the concentration is above a certain threshold (critical micellar concentration, CMC). It is likely that globular aggregates of monomers provide a conducive environment for the formation of β-sheet fibers. Nevertheless, according to this model, nucleation cannot occur at concentrations below CMC (which they found to be 0.1 mM), unless external particles are present. The theory developed in our research assumes that nucleation occurs when two monomers hit each other in the correct orientation to form a β-sheet structure. It therefore applies at all concentrations and does not require any external seed for nucleation, which can also occur at very low concentrations (albeit with very low probability). On the other hand, our theory does not account for micelle formation. Both theories describe important aspects of β-sheet nucleation and are most applicable in different conditions. A model describing the elongation (not the nucleation) of β-sheet fibrils has also been developed [28] that highlights the existence of an intermediate step, where monomers are deposited in an amorphous state on the fibril. This phase is reversible: monomers can go back into solution and only lock irreversibly onto the fiber when the time of contact is sufficient. For the sake of simplicity, this aspect of amyloid formation was not considered in our model.

The mean-field approximation developed in this chapter assumes that each monomer has immediate access to any other monomer or fiber in the system: in other words, that the peptide solution is well-mixed. This is equivalent to assuming that monomer diffusion is much faster than reaction (either β-sheet nucleation or monomer addition to existing filaments). This assumption is valid for most of the experiments presented in Chapter 2, where the pH was kept low to decrease the rate of self-assembly and observe fiber formation. A comparison of the typical diffusion time (microseconds) with the typical self-assembly time in those experiments (hours) confirms this. Nevertheless, it is important to note that this approach may no longer be valid in conditions of rapid gelation. In typical tissue engineering applications, for example, self-assembly is induced by addition of electrolytes: in such conditions the reaction time may no longer be the bottleneck, as diffusion time could be of similar magnitude. When diffusion becomes a limiting factor, a mean-field approximation would be inadequate. Examining filament formation and growth under these more general conditions requires a different approach: Chapter 4 will present simulations aimed at overcoming the limitations of a mean-field approximation and will explore fiber self-assembly when diffusion and reaction times are of comparable magnitude.

Before concluding, it is useful to consider ways of improving the model presented here. Our theory is based on two fundamental rate constants (nucleation and elongation) which are taken as external parameters. It should be possible to overcome this limitation by expressing these constants in terms of a more detailed model for the mechanism of β-sheet nucleation and elongation. For example, one could imagine a molecule as a little brick, with hydrogen bonding sites on each side. By assigning a given energy change to each satisfied bond and by taking into account all possible mutual configurations, it should be possible to compute a nucleation rate constant using statistical mechanics considerations. Similar reasoning could also be applied to fiber elongation. A similar concept has been used to describe β-hairpin formation [5].

3.5 Conclusion

By using a set of simplifying assumptions and recasting fiber self-assembly in a Smoluchowski-type mean-field approach, we have derived a relation between average fiber length and two rate constants assumed to be fundamental in the process. In the limit of fiber nucleation being a slower process than fiber elongation, and reaction time being the rate-limiting factor, the average fiber length at equilibrium can be expressed as:

$$\langle L \rangle = \sqrt{\frac{2 K_e}{K_n}}$$

This relation is useful, for example, in estimating the nucleation rate constant from knowledge of fiber length and rate of elongation. In the case of the KFE8 peptide, these considerations allowed an estimate of the average number of collisions required for two molecules to nucleate a β-sheet. This relation can also be used as an aid in the design of biomaterials. The theory applies only when molecular diffusion is much faster than fiber nucleation (e.g. in the case of the experiments performed to elucidate fiber formation). When diffusion time and reaction time are comparable (for example when fast gelation is induced by addition of electrolytes), a mean-field approximation is no longer valid. A different approach is needed in these cases, which will be presented in the next chapter.


References for Chapter 3

[1] Nichols, M. R., Moss, M. A., Reed, D. K., Lin, W.-L., Mukhopadhyay, R., Hoh, J. H. and Rosenberry, T. R. Growth of β-amyloid(1-40) protofibrils by monomer elongation and lateral association. Characterization of distinct products by light scattering and atomic force microscopy. Biochemistry 41, 6115 (2002).

[2] Ionescu-Zanetti, C., Khurana, R., Gillespie, J. R., Petrick, J. S., Trabachino, L. C., Minert, L. J., Carter, S. A. and Fink, A. L. Monitoring the assembly of Ig light-chain amyloid fibrils by atomic force microscopy. Proc. Natl. Acad. Sci. USA 96, 13175 (1999).

[3] Harper, J. D., Wong, S. S., Lieber, C. M. and Lansbury, P. T. Jr. Observation of metastable Aβ amyloid protofibrils by atomic force microscopy. Chemistry & Biology 4, 119 (1997).

[4] Esler, W. P., Stimson, E. R., Jennings, J. M., Vinters, H. V., Ghilardi, J. R., Lee, J. P., Mantyh, P. W. and Maggio, J. E. Alzheimer's disease amyloid propagation by a template-dependent dock-lock mechanism. Biochemistry 39, 6288 (2000).

[5] Muñoz, V., Thompson, P., Hofrichter, J. and Eaton, W. A. Folding dynamics and mechanism of β-hairpin formation. Nature 390, 196 (1997).

[6] Maness, S. J., Franzen, S., Gibbs, A. C., Causgrove, T. P. and Dyer, R. B. Nanosecond Temperature Jump Relaxation Dynamics of Cyclic β-Hairpin Peptides. Biophys. J. 84, 3874 (2003).

[7] Dinner, A. R., Lazaridis, T. and Karplus, M. Understanding β-hairpin formation. Proc. Natl. Acad. Sci. USA 96, 9068 (1999).

[8] Muñoz, V., Henry, E. R., Hofrichter, J. and Eaton, W. A. A statistical mechanical model for β-hairpin kinetics. Proc. Natl. Acad. Sci. USA 95, 5872 (1998).

[9] Pande, V. J. and Rokhsar, D. S. Molecular dynamics simulations of unfolding and refolding of a β-hairpin fragment of protein G. Proc. Natl. Acad. Sci. USA 96, 9062 (1999).

[10] Koo, E. H., Lansbury, P. T. Jr. and Kelly, J. W. Amyloid diseases: abnormal protein aggregation in neurodegeneration. Proc. Natl. Acad. Sci. USA 96, 9989 (1999).

[11] Lansbury, P. T. Jr. Evolution of amyloid: What normal protein folding may tell us about fibrillogenesis and disease. Proc. Natl. Acad. Sci. USA 96, 3342 (1999).


[12] Serio, T. R., Cashikar, A. G., Kowal, A. S., Sawicki, G. J., Moslehi, J. J., Serpell, L., Arnsdorf, M. F. and Lindquist, S. L. Conversion and the replication of conformational information by a prion determinant. Science 289, 1317 (2000).

[13] Lomakin, A., Doo Soo, C., Benedek, G. B., Kirschner, D. A. and Teplow, D. B. On the nucleation and growth of amyloid β-protein fibrils: detection of nuclei and quantitation of rate constants. Proc. Natl. Acad. Sci. USA 93, 1125 (1996).

[14] Lomakin, A., Teplow, D. B., Kirschner, D. A. and Benedek, G. B. Kinetic theory of fibrillogenesis of amyloid β-protein. Proc. Natl. Acad. Sci. USA 94, 7942 (1997).

[15] Dobson, C. M. The structural basis of protein folding and its links with human disease. Phil. Trans. R. Soc. Lond. B 356, 133-145 (2001).

[16] Chiti, F., Webster, P., Taddei, N., Clark, A., Stefani, M., Ramponi, G. and Dobson, C. M. Designing conditions for in-vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. USA 96, 3590 (1999).

[17] Kelly, J. W. The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Op. Struct. Bio. 8, 101 (1998).

[18] Betts, S. and King, J. A. A green light for protein folding. Nature Biotechnology 17, 637 (1999).

[19] Perutz, M. F., Pope, B. J., Owen, D., Wanker, E. E. and Scherzinger, E. Aggregation of proteins with expanded glutamine and alanine repeats of the glutamine-rich and asparagine-rich domains of Sup35 and of the amyloid β-peptide of amyloid plaques. Proc. Natl. Acad. Sci. USA 99, 5596 (2002).

[20] Von Smoluchowski, M. Z. Phys. Chem. 92, 129 (1917).

[21] Oshanin, G. and Moreau, M. Influence of transport limitations on the kinetics of homopolymerization reactions. J. Chem. Phys. 102, 2977 (1995).

[22] Collins, F. C. and Kimball, G. E. Jr. J. Colloid Sci. 4, 425 (1949).

[23] Hasegawa, K., Ono, K., Yamada, M. and Naiki, H. Kinetic modeling and determination of reaction constants of Alzheimer's β-amyloid fibril extension and dissociation using surface plasmon resonance. Biochemistry 41, 13489 (2002).

[24] Hwang, W., Marini, D. M., Zhang, S. and Kamm, R. D. Supramolecular structure of helical ribbons self-assembled from a β-sheet peptide. J. Chem. Phys. 118, 389 (2003).

[25] Flory, P. J. Principles of polymer chemistry. Cornell University Press, Ithaca, p. 77 (1953).


[26] Bitan, G., Kirkitadze, M., Lomakin, A., Vollers, S. S., Benedek, G. B. and Teplow, D. B. Amyloid β-protein (Aβ) assembly: Aβ40 and Aβ42 oligomerize through distinct pathways. Proc. Natl. Acad. Sci. USA 100, 330 (2003).

[27] Stanger, H. E., Syud, F. A., Espinosa, J. F., Giriat, I., Muir, T. and Gellman, S. H. Length-dependent stability and strand length limits in antiparallel β-sheet secondary structure. Proc. Natl. Acad. Sci. USA 98, 12015 (2001).

[28] Massi, F. and Straub, J. E. Energy landscape theory for Alzheimer's amyloid peptide fibril elongation. Proteins: structure, function and genetics 42, 217 (2001).


CHAPTER 4

SIMULATION OF FIBER SELF-ASSEMBLY

4.1 Introduction

The simple model of fiber formation introduced in Chapter 3 allowed us to derive an analytical relation between the average length of self-assembled β-sheet fibers and the rate constants (of nucleation and elongation) assumed to be fundamental in the process. The scope of predictions that such a model allows is limited for three main reasons: (1) interaction forces among monomers and fibers are not represented; (2) the system is assumed to be well mixed; (3) β-sheet fibers are supposed to nucleate from two molecules coming into contact (nucleation from micelles is neglected). Since our ultimate goal is to develop predictive tools for the design of peptide-based biomaterials, it is important to elucidate fiber self-assembly in less abstract conditions. In this chapter a computational tool is proposed to investigate the process in conditions that a mean-field approximation cannot capture. In particular, limitations (1) and (2) will be relaxed.

4.2 Simulation model

The analysis developed in Chapter 3 assumes that monomer diffusion is not a limiting factor: each monomer has immediate access to any other monomer or fiber in the system. In other words, diffusion is assumed to be much faster than reaction, so that nucleation and elongation rates govern the process. Here this assumption is removed and the motion of interacting molecules in solution is described explicitly via Brownian dynamics.


As explained in the previous chapter, the prevalent framework used to describe β-sheet fiber formation is the concept of nucleation and growth. Nucleation is the rate-limiting step: once achieved, subsequent addition of monomers is faster (because the nucleus templates β-sheet folding) and results in fiber elongation [1-5]. The folding of a polypeptide chain into a β-sheet conformation has been the subject of intense study [6-9]. Compared to the α-helix, for which several factors are known to promote its formation and stability [10], our understanding of β-sheets is much less advanced [11], even though both structures were discovered at about the same time [12,13]. Given the complexity of this phenomenon, a precise description of how a β-sheet nucleus is formed would itself require very sophisticated simulations [14] and is beyond the scope of this study. The model presented in this chapter describes β-sheet nucleation as a stochastic process that can occur when two molecules spend enough time in contact with one another. Once a nucleus is formed, addition of other monomers is facilitated by the formed β-sheet template. This is a simplified picture of what may happen in reality, where formation of globular aggregates (micelles) could facilitate nucleation [15] and fiber growth may result from oligomer addition [16].

The representation of peptide molecules is a fundamental choice in the model. Here we are interested in describing all stages of self-assembly: from uniformly distributed monomers to a macroscopic network. For this purpose, a detailed description of atomic motions within each molecule would be computationally expensive and probably unnecessary; a description of center-of-mass motion is sufficient. Each molecule is therefore represented as a soft sphere.

This description of self-assembly is more detailed than a mean-field approximation and less detailed than an atomistic simulation: it is a coarse-grained (or mesoscale) description of the process, at a level of detail compatible with the scope of predictions we envisage.


The simulation tool developed here could in principle be used to investigate fiber self-assembly in conditions where reaction times are much longer than diffusion times (ideal mixing). In practice, due to computing time limitations, it is most efficiently applicable when reaction times and diffusion times are of comparable magnitude. Given this constraint, we chose to represent fiber elongation as immediate binding upon contact between a nucleus (or a fiber end) and a free monomer. In other words, the reaction time for elongation is assumed to be much shorter than diffusion time. Coalescence of different fibers has been excluded from the model, as β-sheet fiber growth is thought to result from monomer (or oligomer) addition [17]. The following are the main assumptions in our model:

a) Peptide molecules are represented as soft repulsive spheres of diameter σ, diffusing in a three-dimensional medium of viscosity η.

b) If two monomers come into contact, they can bind irreversibly to form a fiber nucleus (a dimer): this outcome depends stochastically on contact time, through a nucleation rate constant $K_n$.

c) Once a fiber nucleus is formed, subsequent addition of monomers happens immediately upon contact with the nucleus (or a longer fiber), provided that the adding monomer lies within a solid angle θ with the fiber axis (Figure 4-1).

d) Bending stiffness is imposed on all formed fibers.


Figure 4-1. Representation of the assumptions in our simulation model. Each peptide monomer is represented as a soft sphere diffusing in a three-dimensional medium. After nucleation of a β-sheet dimer, subsequent addition of monomers is possible at an angle θ with the axis and results in fiber elongation.

Numerical simulations of polymer growth are challenging for several reasons: they must model reaction between monomers, keep track of the bonds that can form between any two molecules and describe the dynamics of the formed chains [18-20].

Simulations of living polymers are related to the model presented here, even though such studies are primarily motivated by understanding systems at equilibrium, where bonds can continually break and reform [18-22]. In particular, the functional form of the molecular weight distribution in living polymers at equilibrium has been the subject of debate [20,21]. The simulations presented here, on the other hand, are aimed at understanding the process of nucleation and irreversible growth of fibers and its dependence on fundamental rate constants: a non-equilibrium situation. In fact, our simulations are stopped when the system reaches equilibrium.

A two-dimensional simulation of rodlike molecules with reactive ends was among the first polymer dynamics simulations to include reaction between monomers [23]. More recent simulation studies of nonequilibrium polymerization have used on-lattice models [24,25]. The closest simulation to the one presented here was developed by Akkermans and coworkers [26], who studied the process of polymerization as a means of obtaining realistic initial conditions for polymer dynamics. The main difference with our model is the origin of polymerization: in their study a fraction of molecules is chosen as growth centers from the beginning, while here any two molecules at any time can nucleate a new fiber.

4.3 Simulation methods

Having chosen the level of detail at which to represent fiber self-assembly, it is now necessary to specify the laws governing the evolution of the system.

Before proceeding, it is useful to gather all relevant parameters that might influence fiber length distribution:

$$\langle L \rangle = f\left(K_n, K_e, t, \langle b \rangle, T, \eta, m, \sigma, B_s, \theta\right) \qquad (1)$$

where $\langle L \rangle$ is the average fiber length, $K_n$ and $K_e$ are the nucleation and elongation rate constants, $t$ is time, $\langle b \rangle$ is the average distance between molecules (a measure of concentration), $T$ is temperature, $\eta$ is the viscosity of the medium, $m$ and $\sigma$ are the molecular mass and diameter, $B_s$ is the bending stiffness of the fibers and $\theta$ is the solid angle from the fiber axis at which monomers are allowed to bind; $f$ is the unknown function. A reduction in the number of governing parameters will be possible after specifying the equations of motion of the system, as explained in the following sections.

β-sheet nucleation

The kinetics of β-sheet nucleation can be described by considering an ensemble of $N$ copies of the following thought experiment. A pair of peptide monomers is kept within interaction distance until it nucleates a β-sheet: $N_f(t)$ denotes the number of experiments at time $t$ in which the peptide couples folded into a β-sheet conformation, while $N_u(t)$ denotes the unfolded ones; obviously $N_f(t) + N_u(t) = N$. A natural assumption one can make is that, in any given infinitesimal time interval, the increase in the number of folded couples per unit time is proportional to the number of couples still unfolded, with proportionality constant $K_n$ (nucleation rate constant):

$$\frac{dN_f}{dt} = K_n\,(N - N_f)$$

The solution to this differential equation,

$$N_f(t) = N\left(1 - \exp(-K_n t)\right),$$

allows one to compute the probability $P_n$ that a pair of molecules in contact for a time $t$ has nucleated a β-sheet. This probability can be expressed as the fraction of folded couples over the total in the ensemble:

$$P_n = 1 - \exp(-K_n t)$$

This relation will be used to describe nucleation events in the simulation algorithm. As explained before, we also assume that particles coming into contact with the extremity of a fiber will immediately and irreversibly bind to it, resulting in fiber elongation. This choice was necessary for reducing computation time and is equivalent to assuming that the reaction time for fiber elongation is much shorter than the typical diffusion time. Therefore, our model applies when the Damköhler number [27] based on elongation, defined as

$$Da_e = \frac{t_{elongation}}{t_{diffusion}},$$

is smaller than unity; as a consequence, only the parameter $K_n$ governs the system. In principle this simulation model could also be used to investigate conditions in which $Da_e > 1$. In practice, due to computing power limitations, this is not efficiently feasible.
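The nucleation rule described in this section, a pair in contact for a time t nucleating with probability 1 - exp(-Kn t), can be sketched as a per-time-step stochastic test. This is an illustrative sketch, not the thesis code: the function names are hypothetical, and the contact bookkeeping of a real simulation is replaced by a pair that is simply held in contact.

```python
import math
import random

def nucleation_probability(k_n, contact_time):
    """Probability that a pair in contact for contact_time has nucleated."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-k_n * contact_time)

def pair_nucleates(k_n, dt, rng=random):
    """One stochastic trial for a pair that stays in contact during a step dt."""
    return rng.random() < nucleation_probability(k_n, dt)

def steps_until_nucleation(k_n, dt, rng=random, max_steps=10**7):
    """Count steps until a pair held in contact nucleates (hypothetical helper)."""
    for step in range(1, max_steps + 1):
        if pair_nucleates(k_n, dt, rng):
            return step
    return None  # no nucleation within max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    waits = [steps_until_nucleation(1.0, 0.05, rng) * 0.05 for _ in range(2000)]
    print("mean waiting time:", sum(waits) / len(waits), "(compare with 1 / Kn = 1)")
```

A useful property of the exponential form is that applying the test at every step composes correctly: surviving n steps of length dt has probability exp(-Kn n dt), so the outcome does not depend on how the contact time is subdivided.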

Brownian dynamics

Peptides are much larger than water molecules, yet they are small enough for their motion to be affected by the constant collisions with solvent molecules. A sufficiently small particle immersed in a fluid exhibits a random type of motion called "Brownian motion" that results from constant collisions with fluid molecules. When a coarse-grained description of particle motion is desired, all these collisions can be lumped into a randomly fluctuating force acting on the center of mass of the particle. This description of Brownian motion was proposed by Langevin [28] and will be adopted in our simulations.

The motion of each peptide molecule diffusing in aqueous solution obeys Langevin's equation:

$$m \frac{d^2 \vec{r}_i}{dt^2} = \sum_{j \neq i} \vec{f}_{ij} - \zeta \frac{d\vec{r}_i}{dt} + \vec{d}_i(t) \qquad (2)$$

where $\vec{r}_i$ is the position of the center of mass of the i-th peptide particle (i = 1, 2, ..., N) in three-dimensional space, $m$ is the mass of each particle, $\zeta$ is the drag coefficient and $t$ is time. The term $\vec{f}_{ij}$ represents the force exerted on particle i by particle j and $\vec{d}_i(t)$ is the stochastic (Brownian) force describing collisions with solvent molecules.

In the numerical implementation of this equation, the magnitude of each Cartesian component of the Brownian force is extracted randomly at each time step from a uniform distribution over the interval:

$$\left[-\sqrt{\frac{6 k_B T \zeta}{\Delta t}},\ +\sqrt{\frac{6 k_B T \zeta}{\Delta t}}\right]$$

where $k_B$ is Boltzmann's constant, $T$ is the temperature of the fluid and $\Delta t$ is the time step size [29].
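The sampling of the Brownian force can be sketched as follows, assuming the uniform interval has half-width sqrt(6 kB T ζ / Δt). The function names and the use of Python's standard `random` module are choices made for this illustration only.

```python
import math
import random

def brownian_force_component(k_b_t, zeta, dt, rng=random):
    """Draw one Cartesian component of the Brownian force.

    Assumed form: uniform on [-a, +a] with a = sqrt(6 * kB*T * zeta / dt).
    """
    bound = math.sqrt(6.0 * k_b_t * zeta / dt)
    return rng.uniform(-bound, bound)

def brownian_force(k_b_t, zeta, dt, rng=random):
    """Draw the three independent Cartesian components for one particle."""
    return tuple(brownian_force_component(k_b_t, zeta, dt, rng) for _ in range(3))


if __name__ == "__main__":
    rng = random.Random(1)
    # Illustrative units in which kB*T = 1 and zeta = 1.
    draws = [brownian_force_component(1.0, 1.0, 1.0e-4, rng) for _ in range(50000)]
    variance = sum(d * d for d in draws) / len(draws)
    print("sample variance:", variance, " target 2 kB T zeta / dt:", 2.0e4)
```

The factor 6 follows from the statistics of the uniform distribution: a variable uniform on [-a, a] has variance a²/3, so each component has variance 2 kB T ζ / Δt, the value that balances the viscous drag. A Gaussian with the same variance would serve equally well; the uniform distribution is simply cheaper to sample.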

Interaction potential

Biological molecules interact through a vast array of forces, such as excluded-volume, electrostatic and hydrophobic interactions; a detailed representation of each of those forces would be a challenging computational problem in itself. As a first approximation, we represent here only excluded volume interactions by means of a truncated Lennard-Jones potential: unless a binding event occurs, particles coming into contact (both free and bound to fibers) interact as soft repulsive spheres. This interaction is captured by the term $\vec{f}_{ij}$ in equation (2) and is modeled as the repulsive part of a Lennard-Jones potential [30]:

$$\vec{f}_{ij} = -\nabla U(r_{ij}), \qquad U(r_{ij}) = \begin{cases} 4\varepsilon\left[\left(\dfrac{\sigma}{r_{ij}}\right)^{12} - \left(\dfrac{\sigma}{r_{ij}}\right)^{6}\right] + \varepsilon, & 0 < r_{ij} \le r_m \\[2ex] 0, & r_{ij} > r_m \end{cases}$$

where $r_{ij}$ is the distance between particles i and j, $\sigma$ is the characteristic length scale for the Lennard-Jones potential and $r_m = 2^{1/6}\sigma$ is the potential minimum. The diameter of these soft spheres should rigorously be defined as the separation distance below which interaction forces are present: $r_m$. For the sake of simplicity, in all following considerations we will define the diameter of these particles as $\sigma$, a separation distance shorter than $r_m$, at which repulsion forces are not negligibly small. The drag coefficient on each particle is therefore $\zeta = 3\pi\eta\sigma$.

The nucleation of a β-sheet (or the subsequent addition of monomers to a formed fiber) is represented in our model as permanent binding of two monomers. As described in the previous section, when two particles come into contact for a time interval $\Delta t$, they can bind with probability $P = 1 - \exp(-K_n \Delta t)$, where $K_n$ is the nucleation rate constant. The potential energy profile for bound molecules is obtained by adding an attractive part to their existing repulsion [31]. This attractive component of the potential is the reflection of the repulsive Lennard-Jones potential about $r_m$. For bound peptides k and l, for example:

$$U_b(r_{kl}) = \begin{cases} U_{L\text{-}J}(r_{kl}), & 0 < r_{kl} \le r_m \\[1ex] U_{L\text{-}J}(2 r_m - r_{kl}), & r_m < r_{kl} < 2 r_m \end{cases}$$

This bond potential allows for a maximum inter-particle distance $2 r_m$ and an equilibrium distance $r_m$, and does not require any additional parameters.
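The two potentials described above can be sketched in a few lines. The sketch assumes a truncated Lennard-Jones potential shifted by +ε so that it vanishes at r_m, and a bond potential obtained by mirroring the repulsive branch about r_m; both are readings of the text rather than the thesis code, and the function names are hypothetical.

```python
import math

def r_min(sigma=1.0):
    """Location of the Lennard-Jones minimum, r_m = 2**(1/6) * sigma."""
    return 2.0 ** (1.0 / 6.0) * sigma

def lj_repulsive_energy(r, sigma=1.0, epsilon=1.0):
    """Truncated, shifted Lennard-Jones energy: repulsive below r_m, zero beyond."""
    if r >= r_min(sigma):
        return 0.0
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6) + epsilon

def lj_repulsive_force(r, sigma=1.0, epsilon=1.0):
    """Radial force -dU/dr for the energy above (positive means repulsive)."""
    if r >= r_min(sigma):
        return 0.0
    s6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r

def bond_energy(r, sigma=1.0, epsilon=1.0):
    """Energy of a bound pair: the repulsive branch mirrored about r_m (assumed form)."""
    rm = r_min(sigma)
    if r <= rm:
        return lj_repulsive_energy(r, sigma, epsilon)
    if r < 2.0 * rm:
        return lj_repulsive_energy(2.0 * rm - r, sigma, epsilon)
    return math.inf  # separations of 2 * r_m or more are not reachable


if __name__ == "__main__":
    rm = r_min()
    for r in (0.9, 1.0, rm, 1.2, 1.3):
        print(f"r = {r:.3f}  U = {lj_repulsive_energy(r):8.4f}  U_b = {bond_energy(r):8.4f}")
```

With this construction the bonded pair sits in a symmetric well centered on r_m whose walls diverge at zero separation and at 2 r_m, which is why no extra bond parameter is needed.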

Dimensional analysis

The fundamental parameters governing the behavior of a system can often be found by writing its equations of motion in non-dimensional form. The first step is to assume a set of fundamental units of measurement. Since diffusion plays a central role in the process, it is convenient to choose units related to this time scale. The choice $\sigma$, $k_B T$ and $\zeta$ was the most convenient. The unit of time then becomes $\zeta\sigma^2 / k_B T$, and a dimensionless time and position vector can be defined as follows:

$$t^* = t\,\frac{k_B T}{\zeta \sigma^2}, \qquad \vec{r}^{\,*} = \frac{\vec{r}}{\sigma}$$

By substituting $t$ and $\vec{r}$ as obtained from the above relations into the Langevin equation, one can write:

$$\frac{m\,(k_B T)^2}{\zeta^2 \sigma^3}\,\frac{d^2 \vec{r}_i^{\,*}}{dt^{*2}} = \sum_{j \neq i} \vec{f}_{ij} - \frac{k_B T}{\sigma}\,\frac{d\vec{r}_i^{\,*}}{dt^*} + \vec{d}_i(t)$$

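The fluctuating force $\vec{d}_i(t)$, although having zero average value, scales as follows:

$$d(t) \sim \sqrt{\frac{6 k_B T \zeta}{\Delta t}} = \sqrt{\frac{6 k_B T \zeta}{\Delta t^*}\cdot\frac{k_B T}{\zeta \sigma^2}} = \frac{k_B T}{\sigma}\sqrt{\frac{6}{\Delta t^*}}$$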

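The interaction force $\vec{f}_{ij}$ between particles can also be rewritten as:

$$\vec{f}_{ij} = -\nabla U(r_{ij}) = \frac{48\,\varepsilon}{\sigma}\left[\left(\frac{1}{r_{ij}^*}\right)^{14} - \frac{1}{2}\left(\frac{1}{r_{ij}^*}\right)^{8}\right]\vec{r}_{ij}^{\,*}$$

Substituting these expressions into equation (2) and rearranging terms, one obtains a non-dimensional form of the equations of motion governing this system:

$$\gamma\,\frac{d^2 \vec{r}_i^{\,*}}{dt^{*2}} = \sum_{j \neq i} \vec{f}_{ij}^{\,*} - \frac{d\vec{r}_i^{\,*}}{dt^*} + \vec{d}_i^{\,*}(t^*) \qquad (3)$$

where:

$$\vec{f}_{ij}^{\,*} = 48\,\lambda\left[\left(\frac{1}{r_{ij}^*}\right)^{14} - \frac{1}{2}\left(\frac{1}{r_{ij}^*}\right)^{8}\right]\vec{r}_{ij}^{\,*}, \qquad d^*(t^*) \sim \sqrt{\frac{6}{\Delta t^*}}, \qquad \lambda = \frac{\varepsilon}{k_B T} \quad \text{and} \quad \gamma = \frac{m\,k_B T}{\zeta^2 \sigma^2}$$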

Apart from the parameter $\lambda$, capturing the ratio between the energy of the Lennard-Jones repulsion and thermal energy, the most important dimensionless parameter governing this system is $\gamma$. Substituting values from experimental conditions,

$$m \approx 1\ \mathrm{kDa} = 1.661 \times 10^{-24}\ \mathrm{kg}$$

$$\sigma \approx 2\ \mathrm{nm} = 2 \times 10^{-9}\ \mathrm{m}$$

$$k_B T = 1.381 \times 10^{-23}\ \mathrm{J/K} \times 300\ \mathrm{K} = 4.142 \times 10^{-21}\ \mathrm{J}$$

$$\zeta = 3\pi\eta\sigma = 3\pi \times 10^{-3}\ \mathrm{N\,s/m^2} \times 2 \times 10^{-9}\ \mathrm{m} = 1.885 \times 10^{-11}\ \mathrm{kg/s}$$

one finds $\gamma \approx 5.0 \times 10^{-6}$. This is a first indication that, for these peptides in aqueous solution, inertia might be neglected from the equation. To be formally correct, this conclusion needs to be based on a comparison of magnitude of the various terms in the equation. Substituting real numbers one finds that inertia is negligible when compared to viscous and Brownian forces, as long as the time scale of interest is greater than about $10^{-13}$ s; since our unit of time is approximately $2 \times 10^{-8}$ s, inertia can indeed be neglected. The non-dimensional form of Langevin's equation revealed that the parameters $m$, $k_B T$, $\zeta$, $\sigma$ influence the evolution of the system in the combined form of $\gamma$. Moreover, the above considerations suggest that $\gamma$ should not be included in the analysis. The equations of motion governing our system then become:

$$\frac{d\vec{r}_i^{\,*}}{dt^*} = \sum_{j \neq i} \vec{f}_{ij}^{\,*} + \vec{d}_i^{\,*} \qquad (4)$$

and relation (1) takes the simplified non-dimensional form:

$$\frac{\langle L \rangle}{\sigma} = \Phi\left(t^*, Da_n, \frac{\langle b \rangle}{\sigma}, \frac{B_s}{k_B T}, \lambda, \theta\right)$$

where

$$Da_n = \frac{1}{K_n}\,\frac{k_B T}{\zeta \sigma^2}$$

is the Damköhler number based on nucleation: the ratio of reaction (nucleation) time to diffusion time. Since we have assumed that elongation is much faster than diffusion, the Damköhler number based on elongation does not enter into our analysis.
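The numerical estimates in this section can be reproduced with a short script. It is a sketch under stated assumptions: the viscosity of water is taken as 1e-3 Pa s, the inertia parameter is taken to be m kB T / (ζ² σ²), and the nucleation Damköhler number is taken as the ratio of 1/Kn to the diffusive time unit. Variable and function names are illustrative.

```python
import math

K_B = 1.381e-23    # Boltzmann constant, J/K
T = 300.0          # temperature, K
ETA = 1.0e-3       # viscosity of water, Pa s (assumed value)
SIGMA = 2.0e-9     # particle diameter, m
MASS = 1.661e-24   # 1 kDa expressed in kg

k_b_t = K_B * T                              # thermal energy, J
zeta = 3.0 * math.pi * ETA * SIGMA           # drag coefficient, kg/s
time_unit = zeta * SIGMA ** 2 / k_b_t        # diffusive unit of time, s
gamma = MASS * k_b_t / (zeta * SIGMA) ** 2   # assumed form of the inertia parameter
inertial_time = MASS / zeta                  # velocity relaxation time, s

def damkohler_nucleation(k_n):
    """Nucleation time 1 / k_n divided by the diffusive time unit (assumed definition)."""
    return 1.0 / (k_n * time_unit)

print(f"kB T          = {k_b_t:.3e} J")
print(f"zeta          = {zeta:.3e} kg/s")
print(f"time unit     = {time_unit:.2e} s")
print(f"gamma         = {gamma:.2e}")
print(f"inertial time = {inertial_time:.1e} s")
```

Written this way, the inertia parameter is simply the inertial relaxation time m/ζ divided by the diffusive time unit, which makes it clear why a very small value justifies dropping the acceleration term.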

Bending stiffness is imposed on the fibers by applying restoring torques to make any three consecutive monomers collinear. Preliminary simulations showed that chain stiffness has negligible influence on fiber length distribution in the conditions studied here; for this reason, we did not conduct a thorough investigation of this issue. Other researchers have found a more pronounced effect at high concentrations, by means of two-dimensional on-lattice simulations [32].

The solid angle θ at which monomers are allowed to bind plays an important role in the process. To see this, consider the extreme case where θ is close to zero. In this situation monomers would bind to fibers only when perfectly aligned with the fiber axis: a very unlikely event. This would lead to the formation of many dimers and very little subsequent elongation: an effect similar to decreasing the value of the elongation rate constant $K_e$. Since we have already assumed that particles should bind immediately upon contact with a fiber end, restricting θ would be inconsistent with that assumption. We therefore used a value θ = 90°: the largest compatible with code stability. Larger values of this angle were tested, but resulted in very strong restoring torques (required to impose bending stiffness) leading to code instability.
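The angular binding criterion can be illustrated with a simple geometric test. This sketch assumes the fiber axis at a fiber end points from the second-to-last monomer to the last one, and that a free monomer may bind when the vector from the end monomer to it makes an angle smaller than θ with that axis; the function name and this geometric reading are hypothetical.

```python
import math

def within_binding_cone(prev_pos, end_pos, monomer_pos, theta_deg=90.0):
    """Return True if monomer_pos lies within theta_deg of the fiber axis at end_pos.

    The local axis is assumed to point from prev_pos (second-to-last monomer)
    to end_pos (last monomer of the fiber).
    """
    axis = [e - p for e, p in zip(end_pos, prev_pos)]
    to_monomer = [m - e for m, e in zip(monomer_pos, end_pos)]
    norm = math.hypot(*axis) * math.hypot(*to_monomer)
    if norm == 0.0:
        return False  # degenerate geometry: no direction defined
    cos_angle = sum(a * b for a, b in zip(axis, to_monomer)) / norm
    return cos_angle >= math.cos(math.radians(theta_deg))


if __name__ == "__main__":
    prev, end = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(within_binding_cone(prev, end, (2.0, 0.0, 0.0)))        # straight ahead
    print(within_binding_cone(prev, end, (0.5, 1.0, 0.0)))        # behind the end
    print(within_binding_cone(prev, end, (2.0, 0.5, 0.0), 10.0))  # narrow cone
```

With θ = 90° the test accepts any monomer in the half-space ahead of the fiber end, which is close to the "bind immediately upon contact" assumption; shrinking θ rejects most contacts and mimics a smaller effective elongation rate.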

The parameter $\Lambda$ is the ratio between the characteristic energy of the Lennard-Jones potential and thermal energy. This would be a relevant parameter if we were using the full interaction potential; since we are using only the repulsive part to represent excluded volume interactions, this parameter is not expected to have an effect.

Given the above considerations, the final form of relation (1) is as follows:

$$\left\langle \frac{L}{\sigma} \right\rangle = \Phi\left(t^*,\ Da_n,\ \left\langle \frac{b}{\sigma} \right\rangle\right).$$

Simulation results will be presented in this framework.

Integration

Equations (4) were integrated using the Euler forward method [33]:

$$\mathbf{r}_i^*(t^* + \Delta t^*) = \mathbf{r}_i^*(t^*) + \Delta t^* \sum_j \mathbf{F}_j^*(t^*)$$

where the sum runs over the dimensionless forces acting on particle $i$. The time step size used for all the simulations presented here was $\Delta t^* = 0.0001$. This value was about an order of magnitude smaller than the largest stable time step and was chosen after testing for convergence; it is also in line with values recommended in the literature on Brownian dynamics algorithms [34].
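A minimal sketch of one forward-Euler step for inertia-free Langevin dynamics is given below. It is an illustration under stated assumptions rather than the code used in this work: the names `euler_step` and `rng` are hypothetical, units are chosen so that the dimensionless diffusion coefficient is 1, and the Brownian term is written explicitly as a Gaussian displacement of variance $2\Delta t^*$ per axis, which is the standard choice that reproduces $\langle r^{*2} \rangle = 6t^*$.

```python
import math
import random


def euler_step(positions, forces, dt, rng):
    """Advance every particle by one forward-Euler step of overdamped Langevin dynamics.

    positions, forces: lists of [x, y, z] in dimensionless units (D* = 1).
    The deterministic drift is force * dt; the Brownian displacement is
    Gaussian with variance 2 * dt along each axis.
    """
    amp = math.sqrt(2.0 * dt)
    new_positions = []
    for r, f in zip(positions, forces):
        new_positions.append(
            [r[k] + f[k] * dt + amp * rng.gauss(0.0, 1.0) for k in range(3)]
        )
    return new_positions


rng = random.Random(42)          # fixed seed so the run is repeatable
pos = [[0.0, 0.0, 0.0]]
frc = [[1.0, 0.0, 0.0]]          # assumed constant force, for illustration only
for _ in range(1000):
    pos = euler_step(pos, frc, 1e-4, rng)
print(pos)
```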

Initial and boundary conditions

At the beginning of each simulation monomers were assigned to a regular cubic lattice. To ensure random initial conditions, nucleation events were allowed to occur only after the particles had diffused randomly for a distance corresponding to half their initial separation. To mimic the behavior of an infinite system, periodic boundary conditions were enforced: any particle exiting the simulation box re-enters from the opposite side. Simulations were stopped when fewer than 0.5% of the initial monomers were still unbound. This condition was chosen to save computation time, as the time required for the last few monomers to bind to existing fibers was highly variable and could take a substantial fraction of the whole simulation, without adding much new information.
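The three ingredients described above (lattice start, periodic wrapping and the stopping rule) can be sketched as follows. All function names are hypothetical and the box is assumed to be cubic with one corner at the origin.

```python
def cubic_lattice(n_side, spacing):
    """Place n_side**3 monomers on a regular cubic lattice with the given spacing."""
    return [
        [(i + 0.5) * spacing, (j + 0.5) * spacing, (k + 0.5) * spacing]
        for i in range(n_side)
        for j in range(n_side)
        for k in range(n_side)
    ]


def wrap(position, box):
    """Periodic boundaries: a particle leaving the box re-enters from the opposite side."""
    return [x % box for x in position]


def finished(n_unbound, n_total, threshold=0.005):
    """Stop once fewer than 0.5% of the initial monomers remain unbound."""
    return n_unbound < threshold * n_total


monomers = cubic_lattice(20, 2.0)       # 20^3 = 8000 particles, spacing of 2 diameters
box = 20 * 2.0
print(len(monomers))
print(wrap([-0.5, 40.5, 3.0], box))
print(finished(39, 8000), finished(41, 8000))
```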

Testing the algorithm

Unlike molecular dynamics, where there are conserved quantities that can be used to check the validity of the integration scheme, in Brownian dynamics the most important test is to check the behavior of the mean-square displacement of the particles as a function of time [34].

By rewriting the diffusion law

$$\langle r^2(t) \rangle = 6Dt$$

(where $D = k_BT/\zeta$ is the diffusion coefficient) using our chosen set of units we obtain

$$\langle r^{*2}(t^*) \rangle = 6t^*.$$

Figure 4-2 shows the result of this test. In the case of particles interacting with one another, the coefficient of self-diffusion is reduced.
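The mean-square-displacement check for non-interacting particles can be reproduced with a few lines of code. The sketch below is an independent illustration under the same assumptions as the dimensionless units above ($D^* = 1$, Gaussian steps of variance $2\Delta t^*$ per axis); the function name and the particle and step counts are arbitrary choices.

```python
import math
import random


def mean_square_displacement(n_particles, n_steps, dt, seed=0):
    """Mean-square displacement of free Brownian particles after n_steps of size dt."""
    rng = random.Random(seed)
    amp = math.sqrt(2.0 * dt)
    total = 0.0
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            x += amp * rng.gauss(0.0, 1.0)
            y += amp * rng.gauss(0.0, 1.0)
            z += amp * rng.gauss(0.0, 1.0)
        total += x * x + y * y + z * z
    return total / n_particles


t_star = 100 * 0.01                      # total dimensionless time
msd = mean_square_displacement(2000, 100, 0.01)
print("measured:", msd, "expected:", 6.0 * t_star)
```

The measured value should scatter around $6t^*$, with a statistical error that shrinks as the number of particles grows.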

[Figure 4-2 plot: "Diffusion of Brownian Particles" ($N = 8000$, $\langle b/\sigma \rangle = 2.0$, $\Delta t^* = 10^{-4}$). Mean-square displacement versus dimensionless time $k_BTt/(\zeta\sigma^2)$. Curves: theoretical MSD; no excluded-volume interactions; Lennard-Jones potential.]

Figure 4-2. Testing the Brownian dynamics algorithm with diffusing particles (no fiber formation). As expected, interacting particles at high concentration have a reduced self-diffusion coefficient.

System size

An important choice in molecular dynamics simulations is the appropriate size of the system: it should be large enough to prevent interaction of fibers with themselves (due to periodic boundary conditions) and small enough to allow acceptable computation time. To make this choice, a test was performed in the worst possible scenario, i.e., very short inter-particle distance ($\langle b/\sigma \rangle = 1.5$, to maximize interaction) and conditions that promote the formation of very long fibers ($Da_n = 100$). The results of this test are shown in Figure 4-3.

[Figure 4-3 plot: "Test for Size Effects" ($\langle b/\sigma \rangle = 1.5$, $\Delta t^* = 10^{-4}$, $Da = 100$, average of 4 runs). Average fiber length versus number of particles in the simulation box (logarithmic axis).]

Figure 4-3. Testing the simulation algorithm for size effects. Error bars are standard deviations between different runs. As expected, standard deviation decreases with system size: a larger number of particles in the simulation yields more reliable results.

The graph shows how increasing system size leads to less variation among different runs and therefore to more reliable results. Since the simulation time for the largest size tested ($N = 27{,}000$) was approximately 1 day per run on a 1.6 GHz CPU, we chose to run our simulations with $N = 8{,}000$ particles. At this size and in the worst possible scenario the standard deviation in fiber length among different runs was approximately 5%. A similar test was also conducted in less extreme conditions ($\langle b/\sigma \rangle = 2.0$, $Da = 10$, Figure 4-4).

[Figure 4-4 plot: "Test for Size Effects" ($\langle b/\sigma \rangle = 2.0$, $\Delta t^* = 10^{-4}$, $Da = 10$, average of 4 runs). Average fiber length versus number of particles in the simulation box (logarithmic axis).]

Figure 4-4. Testing the simulation algorithm for size effects. This test was conducted in conditions similar to the typical simulation experiment. Error bars refer to variations among different runs in the same conditions.

In this case, which corresponds to the typical simulation presented here, the standard deviation among different runs was less than 3%.

4.4 Results and discussion

Figure 4-5 shows a snapshot of a typical simulation at the end of the run, when all monomers have assembled into fibers. As explained earlier, fibers interact through a Lennard-Jones potential truncated at its minimum, resulting in a soft, purely repulsive interaction. The fibers shown in Figure 4-5 are therefore entangled, not cross-linked.
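A purely repulsive interaction of this kind can be sketched as follows. The sketch assumes the common convention of truncating the Lennard-Jones potential at its minimum, $r = 2^{1/6}\sigma$, and shifting it upward by $\varepsilon$ so that the energy goes continuously to zero at the cutoff; the shift and the function names are assumptions, since the text specifies only the truncation. Distances are in units of $\sigma$.

```python
R_CUT = 2.0 ** (1.0 / 6.0)  # location of the Lennard-Jones minimum, in units of sigma


def repulsive_energy(r, epsilon=1.0):
    """Truncated and shifted Lennard-Jones energy; zero at and beyond the cutoff."""
    if r >= R_CUT:
        return 0.0
    inv6 = 1.0 / r**6
    return 4.0 * epsilon * (inv6 * inv6 - inv6) + epsilon


def repulsive_force(r, epsilon=1.0):
    """Radial force magnitude, -dU/dr (positive means the particles repel)."""
    if r >= R_CUT:
        return 0.0
    inv6 = 1.0 / r**6
    return 24.0 * epsilon * (2.0 * inv6 * inv6 - inv6) / r


for r in (0.9, 1.0, 1.1, 1.2):
    print(r, repulsive_energy(r), repulsive_force(r))
```

Because the force vanishes at the cutoff and is never attractive, particles and fibers interacting through it can entangle but cannot stick to one another.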


Figure 4-5. Snapshot of the typical simulation. The simulation box contained 8,000 particles and the average intermolecular distance at initial conditions was 4.5 molecular diameters. In this particular case Da = 10.

The simulation conditions depicted in the figure correspond approximately to a gel formed by a 10 mg/mL peptide solution. This case corresponds to the matrix visualized by TEM in Chapter 1 (Figure 1-2), but not to the experimental conditions explored in this thesis (Chapter 2). The reason for simulating more concentrated systems was to allow feasible computation times.

Kinetics of fiber growth

The evolution of average fiber length over time showed two different regimes. When the Damköhler number based on nucleation was much smaller than unity (particles react quickly upon contact), the average fiber length increased quickly and monotonically until reaching a plateau (Figure 4-6, triangles).

[Figure 4-6 plot: "Kinetics of Fiber Growth" ($N = 8000$, $\langle b/\sigma \rangle = 2.5$, average of 4 runs). Average fiber length versus dimensionless time $k_BTt/(\zeta\sigma^2)$, with one curve for each of three values of $Da$.]

Figure 4-6. Time evolution of average fiber length. The simulation box contained 8,000 particles and each data point is the average of 4 different simulation runs in the same conditions. When nucleation time is longer than the typical diffusion time ($Da > 1$), curves exhibit the typical sigmoidal behavior of processes of nucleation and growth. Error bars are standard deviations among different runs. Here $Da$ refers to the Damköhler number based on nucleation.

In the opposite case, when nucleation times were longer than diffusion times, the average fiber length showed an initial phase of slow growth, followed by a fast increase before reaching a plateau. The sigmoidal shape of this second type of behavior (Figure 4-6, circles) is typical of processes of nucleation and growth: when nucleation time is very long, monomers spend a long time diffusing before they can nucleate a fiber (lag time). Once a few fibers are nucleated, most of the monomers in the system add themselves to the formed seeds, resulting in a fast increase of average fiber length. The average length reaches a plateau when most of the monomers are consumed. Finally, from Figure 4-6 it is apparent that higher values of the Damköhler number based on nucleation result in longer fibers: when the characteristic time for fiber nucleation becomes much longer than the typical diffusion time, fewer fibers are nucleated, onto which the majority of remaining monomers will bind: the result is a greater average fiber length at equilibrium. In other words, molecules for which fiber nucleation is difficult will tend to generate longer fibers.

Figure 4-7 shows the evolution of fiber length distribution for a simulation in which $Da = 0.1$.

[Figure 4-7 plot: "Evolution of Fiber Length Distribution" ($N = 8000$, $Da = 0.1$, $\langle b/\sigma \rangle = 2.0$). Distribution of fiber lengths $L/\sigma$ on a logarithmic vertical axis, with curves for $t^* = 0.5$, $t^* = 2.0$ and $t^* = 4.0$.]

Figure 4-7. Evolution of fiber length distribution over time. This graph shows the distribution of fiber sizes over time from a typical simulation. In this case $Da = 0.1$: the typical nucleation time is shorter than the typical diffusion time; nucleation is therefore relatively easy and many short fibers are formed. These curves refer to the state of the system close to initial conditions ($t^* = 0.5$), at mid-life ($t^* = 2.0$) and close to the final state ($t^* = 4.0$).

In this case the system reached equilibrium quickly (approximately four times the time required for a particle to diffuse by one diameter). Since nucleation time was much shorter than diffusion time (easy nucleation), the final configuration was characterized by many short fibers.
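Given the list of fiber lengths at one instant, a distribution of the kind discussed above and the corresponding number-average length can be computed as in the sketch below. The function names and the sample data are hypothetical.

```python
from collections import Counter


def length_distribution(fiber_lengths):
    """Map each fiber length (monomers per fiber) to the number of fibers of that length."""
    return dict(sorted(Counter(fiber_lengths).items()))


def average_length(fiber_lengths):
    """Number-average fiber length: total bound monomers divided by number of fibers."""
    return sum(fiber_lengths) / len(fiber_lengths)


# Made-up snapshot dominated by dimers and trimers, as expected for easy nucleation
snapshot = [2, 2, 3, 5, 2, 3]
print(length_distribution(snapshot))
print(average_length(snapshot))
```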

Average fiber length

Extensive simulations were performed to explore the shape of the unknown function

$$\left\langle \frac{L}{\sigma} \right\rangle_\infty = \Phi\left(Da_n,\ \left\langle \frac{b}{\sigma} \right\rangle\right)$$

by systematic variation of the independent variables. Figure 4-8 reports the results of this investigation: each point on the surface is the average of four different simulation runs in the same conditions, where the simulation box contained 8,000 particles. These results refer to the system at equilibrium, when all monomers have self-assembled into fibers.

[Figure 4-8 plot: "Average Fiber Length". Surface of average fiber length (logarithmic vertical axis) as a function of $\langle b/\sigma \rangle$ and $Da$.]

Figure 4-8. Average fiber length at equilibrium, when all monomers have self-assembled into fibers. The simulation box contained 8,000 particles and each data point is the average of 4 different simulation runs in the same conditions. Maximum standard deviations were approximately 5%.

Two regimes can be distinguished in this graph: when nucleation time is comparable to or less than diffusion time ($Da \lesssim 1$), only short fibers are formed. Essentially, monomers bind almost immediately upon contact with any other monomer, so fibers contain only a few monomers. When the nucleation time becomes larger than diffusion time ($Da > 1$), the average fiber length increases, as explained in the previous section.

[Figure 4-9 plot: "Average Fiber Length". Surface of average fiber length (linear vertical axis) as a function of $\langle b/\sigma \rangle$ and $Da$.]

Figure 4-9. Average fiber length at the end of self-assembly (same results as in Figure 4-8, with the z-axis plotted using a linear scale). The simulation box contained 8,000 particles and each data point is the average of 4 different simulation runs in the same conditions. Maximum standard deviations were approximately 5%.

When the data are plotted on a linear scale (along the z-axis), an interesting finding becomes apparent (Figure 4-9), which could not have been predicted from a mean-field approximation: molecular crowding appears to induce the formation of longer fibers. The effect is most pronounced when the average inter-molecular distance becomes comparable to the diameter of the molecules. In conditions of molecular crowding, where particle diffusion is mostly restricted, the final average fiber length was found to increase. It is possible that, after the first fibers are nucleated, crowding conditions prevent monomers from finding a partner: fewer fibers are therefore nucleated, resulting in a larger average length. Another possibility could be fiber alignment: at high concentration fibers are likely to orient themselves parallel to one another in order to reduce the overall conformational entropy. Such alignment would result in a gathering of fiber ends in common areas, where monomers would be more likely to bind. A similar result has also been observed in the context of living polymer simulations [21] and in recent work on amyloid formation. In the latter case, molecular crowding imposed by adding inert polymers was found to accelerate fibrillization of the peptide responsible for Parkinson's disease [35]. Unfortunately the authors did not investigate fiber length specifically, which would have been useful for a comparison with our simulations.

Time to equilibrium

To complement the results presented above, completion times for self-assembly are shown in Figure 4-10. The dependence of completion time on Damköhler number and average inter-molecular distance seems to follow a power law. In particular, at short intermolecular distances completion time shows a power-law dependence on Damköhler number with exponent approximately 1/2.

[Figure 4-10 plot: "Time to Reach Equilibrium". Surface of completion time (logarithmic vertical axis) as a function of $\langle b/\sigma \rangle$ and $Da$.]

Figure 4-10. Time required for self-assembly to reach the final stage (~0.2% of the initial monomers still unbound). The simulation box contained 8,000 particles and each data point is the average of 4 different simulation runs in the same conditions. Maximum standard deviations were approximately 10%.

The following scaling argument may help clarify this finding. From the second equation in system (0) (Chapter 3),

$$\frac{dF}{dt} = \frac{1}{2} K_n M^2$$

one can derive the scaling law

$$F_\infty \sim K_n M_0^2\, t_\infty$$

where $M_0$ is the initial monomer concentration and the infinity subscript indicates values at equilibrium. The average number of fibers can also be expressed as the ratio of $M_0$ to the final average fiber length:

$$F_\infty = \frac{M_0}{\langle L \rangle_\infty}$$

Substituting this expression in the previous one and using relation (4) from Chapter 3, one can write:

$$t_\infty \sim \frac{1}{M_0 \sqrt{K_e K_n}}$$

The Damköhler number based on nucleation contains the factor $1/K_n$ (the nucleation rate constant is the reciprocal of the typical nucleation time), so the above scaling relation is consistent with the exponent found from simulations.

Comparison with mean-field predictions

At this point it is important to compare simulation results to the analysis developed in Chapter 3. For the purpose of this comparison, a version of the code was developed that allowed explicit inclusion of the fiber elongation rate constant $K_e$. The Damköhler number based on nucleation was kept constant at 0.1, while the one based on elongation was allowed to vary. Figure 4-11 shows the results of these experiments.
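For orientation, a mean-field (well-mixed) prediction of this kind can be obtained by integrating simple rate equations. The sketch below is not the Chapter 3 model itself, which is not reproduced in this section: it assumes the simplest nucleation and elongation kinetics, $dF/dt = \tfrac{1}{2}K_nM^2$ and $dM/dt = -K_nM^2 - K_eMF$, with concentrations scaled so that $M_0 = 1$. The function name, the time step and the stopping fraction are illustrative choices.

```python
def mean_field_length(K_n, K_e, M0=1.0, dt=1e-3, unbound_fraction=0.005):
    """Average fiber length from assumed well-mixed nucleation/elongation kinetics.

    M: free monomer concentration, F: fiber concentration.
    Each nucleation event consumes two monomers and creates one fiber;
    each elongation event consumes one monomer.
    """
    M, F = M0, 0.0
    while M > unbound_fraction * M0:
        nucleation = K_n * M * M      # monomers consumed by nucleation per unit time
        elongation = K_e * M * F      # monomers consumed by elongation per unit time
        M -= (nucleation + elongation) * dt
        F += 0.5 * nucleation * dt
    return (M0 - M) / F


for ratio in (10.0, 100.0, 1000.0):
    print("K_e/K_n =", ratio, " average length =", mean_field_length(1.0, ratio))
```

In this assumed model the predicted length grows with $K_e/K_n$ and does not depend on how monomers and fibers are arranged in space, which is exactly the ideal-mixing hypothesis that the simulations relax.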

[Figure 4-11 plot: "Comparison with Mean-Field Prediction" ($N = 8000$, $\langle b/\sigma \rangle = 2.5$). Average fiber length versus $K_e/K_n$, both on logarithmic axes. Series: Mean-Field; Simulation.]

Figure 4-11. Comparison of simulation results with the mean-field approximation developed in Chapter 3. The simulation box was 50 molecular diameters in size and the average molecular distance was 2.5 diameters. Bars appearing along with solid circles are standard deviations of fiber length distribution.

The mean-field approximation predicts longer fibers than are found from simulations. This is not unexpected, as that approach is based on the assumption that fiber nucleation and growth are reaction limited. In other words, diffusion times are supposed to be very short compared to reaction times and monomers have instantaneous access to any other monomer or fiber in the system. The simulations developed here, on the other hand, are applicable to situations in which reaction times for fiber elongation are comparable to or shorter than diffusion times. In such cases the hypothesis of ideal mixing is no longer valid: once a fiber is nucleated from two free monomers, it will probably capture all surrounding molecules and will create a depletion zone around it. This situation corresponds in practice to conditions of fast gelation: when electrolytes are added to the solution (or the pH is changed) to induce fast assembly, reaction times are greatly reduced, becoming comparable to diffusion times. This is likely to induce non-homogeneities in the system that can affect the final fiber length distribution. The following scaling argument can be used to describe such situations.

The diameter of the typical depletion zone, $R_\infty$, can be assumed to scale approximately as the average distance traveled by a monomer before binding to a fiber:

$$R_\infty \sim \sqrt{D\,t_\infty},$$

where $D$ is the diffusion coefficient of a molecule and $t_\infty$ the typical time for the system to reach equilibrium. The final average fiber length can then be equated to the number of monomers originally present in a depletion sphere, as they would all be absorbed by the initial seed:

$$\langle L \rangle_\infty \sim M_0 R_\infty^3.$$

Average fiber length can also be expressed as the ratio of the initial number of monomers to the final number of fibers. Borrowing a result from the previous section, we can then write:

$$\langle L \rangle_\infty = \frac{M_0}{F_\infty} \sim \frac{1}{t_\infty K_n M_0}.$$

Finally, by using the three above expressions, it is possible to deduce:

$$\langle L \rangle_\infty \sim \left(\frac{D}{K_n}\right)^{3/5} \left(\frac{1}{M_0}\right)^{1/5}.$$

Notice how the elongation rate constant is absent from this expression: in this model the initial nucleation events will completely determine the size of the average fiber. The ratio of equation (4) in Chapter 3 to the expression above yields a comparison between mean-field and depletion-zone predictions:

$$\frac{\langle L \rangle_{MF}}{\langle L \rangle_{DZ}} \sim \frac{M_0^{1/5}\, K_e^{1/2}\, K_n^{1/10}}{D^{3/5}}$$

If $K_n$ is kept fixed, for example, the discrepancy between the two predictions widens as $K_e$ increases: this result is in line with what is observed in Figure 4-11. In summary: when reaction times for fiber elongation are comparable to or shorter than diffusion times, monomers do not explore the whole system before binding to a fiber. While in conditions of ideal mixing a monomer has access to any other monomer or fiber indifferently, in the present case (due to depletion zones created around fiber seeds) monomers have easier access to other monomers than to fibers. This results in a larger number of nucleated fibers and shorter average length.
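The algebra behind the depletion-zone estimate can be written out step by step. The block below is a reconstruction that uses only the three scaling relations stated in the argument (diffusive size of the depletion zone, number of monomers in a depletion sphere, and the nucleation-limited number of fibers); numerical prefactors are dropped throughout.

```latex
\begin{align*}
R_\infty &\sim (D\,t_\infty)^{1/2}, \qquad
\langle L\rangle_\infty \sim M_0 R_\infty^{3}, \qquad
\langle L\rangle_\infty \sim \frac{1}{t_\infty K_n M_0} \\[4pt]
\Rightarrow\quad t_\infty &\sim \frac{1}{\langle L\rangle_\infty K_n M_0}, \qquad
\langle L\rangle_\infty \sim M_0\, D^{3/2}\,
   \bigl(\langle L\rangle_\infty K_n M_0\bigr)^{-3/2} \\[4pt]
\Rightarrow\quad \langle L\rangle_\infty^{5/2} &\sim
   \left(\frac{D}{K_n}\right)^{3/2} M_0^{-1/2}
\quad\Rightarrow\quad
\langle L\rangle_\infty \sim \left(\frac{D}{K_n}\right)^{3/5} M_0^{-1/5}
\end{align*}
```

The weak $M_0^{-1/5}$ dependence and the absence of $K_e$ both follow from eliminating $t_\infty$ and $R_\infty$ between the three relations.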

4.5 Suggested directions for future research

We believe numerical simulations can contribute significantly to understanding the molecular basis of supramolecular organization. The model presented in this chapter, although very simple as it stands, can be expanded significantly by including a more detailed description of self-assembly. In particular, we wish to suggest two possible avenues of development.

At the molecular level the model can be refined by using a more detailed description of the monomers. Instead of soft spheres, for example, peptides may be represented as a series of beads (amino acids) connected to a common backbone. Moreover, the hydrophobic and hydrophilic character of the amino acids could be represented by appropriate interaction potentials. This would allow the investigation of how molecular shape affects the supramolecular structures formed by self-assembly.

At a macroscopic level, we believe it would be very interesting to investigate the process of network formation. The first step would be to extend the present model by including the attractive part of the Lennard-Jones interaction potential. Depending on temperature, this can result in monomers that aggregate into globular structures (thus facilitating nucleation) and fibers that can attract one another, forming bundles for example. It is likely that temperature and concentration influence the topology of the self-assembled network: it would be interesting to investigate the nature of such dependence.

4.6 Conclusion

In this chapter a computational tool was developed that allows the simulation of peptide self-assembly in conditions where reaction times for fiber elongation become comparable to diffusion times. This corresponds in practice to conditions of fast gelation (induced for example by pH changes or by addition of electrolytes). As assumed in our mean-field description (Chapter 3), this model is based on the hypothesis that nucleation is the rate-limiting step of the process. Once two peptide monomers pack into a stable β-sheet configuration, this nucleus acts as a template upon which addition of surrounding monomers is facilitated. In reality, globular aggregates may promote nucleation: a possibility that was not included in our model for the sake of simplicity. Molecules are modeled as soft repulsive spheres and their motion is described by Langevin's equation with no inertia.

As predicted by mean-field theory, the average fiber length was found to be inversely proportional to the nucleation rate constant: when nucleation time is larger than diffusion time, a smaller fraction of monomers nucleate new fibers and the remaining monomers bind to the formed nuclei, resulting in longer fibers.

The average fiber length found using this simulation model was shorter than the one found from mean-field theory. A possible explanation is as follows: when reaction times are comparable to diffusion times, monomer depletion zones form around fiber seeds, making it more difficult for the remaining monomers to reach the reactive sites. Therefore, unlike the case of ideal mixing, monomers can access other monomers more easily than fiber ends (or nuclei), resulting in nucleation of more fibers and a shorter average length. Self-assembly in these conditions induces non-homogeneities that alter its course.

At this point it is useful to recapitulate results from the mean-field approach and from simulations into an organized picture. The governing parameters for fiber self-assembly are the Damköhler numbers based on nucleation and elongation:

$$Da_n = \frac{t_{nucleation}}{t_{diffusion}}, \qquad Da_e = \frac{t_{elongation}}{t_{diffusion}}$$

When $Da_e > Da_n$, the time required for fiber elongation is much longer than the typical nucleation time: in other words, nucleation is much easier than elongation. In this case only dimers or very short fibers would be formed. In order to obtain long fibers, it is necessary that $Da_e < Da_n$: molecules for which fiber nucleation is difficult compared to elongation will tend to generate longer fibers. Within this second category of phenomena (self-assembly of long fibers), two broad sets of conditions can apply. On one hand, when the typical reaction time for fiber elongation (and therefore nucleation) is much longer than the typical diffusion time, monomers diffuse throughout the system before binding to a fiber and the mixing is ideal; in this case a mean-field approximation can be used to represent the system. On the other hand, when the typical elongation time is shorter than diffusion time, fiber seeds tend to capture all monomers around them, creating depletion zones; the system is no longer well mixed and a mean-field description would not be appropriate. In this case the computational tool developed in this chapter, along with a simple scaling argument, can be used to predict the behavior of the system.
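The classification above can be condensed into a small decision function. This is a deliberately crude sketch: the function name and the sharp thresholds are assumptions made for illustration, whereas the argument in the text concerns orders of magnitude and the boundaries between regimes are gradual.

```python
def assembly_regime(Da_n, Da_e):
    """Classify self-assembly conditions from the two Damköhler numbers.

    Da_n: nucleation time / diffusion time
    Da_e: elongation time / diffusion time
    """
    if Da_e > Da_n:
        # Nucleation is easier than elongation: dimers or very short fibers
        return "short fibers"
    if Da_e > 1.0:
        # Elongation slower than diffusion: well mixed, mean-field description applies
        return "long fibers, mean-field"
    # Elongation faster than diffusion: depletion zones, simulations needed
    return "long fibers, simulation"


print(assembly_regime(Da_n=10.0, Da_e=100.0))
print(assembly_regime(Da_n=100.0, Da_e=10.0))
print(assembly_regime(Da_n=10.0, Da_e=0.1))
```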

Figure 4-12 summarizes these considerations. The experiments described in Chapter 2 would pertain to the upper portion of this graph. In particular, when molecules are charged, reaction rates are lower than diffusion rates and a mean-field approximation is likely to be applicable. When charges are screened by electrolytes or when molecules are neutral, both $Da_n$ and $Da_e$ should decrease by approximately the same amount; in this case simulations should be used to describe self-assembly.

[Figure 4-12 diagram: map of self-assembly regimes in the plane of $Da_n$ versus $Da_e$, with $O(1)$ marked on both axes and regions labeled "Simulations" and "Mean-Field Theory".]

Figure 4-12. Summary of self-assembly conditions, as governed by the Damköhler numbers based on nucleation and elongation. When $Da_e > Da_n$ monomers find it easier to nucleate a new fiber than to add themselves to an existing one; therefore only short fibers are formed. In the opposite situation two regimes can be identified, depending on the magnitude of $Da_e$. If $Da_e > 1$, elongation times are longer than typical diffusion times: monomers can therefore diffuse throughout the system before binding to a fiber, resulting in a well-mixed solution. In this case a mean-field approximation is applicable. In the opposite case, non-homogeneities are introduced into the system by the self-assembly process itself (depletion zones): simulation tools are most applicable in this case. The conditions investigated experimentally would pertain to the upper portion of this graph. When molecules are charged, as in most of the experiments reported in Chapter 2, reaction rates are lower than diffusion rates and a mean-field approach is likely to be applicable. If charges are screened by electrolytes (or absent due to pH), both $Da_n$ and $Da_e$ are expected to decrease by approximately the same amount; in this case simulations should be used to describe self-assembly.


References for Chapter 4

[29] Serpell, L. C. Alzheimer's amyloid fibrils: structure and assembly. Biochim. Biophys. Acta 1502, 16 (2000).

[30] Koo, E. H., Lansbury, P. T. Jr. and Kelly, J. W. Amyloid diseases: abnormal protein aggregation in neurodegeneration. Proc. Natl. Acad. Sci. USA 96, 9989 (1999).

[31] Lansbury, P. T. Jr. Evolution of amyloid: What normal protein folding may tell us about fibrillogenesis and disease. Proc. Natl. Acad. Sci. USA 96, 3342 (1999).

[32] Dobson, C. M. The structural basis of protein folding and its links with human disease. Phil. Trans. R. Soc. Lond. B 356, 133-145 (2001).

[33] Perutz, M. F., Pope, B. J., Owen, D., Wanker, E. E. and Scherzinger, E. Aggregation of proteins with expanded glutamine and alanine repeats of the glutamine-rich and asparagine-rich domains of Sup35 and of the amyloid β-peptide of amyloid plaques. Proc. Natl. Acad. Sci. USA 99, 5596 (2002).

[34] Finkelstein, A. V. Rate of β-structure formation in polypeptides. PROTEINS: Structure, Function and Genetics 9, 23 (1991).

[35] Muñoz, V. et al. A statistical mechanical model for β-hairpin kinetics. Proc. Natl. Acad. Sci. USA 95, 5872 (1998).

[36] Pande, V. S. et al. Molecular dynamics simulations of unfolding and refolding of a β-hairpin fragment of protein G. Proc. Natl. Acad. Sci. USA 96, 9062 (1999).

[37] Ferrara, P. and Caflisch, A. Folding simulations of a three-stranded antiparallel β-sheet peptide. Proc. Natl. Acad. Sci. USA 97, 10780 (2000).

[38] Aurora, R., Creamer, T. P., Srinivasan, R. and Rose, G. D. Local interactions in protein folding: lessons from the α-helix. J. Biol. Chem. 272, 1413 (1997).

[39] Yang, A. S. and Honig, B. Free energy determinants of secondary structure formation: II. Antiparallel β-sheets. J. Mol. Biol. 252, 366 (1995).

[40] Pauling, L., Corey, R. B. and Branson, H. R. The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37, 205 (1951).

[41] Pauling, L. and Corey, R. B. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37, 729 (1951).

[42] Dinner, A. R., Lazaridis, T. and Karplus, M. Understanding β-hairpin formation. Proc. Natl. Acad. Sci. USA 96, 9068 (1999).

[43] Bitan, G., Kirkitadze, M., Lomakin, A., Vollers, S. S., Benedek, G. B. and Teplow, D. B. Amyloid β-protein (Aβ) assembly: Aβ40 and Aβ42 oligomerize through distinct pathways. Proc. Natl. Acad. Sci. USA 100, 330 (2003).

[44] Roher, A. E., Baudry, J., Chaney, M. O., Kuo, Y. M., Stine, W. B. and Emmerling, M. R. Oligomerization and fibril assembly of the amyloid β-protein. Biochim. Biophys. Acta 1502, 31 (2000).

[45] Hasegawa, K., Ono, K., Yamada, M. and Naiki, H. Kinetic modeling and determination of reaction constants of Alzheimer's β-amyloid fibril extension and dissociation using surface plasmon resonance. Biochemistry 41, 13489 (2002).

[46] Rouault, Y. Off-lattice Brownian dynamics simulation of wormlike micelles: the dependence of the mean contour length on concentration. J. Chem. Phys. 111, 9859 (1999).

[47] Milchev, A., Wittmer, J. P. and Landau, D. P. Dynamical Monte Carlo study of equilibrium polymers: effects of high density and ring formation. Phys. Rev. E 61, 2959 (2000).

[48] Rouault, Y. and Milchev, A. Monte Carlo study of the molecular-weight distribution of living polymers. Phys. Rev. E 55, 2020 (1997).

[49] Rouault, Y. Concentration-induced growth of polymerlike micelles. Phys. Rev. E 58, 6155 (1998).

[50] Menon, G. I. and Pandit, R. Glass formation in a lattice model for living polymers. Phys. Rev. Lett. 75, 4638 (1995).

[51] Agarwal, U. S. and Khakhar, D. V. Simulation of diffusion-limited step-growth polymerization in 2D: effect of shear flow and chain rigidity. J. Chem. Phys. 99, 3067 (1993).

[52] Vogt, M. and Hernandez, R. A two-dimensional polymer growth model. J. Chem. Phys. 115, 1575 (2001).

[53] Vogt, M. and Hernandez, R. A three-dimensional polymer growth model. J. Chem. Phys. 116, 10485 (2002).

[54] Akkermans, R. L. C., Toxvaerd, S. and Briels, W. J. Molecular dynamics of polymer growth. J. Chem. Phys. 109, 2929 (1998).

[55] Damköhler, G. In A. Eucken and M. Jakob's Der Chemie-Ingenieur, Vol. III. Akademische Verlagsgesellschaft, Leipzig (1939).

[56] Reif, F. Fundamentals of statistical and thermal physics. McGraw-Hill (1965).

[57] Doyle, P. S., Shaqfeh, E. S. G. and Gast, A. P. Rheology of "wet" polymer brushes via Brownian dynamics simulation: steady vs. oscillatory shear. Phys. Rev. Lett. 78, 1182 (1997).

[58] Weeks, J. D., Chandler, D. and Anderson, H. J. Chem. Phys. 93, 3515 (1990).

[59] Rapaport, D. C. The art of molecular dynamics simulation. Cambridge University Press (1997).

[60] Rouault, Y. The effect of stiffness in wormlike micelles. Eur. Phys. J. B 6, 75 (1998).

[61] Allen, M. P. and Tildesley, D. J. Computer simulation of liquids. Oxford University Press, Oxford (1993).

[62] Branka, A. C. and Heyes, D. M. Algorithms for Brownian dynamics computer simulations: Multivariable case. Phys. Rev. E 60, 2381 (1999).

[63] Shtilerman, M. D., Ding, T. T. and Lansbury, P. T. Jr. Molecular crowding accelerates fibrillization of α-synuclein: could an increase in the cytoplasmic protein concentration induce Parkinson's disease? Biochemistry 41, 3855 (2002).

CHAPTER 5

CONCLUSIONS

Understanding the molecular basis of self-assembly has enormous potential for applications in engineering. This thesis has investigated the process through which a class of designed peptides self-assembles in aqueous solution to form networks of nanoscale fibers.

Elucidating the relation between molecular structure and supramolecular organization was the primary motivation for this work. The first step in this direction was the characterization of the fibers self-assembled from such peptides and the mechanism of their formation. The results of this investigation were interpreted by means of simple analytical and numerical models of nucleation and growth of β-sheet fibers. As a future research direction, we suggest that these models be refined in synergy with experiments and used as frameworks for asking further questions and for designing peptide-based biomaterials.

Chapter 2 presented the results of experiments aimed at characterizing the self-assembly process through which β-sheet peptides form nanoscale fibers. Using an array of investigation techniques chosen to minimize sample processing, the structure of such fibers and the process of their formation were elucidated with nanometer-scale resolution. The most unexpected result was the complexity of the process leading to the formation of hydrogel matrices: a population of identical peptide molecules in aqueous solution can give rise to a wide variety of coexisting structures evolving over time. In the case of the KFE8 molecule, for example, mature fibers are formed through intermediate steps in which the fiber structure (a helical ribbon) differs markedly from its final form (tubular). A mature peptide hydrogel is a network of entangled fibers, each approximately 10 nm in diameter and microns in length. Small variations in amino acid sequence were found to have substantial effects on the final fiber structure and self-assembly mechanism. The molecule KWE8, for example, obtained by substituting phenylalanine in KFE8 with tryptophan, appeared to form tapes instead of tubules. Even though the resolution achieved in these experiments was higher than what had been obtained before, it was not enough to definitively resolve the molecular packing architecture of such fibers. Instead, a possible packing model was proposed, using molecular dynamics simulations, by selecting the most stable structure compatible with experimental dimensions. In order to learn more about the process of nucleation and growth of β-sheet fibers, we believe it is necessary to operate in a very clean and simple setting. In our opinion, investigating the process at low peptide concentration on a two-dimensional surface (as in some of the experiments described in Chapter 2) is probably the best setting to achieve this goal. In particular, it should be possible to apply the analytical and numerical tools developed here to design such experiments, interpret their results and refine those models. As for the molecular architecture of the fibers, experimental techniques other than atomic force microscopy are required. X-ray diffraction studies could be performed, provided that one succeeds in aligning the fibers: deposition on graphite, as reported in Figure 2-25, may be a way of achieving this goal. The kinetics of self-assembly could also be investigated at small time scales by means of UV absorption spectroscopy. In order to elucidate the transition from helical ribbons to tubules in the case of KFE8, it would be useful to monitor the evolution of pitch and radius of such helical ribbons over time. It would also be very interesting to enquire whether physical or chemical conditions exist that can promote or impede such a transition.

Several applications can also be suggested for this system. In the case of the KFE8 peptide, for example, if it were possible to chemically cross-link the helical ribbon intermediates, one would be able to produce nanometer-scale springs. A material comprised of a network of such springs may have unusual mechanical properties. By attaching a metal nanocrystal to each peptide molecule, self-assembling nanowires could also be created; since their self-assembly depends on environmental conditions, such wires could find application as chemical sensors.

In Chapter 3 an analytical description of fiber nucleation and growth was developed, based on a Smoluchowski-type mean-field approximation, which made it possible to derive a relation between the average fiber length and two rate constants assumed to be fundamental to the process.

The model is very simple as it stands: β-sheet fiber self-assembly is assumed to be a two-step process and the possible role of micelles in nucleation is neglected. Despite its simplicity, the model can be used to gain quantitative insight into the process and to inspire further experimental investigations. There are several ways of refining this model, such as allowing for a multi-step nucleation process, including micelle-mediated nucleation, and describing the details of β-sheet formation (for example through a model similar to the ones used to describe DNA unzipping).
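To illustrate how a two-rate mean-field description ties the average fiber length to its rate constants, the following C++ sketch integrates a generic nucleation-elongation scheme. It is not the model derived in Chapter 3: the rate equations, the names kNuc and kGrow, and the forward Euler integrator are all assumptions chosen for brevity.

```cpp
#include <cassert>

// Result of integrating the illustrative two-step scheme.
struct MeanFieldResult {
    double monomer;   // remaining monomer concentration
    double fibers;    // fiber number concentration
    double avgLength; // monomers per fiber, (m0 - m) / f
};

// Forward Euler integration of a generic nucleation-elongation scheme
// (an assumption for illustration, not the thesis equations):
//   dm/dt = -2 * kNuc * m^2 - kGrow * m * f   (dimerization + growth)
//   df/dt =      kNuc * m^2                   (each dimer seeds one fiber)
MeanFieldResult integrateMeanField(double m0, double kNuc, double kGrow,
                                   double dt, long steps)
{
    assert(m0 > 0.0 && kNuc > 0.0 && kGrow >= 0.0 && dt > 0.0 && steps > 0);
    double m = m0, f = 0.0;
    for (long i = 0; i < steps; ++i) {
        const double nucleation = kNuc * m * m;
        const double growth = kGrow * m * f;
        m -= (2.0 * nucleation + growth) * dt;
        f += nucleation * dt;
        if (m < 0.0) m = 0.0; // guard against overshoot of the explicit step
    }
    return {m, f, (m0 - m) / f};
}
```

Calling integrateMeanField with the same growth constant and two different nucleation constants shows the qualitative trend expected of such schemes: slower nucleation leaves fewer seeds competing for monomers and therefore yields longer fibers.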

Chapter 4 presented a computational tool to simulate self-assembly in conditions that a mean-field approximation cannot capture: the model is suitable for investigating fast gelation (induced, for example, by electrolyte addition or pH changes), where reaction times become comparable to diffusion times. This model is based on a simple description of peptide molecules diffusing in a three-dimensional medium and allows simulation of all stages of self-assembly, from uniformly distributed monomers to the formed network. The dependence of average fiber length on monomer concentration and nucleation rate was investigated through systematic variation of these parameters. The average fiber length was found to be shorter than that predicted by a mean-field approximation. A scaling argument was developed to explain this discrepancy, based on the idea that, when diffusion and reaction times are comparable, self-assembly introduces non-homogeneities in the system (depletion zones around the initial fiber seeds) that affect average fiber length.
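The comparison between reaction and diffusion times can be made concrete in the non-dimensional units of the appendix code, where lengths are measured in particle diameters and the free-monomer diffusion coefficient equals 1. The C++ sketch below evaluates the time needed to diffuse half the initial lattice spacing (the quantity the appendix code uses to set diffuseTimeStep), a characteristic reaction time, and a per-timestep binding probability. The function names, the interpretation of 1/K as a reaction time, and the form 1 - exp(-K Δt) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Time for a free monomer to diffuse half the lattice spacing `gap`
// in three dimensions with D = 1: 6 * t = (gap / 2)^2.
double diffusionTime(double gap)
{
    assert(gap > 0.0);
    return gap * gap / 24.0;
}

// Characteristic reaction time for a first-order rate constant K
// (hypothetical interpretation: mean waiting time of a pair in contact).
double reactionTime(double K)
{
    assert(K > 0.0);
    return 1.0 / K;
}

// Probability that a pair in contact bonds during one timestep dt:
// p = 1 - exp(-K * dt), written with expm1 for accuracy when K * dt is small.
double bindingProbability(double K, double dt)
{
    assert(K >= 0.0 && dt > 0.0);
    return -std::expm1(-K * dt);
}
```

With gap = 3.0, diffusionTime returns 0.375 time units. When reactionTime(K) is of the same order, the system is in the regime where depletion zones can form around early seeds.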

Several interesting avenues for future development exist in this area. The code developed here is based on purely repulsive molecules; one can easily change that assumption and investigate the effect, for example, of van der Waals attraction between molecules. When molecules attract one another (and depending on how the interaction energy compares to the thermal energy), the formed fibers can interact to form a stable network. It would be very interesting to elucidate which factors determine the structure of the formed network. Another area of development would emerge from using a more realistic molecular structure. The simulation code as it stands describes molecules as soft repulsive spheres. It would be valuable to establish the extent to which a more detailed molecular model affects the kinetics of fiber nucleation and growth, as well as the structure of the formed fibers. This would lead toward a more general area of investigation: understanding how the details of molecular structure and interaction affect the architecture of self-organized materials.


APPENDIX

THE SIMULATION CODE

The code is composed of five files: main.cpp, header.h, init.cpp, forces.cpp and rng.cpp, whose function is described below. The majority of the algorithms used were adapted from the book by Rapaport (Rapaport, D. C. The art of molecular dynamics simulation, Cambridge University Press (1997)). A script is also included (disp.vmd) that allows the visualization of results using the freely available software VMD (developed by Klaus Schulten and coworkers at the University of Illinois at Urbana-Champaign). The details of the simulation model are described in Chapter 4.

Header file (header.h)

This file contains the definition of all external variables and function prototypes.

// ********************************************************************
// * POSITIONAL BROWNIAN DYNAMICS (no inertia)                        *
// * TRUNCATED LENNARD-JONES REPULSIVE POTENTIAL                      *
// *                                                                  *
// * Monomers diffuse by Brownian dynamics and, upon contact with one *
// * another or with the end of a formed chain, they form a bond.     *
// *                                                                  *
// * Equations of motion written in non-dimensional form with units   *
// * sigma (particle diameter), kT and zeta (drag coeff):             *
// *                                                                  *
// * dr/dt = 48 * lambda * (Sum of Forces) + (Brownian Force)         *
// *                                                                  *
// * lambda = epsilon/kT                                              *
// *                                                    June 26, 2003 *
// ********************************************************************
// * To compile: g++ -O3 main.cpp rng.cpp forces.cpp init.cpp -o main *
// * To visualize simulations, type "source disp.vmd" in VMD console  *
// ********************************************************************

#include <stdio.h>

#include <math.h>

#include <time.h>

#include <stdlib.h>

#define NDIM 3 // This code works only in 3D

// ******************** USER-DEFINED PARAMETERS ********************
#define nAtom     8000          // Number of particles (must be cube)
#define bondLim   2.2           // Bond length limit (typical value is 2.24)
#define bendStiff 3000          // Chain bending stiffness
#define lambda    1.0           // Epsilon/kT
#define K         [value missing in source] // Nucleation rate constant
#define gap       3.0           // Distance between monomers in a cubic lattice
#define deltaT    [value missing in source] // A timestep of 1 corresponds to 20 ns
#define stepCoord 100000        // Frequency of coordinates output
#define analysis  1000          // Frequency of data analysis
#define outFile   "coord.pdb"
#define sumFile   "summary.out"
#define dataFile  "data.dat"
#define runs      4             // Number of runs
#define seed      30

// ******************** FUNCTION PROTOTYPES ***********************
void SetupRun (void), AllocArrays (void), InitStep (void), SingleStep (void);
void ComputeForces (void), ComputeBondForces (void);
void ComputeBrownianForces (void);
void ComputeBendingForces (void), MoveParticles (void);
void ApplyBoundaryCond (void), InitCoords (void), PrintCoord (void);
void CloseJob (void);
void UpdateChains (int, int), AnalyzeChains (void);
void init_genrand (unsigned long s);
double SignR (double, double), Sqr (double), DotProd (double *x, double *y);
double genrand_real (void);

// ******************** GLOBAL VARIABLES **************************
extern double **r, **force, **totDisp; // Positions, forces, displacements
extern double region, regionH, rCut, rrCut, timeNow, initTime, cellWid;
extern int stepCount, monomers, cells, atomsPerEdge, diffuseTimeStep;
extern int *cellList, **chains;
extern FILE *fpCoord, *fpSum, *fpData;

Main code (main.cpp)

The code is organized around this main file, which calls all other subroutines.

// ******************
// * MAIN CODE      *
// * June 16, 2002  *
// ******************

#include "header.h"

// ******************** GLOBAL VARIABLES ********************
double **r, **force, **totDisp; // Positions, forces, displacements
double region, regionH, rCut, rrCut, timeNow, initTime, cellWid;
int stepCount, monomers, cells, atomsPerEdge, diffuseTimeStep;
int *cellList, **chains;
FILE *fpCoord, *fpSum, *fpData;

int main (void)
{
  int i;

  init_genrand (seed); // Initializes the seed for RNG
  AllocArrays ();
  for (i = 0; i < runs; i ++) {
    SetupRun ();
    // Run the simulation until a fraction of the initial monomers is left.
    // This condition was chosen because the finish time was highly variable,
    // due to the last few monomers in solution waiting to bind to fibers.
    while (monomers > 0.005 * nAtom)
      SingleStep ();
  }
  CloseJob ();
  return 0;
}

void SetupRun (void)
{
  int k, n;

  stepCount = 0;
  monomers = nAtom;
  // Total displacements set to zero
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++)
      totDisp[k][n] = 0.;
  InitCoords ();
}

void SingleStep (void)
{
  InitStep ();
  ComputeForces ();
  ComputeBondForces ();
  ComputeBendingForces ();
  ComputeBrownianForces ();
  MoveParticles ();
  ApplyBoundaryCond ();
  if (stepCount % stepCoord == 0)
    PrintCoord ();
  if ((stepCount > diffuseTimeStep && stepCount % analysis == 0) ||
      stepCount == diffuseTimeStep || monomers == 0)
    AnalyzeChains ();
}

void InitStep (void)
{
  int n, k;

  stepCount ++;
  timeNow = stepCount * deltaT;
  // All forces set to zero
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++)
      force[k][n] = 0.;
}

void MoveParticles (void)
{
  int k, n;

  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++) {
      r[k][n] += force[k][n] * deltaT;
      // Update total displacements
      if (stepCount > diffuseTimeStep)
        totDisp[k][n] += force[k][n] * deltaT;
    }
}

void ApplyBoundaryCond (void)
{
  int k, n;

  for (n = 0; n < nAtom; n ++) {
    for (k = 0; k < NDIM; k ++) {
      if (r[k][n] >= region) r[k][n] = r[k][n] - region;
      else if (r[k][n] < 0.) r[k][n] = r[k][n] + region;
    }
  }
}


// Prints coordinates in PDB format
void PrintCoord (void)
{
  int n;

  fpCoord = fopen (outFile, "a");
  for (n = 0; n < nAtom; n ++) {
    fprintf (fpCoord, "ATOM  %5d%5c%14c", n, 'C', ' ');
    fprintf (fpCoord, "%8.3f%8.3f%8.3f\n", r[0][n], r[1][n], r[2][n]);
  }
  fprintf (fpCoord, "ENDMDL\n");
  fclose (fpCoord);
}

void AnalyzeChains (void)
{
  int k, l, n, len, tot, last, current, next, totFibers, chainLen[nAtom + 2];
  double avg, stdev, sqrDisp;

  for (n = 0; n < nAtom + 2; n ++) chainLen[n] = 0;
  totFibers = 0;
  avg = 0.;
  stdev = 0.;
  sqrDisp = 0.;

  for (n = 0; n < nAtom; n ++) {
    if (chains[n][0] < 0) chainLen[1] ++; // This is a monomer
    if (chains[n][0] >= 0 && chains[n][1] < 0) { // This is a chain end
      len = 2; // If two monomers meet, min length of chain is 2
      last = n;
      current = chains[n][0];
      while (chains[current][1] >= 0) {
        if (chains[current][0] == last) next = chains[current][1];
        else next = chains[current][0];
        last = current;
        current = next;
        len ++;
      }
      chainLen[len] ++;
    }
  }
  // Adjust for double counting of chains from algorithm above
  // (each chain is reached once from each of its two ends)
  for (l = 2; l <= nAtom; l ++) chainLen[l] = chainLen[l] / 2;

  // Checks that total atoms in the chains + monomers equals
  // the initial number of monomers
  tot = 0;
  for (l = 1; l <= nAtom; l ++) tot = tot + chainLen[l] * l;
  if (tot != nAtom) printf ("NO\n");

  // The array is (nAtom + 2) long
  // in case only one fiber forms with all monomers
  l = nAtom + 1;
  while (chainLen[l] == 0)
    if (chainLen[l - 1] == 0) l --;
    else chainLen[l] = -1; // Put flag at end of useful information in array

  fpSum = fopen (sumFile, "a");
  fprintf (fpSum, "# Length Number Time:%8.3f\n", timeNow - initTime);
  l = 1;
  while (chainLen[l] != -1) {
    fprintf (fpSum, "%8d %8d\n", l, chainLen[l]);
    l ++;
  }

  // Compute average length and standard deviation
  l = 1;
  while (chainLen[l] != -1) {
    totFibers += chainLen[l];
    avg += l * chainLen[l];
    stdev += l * l * chainLen[l];
    l ++;
  }
  avg = avg / totFibers;
  stdev = stdev / totFibers;
  stdev -= Sqr (avg);
  stdev = sqrt (stdev);
  fprintf (fpSum, "# Avg Len:%7.2f, Std Dev:%7.2f\n\n\n", avg, stdev);
  fclose (fpSum);

  // Compute Mean Square Displacement
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++)
      sqrDisp += Sqr (totDisp[k][n]);
  sqrDisp = sqrDisp / nAtom;

  // Write data on a file readable by Gnuplot
  // (written to fpData; the listing passed the already closed fpSum here)
  fpData = fopen (dataFile, "a");
  fprintf (fpData, "%8.3f %8.3f %8.3f %12.2f\n", timeNow - initTime, avg, stdev, sqrDisp);
  fclose (fpData);
}

void CloseJob (void)
{
  time_t now; // time_t rather than long, as required by ctime()

  now = time (NULL);
  fpCoord = fopen (outFile, "a");
  fprintf (fpCoord, "REMARK %s \n", ctime (&now));
  fclose (fpCoord);
  fpSum = fopen (sumFile, "a");
  fprintf (fpSum, "REMARK %s \n", ctime (&now));
  fclose (fpSum);
  fpData = fopen (dataFile, "a");
  fprintf (fpData, "# %s \n", ctime (&now));
  fclose (fpData);
}


double SignR (double x, double y)
{
  if (y >= 0.) return (x);
  else return (-x);
}

double Sqr (double x)
{
  return (x * x);
}

double DotProd (double *x, double *y)
{
  return (x[0] * y[0] + x[1] * y[1] + x[2] * y[2]);
}


Initialization file (init.cpp)

This file initializes all arrays and output files. The array chains is particularly important, as it stores information about the bonds that can form between monomers. The main idea is that each monomer is seen as having two unsatisfied bonds that can be used to link it to other monomers and form linear chains. The status of such bonds is stored in the array chains, as explained below.
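The bookkeeping idea described above can be sketched in a few lines of C++. The sketch is a standalone illustration, not the thesis code: it uses std::vector instead of raw arrays, and the names BondTable, makeBondTable, link and chainLength are hypothetical. It keeps the same conventions, namely two bond slots per particle, -1 for a free bond, and slot 0 always filled first.

```cpp
#include <array>
#include <cassert>
#include <vector>

// bonds[i][0] and bonds[i][1] hold the indices of the particles bonded to
// particle i, or -1 for a free bond. Slot 0 is always filled first.
using BondTable = std::vector<std::array<int, 2>>;

BondTable makeBondTable(int nParticles)
{
    assert(nParticles > 0);
    return BondTable(nParticles, std::array<int, 2>{-1, -1});
}

// Bond particles a and b, using the first free slot of each. Returns false
// if either particle has no free bond or the two are already bonded.
bool link(BondTable &bonds, int a, int b)
{
    assert(a != b);
    if (bonds[a][1] >= 0 || bonds[b][1] >= 0) return false;
    if (bonds[a][0] == b) return false;
    bonds[a][bonds[a][0] < 0 ? 0 : 1] = b;
    bonds[b][bonds[b][0] < 0 ? 0 : 1] = a;
    return true;
}

// Length of the chain containing `start`, which must be a monomer or a
// chain end (slot 1 free). Walks from bond to bond until the other end.
int chainLength(const BondTable &bonds, int start)
{
    assert(bonds[start][1] < 0);
    if (bonds[start][0] < 0) return 1; // monomer
    int len = 2, last = start, current = bonds[start][0];
    while (bonds[current][1] >= 0) {   // interior particle: keep walking
        const int next = (bonds[current][0] == last) ? bonds[current][1]
                                                     : bonds[current][0];
        last = current;
        current = next;
        ++len;
    }
    return len;
}
```

Because slot 0 is filled first, a single comparison on slot 0 identifies a monomer and a single comparison on slot 1 identifies a chain end, which is what keeps the length analysis and the bending-stiffness scan cheap.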

// ******************
// * INITIALIZATION *
// * June 26, 2003  *
// ******************

#include "header.h"

void AllocArrays (void) // Initialize arrays and coordinate file
{
  int k, n;
  time_t now; // time_t rather than long, as required by ctime()

  now = time (NULL);

  r = new double*[NDIM]; // Coordinates
  for (k = 0; k < NDIM; k ++) r[k] = new double[nAtom];
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++) r[k][n] = 0.; // Initialize array to zero

  force = new double*[NDIM]; // Displacements
  for (k = 0; k < NDIM; k ++) force[k] = new double[nAtom];
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++) force[k][n] = 0.; // Initialize array to zero

  totDisp = new double*[NDIM]; // Total displacements from start
  for (k = 0; k < NDIM; k ++) totDisp[k] = new double[nAtom];
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++) totDisp[k][n] = 0.; // Initialize array to zero

  // The matrix 'chains' contains information about fibers topology.
  // Each atom (or monomer) is modeled as having two unsatisfied bonds
  // that can be used to link it to other atoms to form linear chains.
  // Satisfied bonds are represented as the number of the atom to which
  // the current one is bound. Unsatisfied bonds are represented as -1.
  // The first bond to be satisfied is ALWAYS stored in chains[atom #][0].
  // Therefore, if an atom has only one unsatisfied bond, it is a chain end
  // and will have a value in chains[atom #][0] greater than -1.
  // If an atom has two satisfied bonds, then it belongs to the interior
  // of a chain. This information will be used when computing chain lengths
  // and also for imposing bending stiffness: the algorithm goes through each
  // atom and checks whether it has two satisfied bonds; in this case, it
  // will straighten the angle between these 3 atoms.
  chains = new int*[nAtom];
  for (n = 0; n < nAtom; n ++) chains[n] = new int[2];

  atomsPerEdge = (int) (pow (nAtom, 1. / 3.) + 0.5); // 0.5 added for rounding to integer
  region = (double) gap * atomsPerEdge; // Edge size of the simulation box
  regionH = 0.5 * region;

  // Time steps for particles to diffuse randomly by half their initial gap
  diffuseTimeStep = (int) (gap * gap / (24. * deltaT));
  initTime = diffuseTimeStep * deltaT; // Time when monomers are allowed to bind

  rCut = pow (2., 1./6.); // Minimum of L-J potential
  rrCut = Sqr (rCut);
  cells = (int) (region / rCut); // Cells in each edge of box.
  if (cells == 1) printf ("WARNING! One cell only: the code will not work.");
  cellWid = region / cells; // Cell width
  cellList = new int[nAtom + cells * cells * cells]; // Which atoms are in which cell (3D)

  // Initialize coordinate file with header info
  fpCoord = fopen (outFile, "a");
  fprintf (fpCoord, "REMARK Region Monomers Gap DeltaT ");
  fprintf (fpCoord, "BendStiff BondLim ReactRate Runs Seed\n");
  fprintf (fpCoord, "REMARK %6.1f%10d%5.1f%9.1e%11d%9.2f%11.1e%6d%6d\n",
           region, nAtom, gap, deltaT, bendStiff, bondLim, K, runs, seed);
  fprintf (fpCoord, "REMARK %s", ctime (&now));
  fprintf (fpCoord, "REMARK\n");
  fclose (fpCoord);

  // Initialize summary file with info
  fpSum = fopen (sumFile, "a");
  fprintf (fpSum, "REMARK Region Monomers Gap DeltaT ");
  fprintf (fpSum, "BendStiff BondLim ReactRate Runs Seed\n");
  fprintf (fpSum, "REMARK %6.1f%10d%5.1f%9.1e%11d%9.2f%11.1e%6d%6d\n",
           region, nAtom, gap, deltaT, bendStiff, bondLim, K, runs, seed);
  fprintf (fpSum, "REMARK %s", ctime (&now));
  fprintf (fpSum, "REMARK\n");
  fprintf (fpSum, "REMARK **************** Fiber Length Distribution\n");
  fclose (fpSum);

  // Initialize data file with info
  fpData = fopen (dataFile, "a");
  fprintf (fpData, "# Region Monomers Gap DeltaT ");
  fprintf (fpData, "BendStiff BondLim ReactRate Runs Seed\n");
  fprintf (fpData, "# %6.1f%10d%5.1f%9.1e%11d%9.2f%11.1e%6d%6d\n",
           region, nAtom, gap, deltaT, bendStiff, bondLim, K, runs, seed);
  fprintf (fpData, "# %s", ctime (&now));
  fprintf (fpData, "#\n");
  fprintf (fpData, "# Time AvgLen StdDev MSD\n");
  fclose (fpData);
}

void InitCoords (void)
{
  int nX, nY, nZ, n, k;

  // Distribute atoms in a cubic lattice
  for (nZ = 0; nZ < atomsPerEdge; nZ ++)
    for (nY = 0; nY < atomsPerEdge; nY ++)
      for (nX = 0; nX < atomsPerEdge; nX ++) {
        n = (nZ * atomsPerEdge + nY) * atomsPerEdge + nX; // Atom currently being placed
        r[0][n] = (nX + 0.5) * gap;
        r[1][n] = (nY + 0.5) * gap;
        r[2][n] = (nZ + 0.5) * gap;
      }

  // Initialize chain matrix
  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < 2; k ++)
      chains[n][k] = -1;
}


Force computation file (forces.cpp)

Distances between particles are checked using the cell subdivision method.
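As a reminder of how cell subdivision works, the C++ sketch below sorts particles into a cubic grid of cells at least one cutoff wide, so that interacting pairs only need to be searched in the same or adjacent cells. It is a simplified standalone illustration: the names CellList, buildCellList and countInCell are hypothetical, and the linked lists are kept in two vectors, whereas the listing that follows stores particle links and cell heads in a single array cellList.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CellList {
    int cellsPerEdge;
    std::vector<int> head; // head[c]: first particle in cell c, or -1
    std::vector<int> next; // next[i]: next particle in the same cell, or -1
};

// Sort particles into a cubic grid covering the box [0, boxSize)^3.
// The number of cells per edge is floor(boxSize / cutoff), so each cell
// is at least one cutoff wide.
CellList buildCellList(const std::vector<double> &x, const std::vector<double> &y,
                       const std::vector<double> &z, double boxSize, double cutoff)
{
    assert(cutoff > 0.0 && boxSize >= cutoff);
    assert(x.size() == y.size() && y.size() == z.size());
    CellList cl;
    cl.cellsPerEdge = static_cast<int>(boxSize / cutoff);
    const int n = cl.cellsPerEdge;
    const double width = boxSize / n;
    cl.head.assign(static_cast<std::size_t>(n) * n * n, -1);
    cl.next.assign(x.size(), -1);
    for (int i = 0; i < static_cast<int>(x.size()); ++i) {
        const int cx = static_cast<int>(x[i] / width);
        const int cy = static_cast<int>(y[i] / width);
        const int cz = static_cast<int>(z[i] / width);
        assert(cx >= 0 && cx < n && cy >= 0 && cy < n && cz >= 0 && cz < n);
        const int c = (cz * n + cy) * n + cx;
        cl.next[i] = cl.head[c]; // push particle i onto the front of the list
        cl.head[c] = i;
    }
    return cl;
}

// Count the particles stored in one cell by following its linked list.
int countInCell(const CellList &cl, int cx, int cy, int cz)
{
    const int n = cl.cellsPerEdge;
    assert(cx >= 0 && cx < n && cy >= 0 && cy < n && cz >= 0 && cz < n);
    int count = 0;
    for (int i = cl.head[(cz * n + cy) * n + cx]; i >= 0; i = cl.next[i]) ++count;
    return count;
}
```

Building the lists costs a single pass over the particles, and the pair search then scales with the number of particles rather than with its square.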

// **********************
// * FORCE COMPUTATIONS *
// * June 6, 2003       *
// **********************

#include "header.h"

void ComputeForces (void)
{
  double dr[NDIM], shift[NDIM], f, fcVal, rr, rri, rri3;
  int c, j1, j2, k, m1, m1X, m1Y, m1Z, m2, m2X, m2Y, m2Z, n, offset;
  int iofX[] = {0,1,1,0,-1,0,1,1,0,-1,-1,-1,0,1};
  int iofY[] = {0,0,1,1,1,0,0,1,1,1,0,-1,-1,-1};
  int iofZ[] = {0,0,0,0,0,1,1,1,1,1,1,1,1,1};

  // Assigns particles to cells
  for (n = nAtom; n < nAtom + cells * cells * cells; n ++) cellList[n] = -1;
  for (n = 0; n < nAtom; n ++) {
    c = ((int) (r[2][n] / cellWid) * cells + (int) (r[1][n] / cellWid)) * cells
        + (int) (r[0][n] / cellWid) + nAtom;
    cellList[n] = cellList[c];
    cellList[c] = n;
  }

  for (m1Z = 1; m1Z <= cells; m1Z ++) {     // m1Z is position of first cell in Z-direction
    for (m1Y = 1; m1Y <= cells; m1Y ++) {   // m1Y is position of first cell in Y-direction
      for (m1X = 1; m1X <= cells; m1X ++) {
        m1 = ((m1Z - 1) * cells + m1Y - 1) * cells + m1X + nAtom - 1; // m1 = first cell in cellList
        for (offset = 0; offset < 14; offset ++) {
          m2X = m1X + iofX[offset]; shift[0] = 0.; // m2X is position of second cell in X-dir
          if (m2X > cells) { m2X = 1; shift[0] = region; }
          else if (m2X == 0) { m2X = cells; shift[0] = -region; }
          m2Y = m1Y + iofY[offset]; shift[1] = 0.;
          if (m2Y > cells) { m2Y = 1; shift[1] = region; }
          else if (m2Y == 0) { m2Y = cells; shift[1] = -region; }
          m2Z = m1Z + iofZ[offset]; shift[2] = 0.;
          if (m2Z > cells) { m2Z = 1; shift[2] = region; }
          else if (m2Z == 0) { m2Z = cells; shift[2] = -region; }
          m2 = ((m2Z - 1) * cells + m2Y - 1) * cells + m2X + nAtom - 1;
          j1 = cellList[m1];
          while (j1 > -1) {
            j2 = cellList[m2];
            while (j2 > -1) {
              if (m1 != m2 || j2 < j1) {
                for (k = 0; k < NDIM; k ++)
                  dr[k] = r[k][j1] - r[k][j2] - shift[k];
                rr = DotProd (dr, dr);
                // If particles' distance less than repulsion range (min L-J),
                // compute force
                if (rr < rrCut) {
                  rri = 1. / rr;
                  rri3 = rri * rri * rri;
                  fcVal = 48. * lambda * rri3 * (rri3 - 0.5) * rri;
                  for (k = 0; k < NDIM; k ++) {
                    f = fcVal * dr[k];
                    force[k][j1] += f;
                    force[k][j2] -= f;
                  }
                  if (stepCount > diffuseTimeStep)
                    UpdateChains (j1, j2); // Particles might link
                }
              }
              j2 = cellList[j2];
            }
            j1 = cellList[j1];
          }
        }
      }
    }
  }
}

// This function is called when 2 particles come within interaction range

// The matrix is updated so that the bond in the first position

// for each particle is always satisfied first. This way,

// if particle j is such that chains[j][0] = -1, then it is definitely a

// monomer, without need to check chains[j][1].

void UpdateChains (int j1, int j2)
{
  double rdm, dr1[3], dr2[3], cosAngle;
  int k, j3;

  // Dimer formation probability, 1 - exp (- K * deltaT).
  // Use the function expm1() because it is accurate when argument is small
  static double pdimer = - expm1 (- 1. * K * deltaT);

  // If both atoms have unsatisfied bonds, link them with probability pdimer
  if (chains[j1][0] < 0 && chains[j2][0] < 0) {
    rdm = genrand_real ();
    if (rdm < pdimer) {
      chains[j1][0] = j2;
      chains[j2][0] = j1;
      monomers -= 2;
    }
  }

  // If one of them is a monomer and the other is a chain end,
  // link them upon condition that the monomer lies within a
  // certain angle from the fiber axis.
  else if (chains[j1][0] < 0 && chains[j2][1] < 0) {
    j3 = chains[j2][0]; // Monomer next to the fiber tip
    for (k = 0; k < NDIM; k ++) {
      dr1[k] = r[k][j2] - r[k][j3];
      if (fabs (dr1[k]) > regionH) dr1[k] = dr1[k] - SignR (region, dr1[k]);
      dr2[k] = r[k][j1] - r[k][j2];
      if (fabs (dr2[k]) > regionH) dr2[k] = dr2[k] - SignR (region, dr2[k]);
    }
    cosAngle = DotProd (dr1, dr2) / sqrt (DotProd (dr1, dr1) * DotProd (dr2, dr2));
    if (cosAngle > 0.7) {
      chains[j1][0] = j2;
      chains[j2][1] = j1;
      monomers --;
    }
  }

  // Same as above, symmetric situation.
  else if (chains[j1][1] < 0 && chains[j2][0] < 0) {
    j3 = chains[j1][0]; // Monomer next to the fiber tip
    for (k = 0; k < NDIM; k ++) {
      dr1[k] = r[k][j1] - r[k][j3];
      if (fabs (dr1[k]) > regionH) dr1[k] = dr1[k] - SignR (region, dr1[k]);
      dr2[k] = r[k][j2] - r[k][j1];
      if (fabs (dr2[k]) > regionH) dr2[k] = dr2[k] - SignR (region, dr2[k]);
    }
    cosAngle = DotProd (dr1, dr2) / sqrt (DotProd (dr1, dr1) * DotProd (dr2, dr2));
    if (cosAngle > 0.7) {
      chains[j1][1] = j2;
      chains[j2][0] = j1;
      monomers --;
    }
  }
}

// This function scans through the matrix 'chains' to look for bonds

// and computes the force required. To avoid double counting,

// only bonds with higher-numbered atoms are computed.

void ComputeBondForces (void)
{
  double dr[NDIM], f, fcVal, rr, rri, rri3, w;
  int j1, j2, k, l;

  for (j1 = 0; j1 < nAtom; j1 ++)
    for (l = 0; l < 2; l ++)
      if (chains[j1][l] > j1) {
        j2 = chains[j1][l];
        for (k = 0; k < NDIM; k ++) {
          dr[k] = r[k][j1] - r[k][j2];
          if (fabs (dr[k]) > regionH) dr[k] = dr[k] - SignR (region, dr[k]);
        }
        rr = DotProd (dr, dr);
        w = 1. - bondLim / sqrt (rr);
        if (w > 0.) printf ("Bond snapped!\n");
        rr = rr * Sqr (w);
        if (rr < rrCut) {
          rri = 1. / rr;
          rri3 = rri * rri * rri;
          fcVal = 48. * lambda * w * rri3 * (rri3 - 0.5) * rri;
          for (k = 0; k < NDIM; k ++) {
            f = fcVal * dr[k];
            force[k][j1] += f;
            force[k][j2] -= f;
          }
        }
      }
}

// Adds bending stiffness to chains of particles.

// If both entries in chains[j][0..1] are positive, it means that

// both bonds of particle j are satisfied => j is in the middle

// of two other particles. Bending stiffness must be applied to straighten

// the angle between these 3 particles.

// By scanning all particles whose row in the matrix 'chains' has positive

// entries, bending stiffness is applied where it is needed.

void ComputeBendingForces (void)
{
  double dr1[3], dr2[3], c, cD, c11, c12, c22, f, f1, f2;
  int k, n, n1, n2;

  for (n = 0; n < nAtom; n ++)
    if (chains[n][0] > -1 && chains[n][1] > -1) {
      n1 = chains[n][0];
      n2 = chains[n][1];
      for (k = 0; k < NDIM; k ++) {
        dr1[k] = r[k][n] - r[k][n1];
        if (fabs (dr1[k]) > regionH) dr1[k] = dr1[k] - SignR (region, dr1[k]);
        dr2[k] = r[k][n2] - r[k][n];
        if (fabs (dr2[k]) > regionH) dr2[k] = dr2[k] - SignR (region, dr2[k]);
      }
      c11 = DotProd (dr1, dr1);
      c12 = DotProd (dr1, dr2);
      c22 = DotProd (dr2, dr2);
      cD = sqrt (c11 * c22);
      c = c12 / cD;
      f = (double) (- bendStiff * (c - 1.0));
      for (k = 0; k < NDIM; k ++) {
        f1 = f * ((c12 / c11) * dr1[k] - dr2[k]) / cD;
        f2 = f * (dr1[k] - (c12 / c22) * dr2[k]) / cD;
        force[k][n1] += f1;
        force[k][n] -= (f1 + f2);
        force[k][n2] += f2;
      }
    }
}


void ComputeBrownianForces (void)
{
  static double maxBrownianForce = sqrt (6.0 / deltaT);
  int n, k;
  double rdm;

  for (n = 0; n < nAtom; n ++)
    for (k = 0; k < NDIM; k ++) {
      rdm = genrand_real ();
      force[k][n] += (2. * rdm - 1.) * maxBrownianForce;
    }
}
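The amplitude sqrt(6 / deltaT) in ComputeBrownianForces can be checked numerically. A uniform kick on [-F, F] has variance F²/3, so with F² = 6 / deltaT each Cartesian displacement F·deltaT has variance 2·deltaT per step, which corresponds to a unit diffusion coefficient and a mean square displacement of 6t in three dimensions. The C++ sketch below verifies this for free particles. It is a standalone illustration under stated assumptions: freeDiffusionMSD is a hypothetical name, and the standard library engine std::mt19937 stands in for the random number generator of the thesis.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Mean square displacement of free particles after `steps` inertia-free
// Brownian steps of size dt, with uniform random kicks of amplitude
// sqrt(6 / dt) on each Cartesian component.
double freeDiffusionMSD(int particles, int steps, double dt, unsigned seed)
{
    assert(particles > 0 && steps > 0 && dt > 0.0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> uni(-1.0, 1.0);
    const double maxForce = std::sqrt(6.0 / dt);
    double sum = 0.0;
    for (int p = 0; p < particles; ++p) {
        double pos[3] = {0.0, 0.0, 0.0};
        for (int s = 0; s < steps; ++s)
            for (int k = 0; k < 3; ++k)
                pos[k] += uni(gen) * maxForce * dt; // dr = F * dt, no inertia
        sum += pos[0] * pos[0] + pos[1] * pos[1] + pos[2] * pos[2];
    }
    return sum / particles;
}
```

For 100 steps of size 0.01 (total time 1) the returned value should be close to 6, within the statistical error set by the number of particles.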


Random number generator (rng.cpp)

A random number generator with a very long period was chosen, so that only one seed initialization was necessary at the beginning of a sequence of runs in the same conditions.
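The seeding strategy can be illustrated with the C++ standard library, whose std::mt19937 engine implements the same Mersenne Twister algorithm. The sketch is a substitution for illustration only: the listing below ships its own copy of the generator, and the name drawUniform is hypothetical. The engine is seeded once and passed by reference, so that each run continues the stream where the previous one stopped.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draw `count` uniform numbers in [0, 1) from an engine owned by the caller.
// Passing the engine by reference lets successive runs continue one stream
// instead of restarting it.
std::vector<double> drawUniform(std::mt19937 &gen, int count)
{
    assert(count >= 0);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    std::vector<double> out;
    out.reserve(count);
    for (int i = 0; i < count; ++i) out.push_back(uni(gen));
    return out;
}
```

Two engines constructed with the same seed reproduce the same sequence, while two consecutive calls on one engine give different, non-overlapping stretches of the stream. One difference from the listing below is that the distribution used here includes 0, whereas genrand_real returns values strictly inside (0,1).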

// *********************************************
// * Mersenne Twister Random Number Generator  *
// * Period = 10 ^ 500                         *
// * January 16, 2003                          *
// *********************************************

#include "header.h"

/*
   Coded by Takuji Nishimura and Makoto Matsumoto.
   This is a faster version by taking Shawn Cokus's optimization,
   Matthe Bellew's simplification, Isaku Wada's real version.

   Before using, initialize the state by using init_genrand(seed)

   Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
   All rights reserved.

   Any feedback is very welcome.
   http://www.math.keio.ac.jp/matumoto/emt.html
   email: matumoto@math.keio.ac.jp
*/

/* Period parameters */
#define N 624
#define M 397
#define MATRIX_A 0x9908b0dfUL /* constant vector a */
#define UMASK 0x80000000UL /* most significant w-r bits */
#define LMASK 0x7fffffffUL /* least significant r bits */
#define MIXBITS(u,v) ( ((u) & UMASK) | ((v) & LMASK) )
#define TWIST(u,v) ((MIXBITS(u,v) >> 1) ^ ((v)&1UL ? MATRIX_A : 0UL))

static unsigned long state[N]; /* the array for the state vector */
static int left = 1;
static int initf = 0;
static unsigned long *next;

/* initializes state[N] with a seed */
void init_genrand (unsigned long s)
{
  int j;

  state[0] = s & 0xffffffffUL;
  for (j = 1; j < N; j++) {
    state[j] = (1812433253UL * (state[j-1] ^ (state[j-1] >> 30)) + j);
    /* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier. */
    /* In the previous versions, MSBs of the seed affect   */
    /* only MSBs of the array state[].                     */
    /* 2002/01/09 modified by Makoto Matsumoto             */
    state[j] &= 0xffffffffUL; /* for >32 bit machines */
  }
  left = 1;
  initf = 1;
}

static void next_state (void)
{
  unsigned long *p = state;
  int j;

  /* if init_genrand() has not been called, */
  /* a default initial seed is used         */
  if (initf == 0) init_genrand (5489UL);

  left = N;
  next = state;

  for (j = N-M+1; --j; p++)
    *p = p[M] ^ TWIST(p[0], p[1]);

  for (j = M; --j; p++)
    *p = p[M-N] ^ TWIST(p[0], p[1]);

  *p = p[M-N] ^ TWIST(p[0], state[0]);
}

/* generates a random number on (0,1)-real-interval */
double genrand_real (void)
{
  unsigned long y;

  if (--left == 0) next_state ();
  y = *next++;

  /* Tempering */
  y ^= (y >> 11);
  y ^= (y << 7) & 0x9d2c5680UL;
  y ^= (y << 15) & 0xefc60000UL;
  y ^= (y >> 18);

  return ((double)y + 0.5) * (1.0/4294967296.0);
  /* divided by 2^32 */
}


Visualization file (disp.vmd)

# This script loads the simulation results from the C code
# To execute, type "source filename" at the vmd console
set filename coord.pdb

# Delete all existing graphics
mol delete all
rock off

# Draw simulation box
# Open coordinate file, discard first line, scan second line
set fileId [open $filename]
gets $fileId
set a [gets $fileId]
close $fileId
scan $a %*s%f size
draw color green
draw line "0 0 0" "0 $size 0"
draw line "0 0 0" "$size 0 0"
draw line "0 $size 0" "$size $size 0"
draw line "$size 0 0" "$size $size 0"
draw line "0 0 $size" "0 $size $size"
draw line "0 0 $size" "$size 0 $size"
draw line "0 $size $size" "$size $size $size"
draw line "$size 0 $size" "$size $size $size"
draw line "0 0 0" "0 0 $size"
draw line "$size 0 0" "$size 0 $size"
draw line "$size $size 0" "$size $size $size"
draw line "0 $size 0" "0 $size $size"

# Load trajectories and optimize view
mol load pdb $filename
mol delrep 0 top
mol representation VDW 0.4 7.0
mol color Name
# mol selection {all}
# mol material Opaque
mol addrep top
axes location off
scale by 0.8
display projection perspective
animate goto start
# animate style once
