Week 1
• Lecture 1: Introduction to cells and their contents. Proteins: polypeptide chains made from 20 amino acids. Paradigm shift in Molecular Biology from a soft (descriptive) to a hard (predictive) science.
• Lecture 2: Quantum, classical and stochastic descriptions of biomolecules. Classical molecular dynamics as the standard model of molecular biology.

Cells
Cells are the fundamental structural and functional units of organisms. They are the subject of Molecular Biology, Biochemistry and Biophysics, with significant overlap among the disciplines. This course focuses on the computational aspects of cellular biophysics.
Two kinds of cells:
• Prokaryotes (single cells, bacteria, e.g. Escherichia coli)
Size: 1 µm (micrometer), thick cell wall, no nucleus.
The first life forms. Simpler molecular structures, hence easier to study.
• Eukaryotes (everything else)
Size: 10 µm, no cell wall (animals), has a nucleus.
Organelles: subcompartments that carry out specific tasks, e.g. mitochondria produce ATP (the energy currency) from metabolism, and chloroplasts produce ATP from sunlight.

Structure of a typical cell
[Figure: structure of a typical cell, with the plasma membrane labelled]

Background: salt water
Water (70% of the cell) acts as a highly viscous medium for molecular motion. It also has a very high dielectric constant (ε = 80), which screens charges.
Ions (Na+, K+, Cl-, …): typical concentration 0.15 M, Debye length 8 Å.
Mobile ions completely screen charges beyond a few nm, so no directed motion is possible in cells beyond a few nm!

Organic molecules
[Figure: hydrocarbon chains (hydrophobic) and double bonds]

Functional groups in organic molecules
Polar groups are hydrophilic. When attached to hydrocarbons, they modify the behaviour of the hydrocarbons.

Four classes of macromolecules in cells
• Sugars (polysaccharides) Functions: provide energy, form the scaffold of DNA and RNA structure, confer stability on proteins after translation.
• Lipids (triglycerides) Functions: long-term energy storage; the lipid bilayer forms the cell membrane.
• Proteins (polypeptides) Functions: the main workhorses in cells; they perform all the mechanical and chemical operations and signal transduction, and also provide structural elements.
• Nucleic acids (DNA, RNA) Functions: carry the genetic code; the double helix splits to replicate; contain the blueprint for all proteins.

Simple sugars (monosaccharides)
E.g. ribose (C5H10O5).
Six-carbon sugars: glucose is a product of photosynthesis. Glucose and fructose have the same formula (C6H12O6) but different structures.
Disaccharides are formed when two monosaccharides are chemically bonded together.

Lipids (fatty acids)
• Saturated fatty acids
• Unsaturated fatty acids (C=C bonds)

Phospholipids (lipids with a phosphate head group)
Phosphatide: at neutral pH (7), the oxygens in the OH groups are deprotonated, leading to a negatively charged membrane.
Phosphatidylcholine (PC): the most common phospholipid, with a choline group attached: …PO4-CH2-CH2-N+-(CH3)3

Proteins (polypeptide chains folded into functional forms)
The building blocks of proteins are the 20 amino acids.
In water:
pH < 2: NH3+ - C - COOH
2 < pH < 10: NH3+ - C - COO-
pH > 10: NH2 - C - COO-
Gas phase: [Figure: amino acid structure in the gas phase]
At normal pH, amino acids are neutral overall but carry +/- charges on the amino/carboxyl groups (zwitterions).

Formation of polypeptides
Gas phase reaction: [Figure]
In water: NH3+ - C - COO- + NH3+ - C - COO- → NH3+ - C - CO - NH - C - COO- + H2O

Protein structure
α-helix: 3.6 amino acids per turn, r = 2.5 Å; the pitch (rise per turn) is 5.4 Å.
β-sheet: [Figure]

Nucleic acids are formed from ribose + phosphate + base
The base pairs in DNA are A-T and C-G (A pairs only with T and C pairs only with G). In RNA, thymine is replaced by uracil.
Adenosine triphosphate (ATP) has three phosphate groups.
In the usual nucleotides there is only one phosphate group; for adenosine this is called adenosine monophosphate (AMP). Another important variant is adenosine diphosphate (ADP).

B-DNA (B helix)
ROM (Read-Only Memory): contains 1.5 gigabytes of genetic information.
Base pairs per turn (3.4 nm): 10
[Figures: primary structure of a single strand of DNA; primary structure of a single strand of RNA; hydrogen bonds among the base pairs A-T and C-G]

Local structure of DNA
• Dynamic and flexible structure
• Bends, twists and knots
• Essential for packing 1 m long DNA in a 1 µm long nucleus

Central dogma
[Figure: central dogma]

Paradigm shift in Molecular Biology
BBC news, July 3, 2014: 99.6% of drug trials for Alzheimer's disease during the last decade have failed (i.e., only 1 out of 250 succeeded). This is not specific to Alzheimer's disease but pervades the whole pharmaceutical industry. All the low-hanging fruit has been picked, and finding novel drugs takes more than trial-and-error work. The answer is rational drug design, which combines experimental work with computational models of drug action.
Biomolecular systems are quite complex and their accurate modelling requires a great deal of computing power. This has become feasible in the last decade with the advance of High Performance Computing systems based on parallel clusters of PCs, which are more affordable. Computational work has now become cheaper and less laborious than performing experiments.
The main barrier to turning Molecular Biology into a hard science like Physics and Chemistry is convincing people that biomolecular systems can be accurately described using computational methods. Chemical accuracy has been achieved in relatively few examples so far, and much more work (both in applications and in methodological development) needs to be done to complete the paradigm shift.
[Figure: the scientific method as a cycle linking model/theory, prediction, experimental test and experimental data]

Quantum, classical and stochastic description of biomolecular systems
• Quantum mechanics (Schroedinger equation)
The most fundamental approach, but feasible only for a few atoms (~10). Approximate methods (e.g. density functional theory) allow treatment of larger systems (~1000 atoms) and dynamic simulations for several picoseconds.
• Classical mechanics (Newton's equation of motion)
Most atoms are heavy enough to justify a classical treatment (except H). The main problem is finding accurate potential functions (force fields). MD simulation of over 100,000 atoms for microseconds is now feasible.
• Stochastic mechanics (Langevin equation)
Most biological processes occur on time scales of microseconds to seconds. To describe such processes, a simpler (coarse-grained) representation of the atomic system is essential (e.g. Brownian dynamics).

Many-body Schroedinger equation for a molecular system
$$\left[H_n + H_e + U_{ne}\right]\Psi(R_i, r_\alpha) = E\,\Psi(R_i, r_\alpha) \qquad (1)$$
where:
$$H_n = -\sum_i \frac{\hbar^2}{2M_i}\nabla_i^2 + \sum_{i<j}\frac{z_i z_j e^2}{|R_i - R_j|} \quad \text{(nuclear Hamiltonian)}$$
$$H_e = -\sum_\alpha \frac{\hbar^2}{2m}\nabla_\alpha^2 + \sum_{\alpha<\beta}\frac{e^2}{|r_\alpha - r_\beta|} \quad \text{(electronic Hamiltonian)}$$
$$U_{ne} = -\sum_{i,\alpha}\frac{z_i e^2}{|R_i - r_\alpha|} \quad \text{(electron-nucleus interaction)}$$
Here $m$ and $M_i$ are the masses of the electrons and nuclei, $r_\alpha$ and $R_i$ denote the electronic and nuclear coordinates, and $\nabla_\alpha$ and $\nabla_i$ denote the respective gradients.

Separation of the electronic wave function
Nuclei are much heavier than electrons and hence move much more slowly. This allows their motion to be decoupled from that of the electrons. Introduce the product wave function:
$$\Psi(R_i, r_\alpha) = \Psi_n(R_i)\,\Psi_e(R_i, r_\alpha)$$
Substituting this in the Schroedinger equation gives
$$\left[H_n + H_e + U_{ne}\right]\Psi_n(R_i)\,\Psi_e(R_i, r_\alpha) = E\,\Psi_n(R_i)\,\Psi_e(R_i, r_\alpha)$$
For fixed nuclei, the electronic part gives
$$\left[H_e + U_{ne}\right]\Psi_e(R_i, r_\alpha) = E_e(R_i)\,\Psi_e(R_i, r_\alpha) \qquad (2)$$
Substituting the electronic part back in the Schroedinger equation:
$$\left[H_n + E_e(R_i)\right]\Psi_n(R_i)\,\Psi_e(R_i, r_\alpha) = E\,\Psi_n(R_i)\,\Psi_e(R_i, r_\alpha)$$
The Born-Oppenheimer (adiabatic) approximation consists of neglecting the cross terms arising from $H_n \Psi_e(R_i, r_\alpha)$ (which are of order $m/M$), so that the nuclear part becomes
$$\left[H_n + E_e(R_i)\right]\Psi_n(R_i) = E\,\Psi_n(R_i) \qquad (3)$$
Eqs.
(2) and (3) need to be solved simultaneously, which is a formidable problem for most systems. For two nuclei there is only one coordinate for R (the distance), so it is feasible. But for three nuclei there are already 3 internal coordinates (in general N nuclei require 3N-6 coordinates, or 3N-5 if they are collinear), which makes numerical solution very difficult.

Classical approximation for nuclear motion
Nuclei are heavy, so their motion can be described classically. That is, instead of solving the Schroedinger Eq. (3), we solve the corresponding Newton's equation of motion
$$M_i \frac{d^2 R_i}{dt^2} = -\nabla_i U(R_i), \qquad i = 1, \dots, N$$
$$U(R_i) = \sum_{i<j}\frac{z_i z_j e^2}{|R_i - R_j|} + E_e(R_i) \qquad (4)$$
At zero temperature, the potential can be minimized with respect to the nuclear coordinates to find the equilibrium conformation of molecules.
At finite temperature, Eqs. (2) and (4) form the basis of ab initio MD (which ignores quantum effects in nuclear motion and electronic excitations at finite T).

Methods of solution for the electronic equation
The electronic part of the Schroedinger equation, Eq. (2), has the form
$$\left[-\sum_\alpha \frac{\hbar^2}{2m}\nabla_\alpha^2 + \sum_{\alpha<\beta}\frac{e^2}{|r_\alpha - r_\beta|} - \sum_{i,\alpha}\frac{z_i e^2}{|R_i - r_\alpha|}\right]\Psi_e(R_i, r_\alpha) = E_e(R_i)\,\Psi_e(R_i, r_\alpha) \qquad (5)$$
Two basic methods of solution:
1. Hartree-Fock (HF) based methods: HF is a mean field theory. One finds the average, self-consistent potential in which the electrons move. Electron correlations are taken into account using various methods.
2. Density functional theory (DFT): solves for the density of electrons. It scales better than HF (which is limited to ~10 atoms) and can treat thousands of atoms. Car-Parrinello MD (DFT+MD) has become popular in recent years.

Classical mechanics
Molecular dynamics (MD) is the most popular method for simulation studies of biomolecules. It is based on Newton's equation of motion. For N interacting atoms, one needs to solve N coupled differential equations:
$$m_i \frac{d^2 r_i}{dt^2} = F_i = \sum_{j \neq i}^{N} F_{ij}, \qquad i = 1, \dots, N$$
Force fields are determined from experiments and ab initio methods.
Analytically this is an intractable problem for N > 2, but it can be solved easily on a computer using numerical methods.
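The numerical solution of the N coupled equations of motion can be illustrated with a minimal sketch. It uses the velocity Verlet scheme, which is a common choice in MD codes but is an assumption here (the notes do not name an integrator). The harmonic pair force, the reduced units, the initial positions and all function names are hypothetical, chosen only so that energy conservation can be checked.

```python
def pair_forces(x, k=1.0, r0=1.0):
    """Total force F_i = sum over j != i of F_ij for particles on a line.

    Hypothetical toy force field: U_ij = 0.5 * k * (|x_i - x_j| - r0)**2.
    """
    n = len(x)
    f = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d = x[j] - x[i]
            r = abs(d)
            # Force on particle j; particle i feels the opposite force.
            fij = -k * (r - r0) * (d / r)
            f[j] += fij
            f[i] -= fij
    return f


def potential_energy(x, k=1.0, r0=1.0):
    n = len(x)
    return sum(0.5 * k * (abs(x[j] - x[i]) - r0) ** 2
               for i in range(n) for j in range(i + 1, n))


def kinetic_energy(v, m):
    return sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))


def velocity_verlet(x, v, m, dt, n_steps):
    """Integrate m_i d^2x_i/dt^2 = F_i with the velocity Verlet scheme."""
    x, v = list(x), list(v)
    f = pair_forces(x)
    for _ in range(n_steps):
        # Half-step velocity update with the old forces.
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
        # Full-step position update.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # New forces, then the second half-step velocity update.
        f = pair_forces(x)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
    return x, v


# Three particles in reduced units (all values are illustrative).
m = [1.0, 1.0, 2.0]
x0 = [0.0, 0.8, 1.4]
v0 = [0.0, 0.0, 0.0]

e0 = kinetic_energy(v0, m) + potential_energy(x0)
x1, v1 = velocity_verlet(x0, v0, m, dt=0.01, n_steps=10_000)
e1 = kinetic_energy(v1, m) + potential_energy(x1)
print(f"relative energy drift: {abs(e1 - e0) / e0:.2e}")
```

The total energy should stay nearly constant over the run; a growing drift is the usual sign that the time step is too large for the stiffest interaction in the force field.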
Current computers can handle $N \approx 10^6$ particles, which is large enough to describe most biomolecules. Integration time, however, is still a bottleneck ($10^6$ steps at 1 fs = 1 ns).

Stochastic mechanics (Brownian dynamics)
To deal with the time bottleneck in MD, one has to simplify the simulation system (coarse graining). This can be achieved by describing parts of the system as continua with dielectric constants. Examples:
• transport of ions in electrolyte solutions (water → continuum)
• protein folding (water → continuum)
• ion channels (lipid, protein, and water → continuum)
To include the effect of the atoms in the continuum, modify Newton's equation of motion by adding frictional and random forces.
Langevin equation:
$$m_i \frac{d^2 r_i}{dt^2} = F_i - \gamma_i m_i v_i + R_i$$

Frictional forces:
Friction dissipates the kinetic energy of a particle, slowing it down. Consider the simplest case of a free particle in a viscous medium:
$$m \frac{d^2 r}{dt^2} = -m\gamma v \quad\Rightarrow\quad \frac{dv}{dt} = -\gamma v$$
Solution with the initial values $v(0) = v_0$, $r(0) = 0$:
$$v(t) = v_0 e^{-\gamma t}, \qquad r(t) = \frac{v_0}{\gamma}\left(1 - e^{-\gamma t}\right)$$
In liquids frictional forces are quite large, e.g. in water $1/\gamma \approx 20$ fs. From $\frac{1}{2} m v^2 = \frac{3}{2} kT$, $v \approx 500$ m/s and $v/\gamma \approx 0.1$ Å.

Random forces:
Frictional forces would dissipate the kinetic energy of a particle rapidly. To maintain the average energy of the particle at 1.5 kT, we need to kick it with a random force at regular intervals. This mimics the collisions of the particle with the surrounding particles, which are treated as a continuum and hence not explicitly represented.
Properties of random forces:
1. Must have zero mean: $\langle R_i \rangle = 0$, $i = x, y, z$
2. Uncorrelated with prior velocities: $\langle v_i(0) R_j(t) \rangle = 0$
3. Uncorrelated with prior forces (white noise, Markovian assumption): $\langle R_i(0) R_j(t) \rangle = 2 m \gamma kT\, \delta(t)\, \delta_{ij}$

Fluctuation-dissipation theorem:
Because the frictional and random forces have the same origin, they are related:
$$m\gamma = \frac{1}{2kT} \int_{-\infty}^{\infty} \langle R(0) R(t) \rangle \, dt$$
[Figure: decay of the correlation function $\langle R(0) R(t) \rangle$ with $t$]
In liquids the decay time is very short, hence one can approximate the correlation function with a delta function:
$$\langle R(0) R(t) \rangle = 2 m \gamma kT\, \delta(t)$$
Random forces have a Gaussian probability distribution:
$$w(R_i) = \frac{1}{\sqrt{2\pi \langle R_i^2 \rangle}} \exp\left(-\frac{R_i^2}{2 \langle R_i^2 \rangle}\right), \qquad \langle R_i^2 \rangle = \frac{2 m \gamma kT}{\Delta t}$$
This follows from the fact that the velocities have a Gaussian distribution:
$$g(v_i) = N \sqrt{\frac{m}{2\pi kT}} \exp\left(-\frac{m v_i^2}{2kT}\right)$$
In order to preserve this distribution, the random forces must be distributed likewise.

The standard model of biomolecules: MD
• MD is necessary because:
1. QM is too slow and can handle only very small systems.
2. Stochastic dynamics eliminates water from the system. But water is not just a passive spectator in biomolecular processes; it plays an active and essential role in the dynamics. For example, accurate calculation of free energies is impossible without an explicit description of water (except in a few lucky cases where errors cancel out).
• MD is also sufficient because atoms are heavy enough to justify a classical treatment (except H). The only requirement is that accurate potential functions are used, which is not quite satisfied at present: the polarization interaction is not explicitly included in most force fields.
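Returning to the Langevin equation, the balance between friction and random kicks can be checked numerically. The sketch below is illustrative only. It first reproduces the order-of-magnitude estimates for the thermal speed and the stopping distance v/γ, assuming a Na+ ion of about 23 amu at 300 K (the notes do not say which particle was used). It then integrates the force-free Langevin equation in reduced units with a simple Euler step (an assumed scheme, not one prescribed in the notes), drawing the random force from a Gaussian with variance 2mγkT/Δt, and compares the average of v² with the equipartition value kT/m. All function and variable names are hypothetical.

```python
import math
import random

K_B = 1.380649e-23     # Boltzmann constant, J/K
AMU = 1.66053907e-27   # atomic mass unit, kg


def thermal_speed(mass_amu, temperature=300.0):
    """RMS speed in m/s from (1/2) m v^2 = (3/2) kT."""
    return math.sqrt(3.0 * K_B * temperature / (mass_amu * AMU))


# Assumed example: Na+ ion (about 23 amu) in water with 1/gamma = 20 fs.
v_rms = thermal_speed(23.0)
stopping_distance = v_rms * 20e-15   # v / gamma, in metres


def langevin_mean_v2(n_particles=200, n_steps=2000, n_equil=500,
                     m=1.0, gamma=1.0, kT=1.0, dt=0.01, seed=1):
    """Euler integration of m dv/dt = -m*gamma*v + R for free particles in 1D.

    Reduced units. At every step R is drawn from a Gaussian with
    <R^2> = 2*m*gamma*kT/dt, the discretised delta-correlated random force.
    Returns v^2 averaged over particles and over the steps after n_equil.
    """
    rng = random.Random(seed)
    sigma_r = math.sqrt(2.0 * m * gamma * kT / dt)
    v = [0.0] * n_particles   # start at rest; only the random force adds energy
    total, count = 0.0, 0
    for step in range(n_steps):
        for i in range(n_particles):
            r = rng.gauss(0.0, sigma_r)
            v[i] += dt * (-gamma * v[i] + r / m)
            if step >= n_equil:
                total += v[i] * v[i]
                count += 1
    return total / count


mean_v2 = langevin_mean_v2()
print(f"v_rms = {v_rms:.0f} m/s, v/gamma = {stopping_distance * 1e10:.2f} Angstrom")
print(f"<v^2> = {mean_v2:.3f} (equipartition predicts kT/m = 1 in these units)")
```

Setting the random force to zero in this sketch would leave the particles at rest forever, and starting them with a finite velocity would let friction drain it within a few multiples of 1/γ. The random force with the variance fixed by the fluctuation-dissipation relation is what holds the average kinetic energy at its thermal value.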