• Levinthal's paradox presents an estimate for the time it would take for a protein to fold assuming a minimum of two possible conformations for each pair of amino acids.
For a 101-residue protein, there are $2^{100} \approx 10^{30}$ possible conformations. If it takes 1 ps to sample each conformation, it would take $10^{18}$ s to sample the whole phase space and find the absolute minimum of the free energy. This is longer than the age of the universe, about $5\times10^{17}$ s!
How do proteins manage to fold in seconds?
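The arithmetic behind the estimate can be checked in a few lines. This is only a sketch using the numbers quoted in the text (two conformations per link, 1 ps per conformation, $5\times10^{17}$ s for the age of the universe); the variable names are illustrative.

```python
# Levinthal estimate with the numbers quoted in the text.
n_residues = 101
conformations = 2 ** (n_residues - 1)    # two conformations per link between residues
time_per_conformation_s = 1e-12          # 1 ps per conformation
search_time_s = conformations * time_per_conformation_s
age_of_universe_s = 5e17                 # value quoted in the text

print(f"conformations: {conformations:.1e}")
print(f"search time:   {search_time_s:.1e} s")
print(f"ratio to age of universe: {search_time_s / age_of_universe_s:.1f}")
```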
(Nelson, chap. 8)
Molecular machines in cells use chemical energy to function, and most of the time they do chemical work, e.g. synthesize proteins.
To deal with this situation we need to consider more than one species and allow exchange of particles as well as energy.
Let $\{N_a,\ a = 1, 2, \dots\}$ denote the number of particles of each species in a system.
The entropy of the system is a function of this set: $S(N_1, N_2, \dots)$
We define the chemical potential for each species a as
$$\mu_a = -T \left.\frac{\partial S}{\partial N_a}\right|_{E,\,N_b,\,b \neq a}$$
(availability of particles)
Recall the definition of the temperature (modified from the fixed N case)
$$\frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_{N_a}$$
(availability of energy)
When two systems, A and B, exchange only energy, thermal equilibrium is realized when $T_A = T_B$
If they also exchange particles, their chemical potentials for each species a must be equal as well
$$\mu_{A,a} = \mu_{B,a}$$
(chemical equilibrium)
When two systems are not in chemical equilibrium, entropic forces arising from the difference in chemical potentials drive the system to equilibrium.
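As a simple example consider an ideal gas. The entropy is given by
$$S = k \ln\left[\frac{(2\pi m E_K)^{3N/2}}{(3N/2 - 1)!}\,\frac{V^N}{N!\,h^{3N}}\right]$$
(Sackur-Tetrode formula)
Rewrite S using Stirling’s formula
$$S = k\left[\frac{3N}{2}\ln(2\pi m E_K) + N\ln V - \frac{3N}{2}\ln\frac{3N}{2} + \frac{3N}{2} - 3N\ln h - N\ln N + N\right]$$
$$\left.\frac{\partial S}{\partial N}\right|_{E_K} = k\left[\frac{3}{2}\ln(2\pi m E_K) + \ln V - \frac{3}{2}\ln\frac{3N}{2} - \frac{3}{2} + \frac{3}{2} - 3\ln h - \ln N - 1 + 1\right]$$
$$= \frac{3k}{2}\ln\left[\frac{2\pi m}{h^2}\,\frac{2E_K}{3N}\left(\frac{V}{N}\right)^{2/3}\right] = \frac{3k}{2}\ln\left[\frac{2\pi m}{h^2}\,kT\,c^{-2/3}\right]$$
where the last step uses $E_K = \tfrac{3}{2}NkT$ and the concentration $c = N/V$.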
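To obtain the chemical potential we need to keep the total energy $E = E_K + Ne$ fixed, where e is the internal energy. This is achieved by
$$\left.\frac{\partial S}{\partial N}\right|_{E} = \left.\frac{\partial S}{\partial N}\right|_{E_K} - e\left.\frac{\partial S}{\partial E_K}\right|_{N}$$
The last term is just e/T. Substituting, we obtain for the chemical potential
$$\mu = -T\left[\frac{3}{2}k\ln\left(\frac{2\pi m}{h^2}\,kT\,c^{-2/3}\right) - \frac{e}{T}\right] = e - \frac{3}{2}kT\ln\left(\frac{2\pi m}{h^2}\,kT\,c^{-2/3}\right)$$
Because we are interested in the number (or concentration) dependence of the chemical potential, we separate that term
$$\mu = kT\ln\frac{c}{c_0} + e - \frac{3}{2}kT\ln\left(\frac{2\pi m}{h^2}\,kT\,c_0^{-2/3}\right) = kT\ln\frac{c}{c_0} + \mu^0$$
where $c_0$ is the reference concentration and $\mu^0$ is the standard chemical potential.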
The reference concentration is introduced for convenience; its choice has no effect on the chemical potential. Convention:
For gases at STP: $c_0$ = 1 mole/22 L = 0.045 M
For aqueous solutions: $c_0$ = 1 mole/L = 1 M
Notation: $[X] = c_X/c_0$ (e.g., [X] = 1 refers to a 1 molar solution)
Rewrite the chemical potential as
$$\frac{c}{c_0} = e^{(\mu - \mu^0)/kT}$$
(activity)
For ideal gases, the activity is simply given by the relative concentration.
For solutions, the definition of $\mu^0$ is more complicated. But if we treat it as a phenomenological parameter, we can use the same formulas for dilute solutions.
We can generalize the chemical potential by including the potential energy of the particles in the internal energy: $\mu^0 \to \mu^0 + U$
In the case of charged particles, $U = qV(\mathbf{r})$, and
$$\mu = kT\ln\frac{c}{c_0} + \mu^0 + qV(\mathbf{r})$$
is called the electrochemical potential.
Electrochemical equilibrium between two systems is achieved when $\mu_A = \mu_B$:
$$kT\ln\frac{c_A}{c_0} + \mu^0 + qV_A = kT\ln\frac{c_B}{c_0} + \mu^0 + qV_B$$
$$q(V_A - V_B) = -kT\ln\frac{c_A}{c_B}$$
(Nernst relation)
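As a numerical illustration of the Nernst relation, $q(V_A - V_B) = -kT\ln(c_A/c_B)$, the sketch below evaluates the potential difference for a tenfold concentration ratio of a monovalent ion. The function name, the temperature of 298 K and the concentration values are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C

def nernst_potential(c_a, c_b, charge_number=1, temperature=298.0):
    """Return V_A - V_B in volts from q (V_A - V_B) = -kT ln(c_A / c_B)."""
    q = charge_number * E_CHARGE
    return -K_B * temperature * math.log(c_a / c_b) / q

# Hypothetical example: side A is ten times more concentrated than side B.
delta_v = nernst_potential(c_a=10.0, c_b=1.0)
print(f"V_A - V_B = {delta_v * 1e3:.1f} mV")
```

For a positive ion the more concentrated side sits at the lower potential, by roughly 59 mV per decade of concentration at room temperature.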
Chemical reactions are controlled by the chemical potential of reactants: high concentration or high internal energy means higher availability.
Generalization of the Boltzmann distribution for particle exchange:
Consider a small system “a” which can exchange both energy and particles with a much larger system B. Fluctuations in $E_B$ and $N_B$ are negligible but those in $E_a$ and $N_a$ could be large. As we have shown before, the probability of “a” being in a state with $E_a$ and $N_a$ is proportional to $\exp[S_B(E_B, N_B)/k]$
$$S_B(E_B, N_B) = S_B(E_{tot} - E_a,\ N_{tot} - N_a)$$
$$= S_B(E_{tot}, N_{tot}) - E_a\frac{\partial S_B}{\partial E_B} - N_a\frac{\partial S_B}{\partial N_B}$$
$$= S_B(E_{tot}, N_{tot}) - \frac{E_a}{T_B} + \frac{\mu_B N_a}{T_B}$$
At equilibrium, $T_a = T_B = T$ and $\mu_a = \mu_B = \mu$
Thus the probability of “a” being in a particular state j with $E_j$ and $N_j$ is proportional to
$$P(E_j, N_j) \propto e^{-(E_j - \mu N_j)/kT}$$
Using the grand partition function, $Z = \sum_j e^{-(E_j - \mu N_j)/kT}$
The normalized probability becomes
$$P(E_j, N_j) = \frac{1}{Z}\,e^{-(E_j - \mu N_j)/kT}$$
which is called the grand canonical distribution.
The number of particles “a” contains depends on $\mu$; the larger $\mu$ is, the more particles “a” will have.
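A minimal sketch of the grand canonical distribution, $P_j = e^{-(E_j - \mu N_j)/kT}/Z$, for a small list of states. The two-state system used here (an empty site and a singly occupied site with energy 1 kT) is a hypothetical example, not taken from the text; energies and $\mu$ are in units of kT.

```python
import math

def grand_canonical(states, mu, kT=1.0):
    """states: list of (E_j, N_j). Returns (probabilities, mean particle number)."""
    weights = [math.exp(-(energy - mu * number) / kT) for energy, number in states]
    z = sum(weights)                      # grand partition function
    probs = [w / z for w in weights]
    mean_n = sum(p * number for p, (_, number) in zip(probs, states))
    return probs, mean_n

# Hypothetical system: empty site (E=0, N=0) and occupied site (E=1, N=1).
states = [(0.0, 0), (1.0, 1)]
_, n_low = grand_canonical(states, mu=-1.0)
_, n_high = grand_canonical(states, mu=3.0)
print(f"<N> at mu=-1: {n_low:.3f}, <N> at mu=+3: {n_high:.3f}")
```

Raising $\mu$ increases the mean occupation, which is the statement that a larger chemical potential means more particles in “a”.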
Chemical reactions:
As a simple case, consider a molecule which has two states with internal energies $e_1$ and $e_2$ (e.g. an isomer). Assume $\Delta e = e_2 - e_1 > 0$
The chemical potentials are
$$\mu_i = kT\ln\frac{c_i}{c_0} + \mu_i^0, \quad \mu_i^0 = e_i, \quad i = 1, 2$$
Chemical equilibrium: $\mu_1 = \mu_2$
$$kT\ln\frac{c_1}{c_0} + e_1 = kT\ln\frac{c_2}{c_0} + e_2 \quad\Rightarrow\quad \frac{c_2}{c_1} = e^{-\Delta e/kT}$$
Non-equilibrium cases:
1. $\mu_1 > \mu_2$: Reaction 1 → 2 proceeds (entropic forces do chemical work)
2. $\mu_1 < \mu_2$: Reaction 2 → 1 proceeds (chemical energy converted to heat)
To summarize, when two systems are at equilibrium:
• Temperatures and chemical potentials are equal, $T_1 = T_2$, $\mu_1 = \mu_2$
• Total entropy S is maximum
• Total free energy (F or G) is minimum
Example: Burning of hydrogen
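$$2\,\mathrm{H_2} + \mathrm{O_2} \rightarrow 2\,\mathrm{H_2O}$$
Free energy change:
$$\Delta G = 2\mu_{\mathrm{H_2O}} - 2\mu_{\mathrm{H_2}} - \mu_{\mathrm{O_2}}$$
At equilibrium:
$$\Delta S = -\frac{\Delta G}{T} = 0$$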
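Assuming an ideal gas behaviour for all three participants, we can write for the free energy change
$$\Delta G = 2\left[kT\ln\frac{c_{\mathrm{H_2O}}}{c_0} + \mu^0_{\mathrm{H_2O}}\right] - 2\left[kT\ln\frac{c_{\mathrm{H_2}}}{c_0} + \mu^0_{\mathrm{H_2}}\right] - \left[kT\ln\frac{c_{\mathrm{O_2}}}{c_0} + \mu^0_{\mathrm{O_2}}\right]$$
$$= kT\ln\left[\left(\frac{c_{\mathrm{H_2O}}}{c_0}\right)^{2}\left(\frac{c_{\mathrm{H_2}}}{c_0}\right)^{-2}\left(\frac{c_{\mathrm{O_2}}}{c_0}\right)^{-1}\right] + 2\mu^0_{\mathrm{H_2O}} - 2\mu^0_{\mathrm{H_2}} - \mu^0_{\mathrm{O_2}}$$
Setting $\Delta G = 0$:
$$\frac{c_{\mathrm{H_2O}}^2\,c_0}{c_{\mathrm{H_2}}^2\,c_{\mathrm{O_2}}} = e^{-(2\mu^0_{\mathrm{H_2O}} - 2\mu^0_{\mathrm{H_2}} - \mu^0_{\mathrm{O_2}})/kT} = K_{eq}$$
Here $K_{eq}$ is called the equilibrium constant of the reaction, and the ratio
$$\frac{c_{\mathrm{H_2O}}^2}{c_{\mathrm{H_2}}^2\,c_{\mathrm{O_2}}} = \frac{K_{eq}}{c_0}$$
is called the reaction quotient.
Often a log scale is used for $K_{eq}$:
$$K_{eq} = 10^{-pK}, \quad pK = -\log_{10}K_{eq}$$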
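For ideal gases (or dilute solutions), we can use the explicit expression derived for the standard chemical potential
$$\mu^0 = e - \frac{3}{2}kT\ln\left(\frac{2\pi m}{h^2}\,kT\,c_0^{-2/3}\right)$$
$$K_{eq} = e^{-(2\mu^0_{\mathrm{H_2O}} - 2\mu^0_{\mathrm{H_2}} - \mu^0_{\mathrm{O_2}})/kT}$$
$$= e^{-(2e_{\mathrm{H_2O}} - 2e_{\mathrm{H_2}} - e_{\mathrm{O_2}})/kT}\left(\frac{2\pi m_{\mathrm{H_2O}}}{h^2}kT\,c_0^{-2/3}\right)^{3}\left(\frac{2\pi m_{\mathrm{H_2}}}{h^2}kT\,c_0^{-2/3}\right)^{-3}\left(\frac{2\pi m_{\mathrm{O_2}}}{h^2}kT\,c_0^{-2/3}\right)^{-3/2}$$
$$= e^{-(2e_{\mathrm{H_2O}} - 2e_{\mathrm{H_2}} - e_{\mathrm{O_2}})/kT}\,c_0\left[\frac{h^2}{2\pi kT}\,\frac{m_{\mathrm{H_2O}}^2}{m_{\mathrm{H_2}}^2\,m_{\mathrm{O_2}}}\right]^{3/2}$$
At low T, the reaction favours H$_2$O. As T increases, the H$_2$ and O$_2$ concentrations also increase.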
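From chemical data handbooks, $K_{eq}$ at room temperature is given by:
$$K_{eq} = e^{-(2\mu^0_{\mathrm{H_2O}} - 2\mu^0_{\mathrm{H_2}} - \mu^0_{\mathrm{O_2}})/kT} = e^{183}$$
Clearly almost all the hydrogen will burn. Using the reaction quotient
$$\frac{c_{\mathrm{H_2O}}^2\,c_0}{c_{\mathrm{H_2}}^2\,c_{\mathrm{O_2}}} = \frac{[\mathrm{H_2O}]^2}{[\mathrm{H_2}]^2[\mathrm{O_2}]} = K_{eq}$$
estimate the number of O$_2$ molecules left from 1 mole of O$_2$ gas
$$\frac{[2]^2}{[2x]^2[x]} \approx e^{183} \quad\Rightarrow\quad [x] = e^{-183/3} = 3\times10^{-27}$$
$$n(\mathrm{O_2}) = [x]\,N_A = 3\times10^{-27}\times6\times10^{23} \approx 0.002$$
None at all!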
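The estimate can be reproduced numerically. The sketch below assumes, as in the text, that 2 moles of H$_2$ and 1 mole of O$_2$ react almost completely, leaving relative concentrations [H$_2$O] = 2, [H$_2$] = 2x and [O$_2$] = x, so that $1/x^3 = K_{eq} = e^{183}$.

```python
import math

LN_K_EQ = 183.0        # ln K_eq at room temperature, value quoted in the text
N_AVOGADRO = 6.022e23  # molecules per mole

# [H2O]^2 / ([H2]^2 [O2]) = 2^2 / ((2x)^2 * x) = 1 / x^3 = K_eq
x = math.exp(-LN_K_EQ / 3.0)
molecules_left = x * N_AVOGADRO

print(f"[O2] left: {x:.1e}")
print(f"O2 molecules left: {molecules_left:.4f}")
```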
Generalization to arbitrary reactions:
Assume m species involved in a reaction; k reactants and m−k products
$$n_1X_1 + \dots + n_kX_k \rightleftharpoons n_{k+1}X_{k+1} + \dots + n_mX_m$$
where the $n_i$ are called the stoichiometric coefficients of the reaction
The free energy difference is
$$\Delta G = -n_1\mu_1 - \dots - n_k\mu_k + n_{k+1}\mu_{k+1} + \dots + n_m\mu_m$$
The reaction runs forward when $\Delta G < 0$ and backward if $\Delta G > 0$.
$\Delta G = 0$ corresponds to equilibrium. Again we separate the concentration dependent part from the rest
$$\Delta G = -n_1\left[kT\ln\frac{c_1}{c_0} + \mu_1^0\right] - \dots + n_m\left[kT\ln\frac{c_m}{c_0} + \mu_m^0\right]$$
$$= kT\ln\left([X_1]^{-n_1}\cdots[X_m]^{n_m}\right) - n_1\mu_1^0 - \dots + n_m\mu_m^0$$
Setting $\Delta G = 0$, we obtain
$$\frac{[X_{k+1}]^{n_{k+1}}\cdots[X_m]^{n_m}}{[X_1]^{n_1}\cdots[X_k]^{n_k}} = e^{-\Delta G^0/kT} = K_{eq}$$
(Mass action rule)
where $\Delta G^0 = -n_1\mu_1^0 - \dots + n_m\mu_m^0$ is the standard free energy change
The values of $\Delta G^0$ for formation of molecular species can be found in chemistry handbooks (usually at standard conditions; 298 K and 1 atm)
When more than one reaction occurs at similar rates, there is a separate mass action rule for each reaction, which implies relations between the various $\mu_a$.
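The mass action rule can be turned into a small calculator. The sketch below evaluates $\Delta G/kT = \ln Q + \Delta G^0/kT$, where Q is the product of relative concentrations raised to their stoichiometric coefficients (products over reactants). The function names and the two-state example values are illustrative assumptions.

```python
import math

def reaction_quotient(reactants, products):
    """Each argument is a list of (relative concentration [X], coefficient n)."""
    q = 1.0
    for conc, n in products:
        q *= conc ** n
    for conc, n in reactants:
        q /= conc ** n
    return q

def delta_g_over_kt(reactants, products, delta_g0_over_kt):
    """Delta G / kT; negative means the reaction runs forward."""
    return math.log(reaction_quotient(reactants, products)) + delta_g0_over_kt

# Hypothetical isomerization X1 <-> X2 with Delta G^0 = 2 kT.
at_equilibrium = delta_g_over_kt([(1.0, 1)], [(math.exp(-2.0), 1)], 2.0)
product_poor = delta_g_over_kt([(1.0, 1)], [(0.01, 1)], 2.0)
print(f"Delta G/kT at equilibrium: {at_equilibrium:.2e}")
print(f"Delta G/kT with too little product: {product_poor:.2f}")
```

When the product concentration is below its equilibrium value, $\Delta G$ is negative and the reaction runs forward.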
Reaction Kinetics:
Consider a typical reaction with rate constants $k_+$ and $k_-$
$$X_2 + Y_2 \underset{k_-}{\overset{k_+}{\rightleftharpoons}} 2XY$$
Intuitively we expect the forward and backward rates to be proportional to the concentrations of the molecules involved (first order in each reacting molecule)
$$r_+ = k_+\,c_{X_2}\,c_{Y_2}, \quad r_- = k_-\,(c_{XY})^2$$
At equilibrium, $r_+ = r_-$
$$\frac{(c_{XY})^2}{c_{X_2}\,c_{Y_2}} = \frac{k_+}{k_-} = e^{-\Delta G^0/kT} = K_{eq}$$
(mass action rule)
The above is true for single step reactions. For more complex reaction mechanisms, concentration dependence of rates may be different.
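For the single-step case, the link between rates and the mass action rule can be checked by integrating the rate equations until they stop changing. This is a sketch with a simple forward-Euler integrator; the rate constants, initial concentrations and step size are arbitrary illustrative values.

```python
def relax_to_equilibrium(k_plus, k_minus, c_x2, c_y2, c_xy, dt=1e-3, n_steps=20000):
    """Integrate X2 + Y2 <-> 2 XY with r+ = k+ c_X2 c_Y2 and r- = k- c_XY^2."""
    for _ in range(n_steps):
        net_rate = k_plus * c_x2 * c_y2 - k_minus * c_xy ** 2
        c_x2 -= net_rate * dt
        c_y2 -= net_rate * dt
        c_xy += 2.0 * net_rate * dt   # two XY formed per forward reaction
    return c_x2, c_y2, c_xy

# Hypothetical values: k+ = 2, k- = 1, start with no product.
c_x2, c_y2, c_xy = relax_to_equilibrium(2.0, 1.0, c_x2=1.0, c_y2=1.0, c_xy=0.0)
ratio = c_xy ** 2 / (c_x2 * c_y2)
print(f"(c_XY)^2 / (c_X2 c_Y2) = {ratio:.6f}, k+/k- = {2.0 / 1.0:.6f}")
```

The final concentration ratio approaches $k_+/k_-$ regardless of the starting concentrations.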
In general, a reaction is of n’th order in species X if the rate depends on its concentration as $(c_X)^n$.
An alternative 3-step mechanism for the previous reaction, which is second order in X$_2$ and zeroth order in Y$_2$:
$$X_2 + X_2 \rightleftharpoons 2X + X_2$$
(slow, rate limiting step)
$$X + Y_2 \rightleftharpoons XY_2$$
(fast)
$$XY_2 + X \rightleftharpoons 2XY$$
(fast)
Each step must be in equilibrium
$$\frac{(c_X)^2\,c_{X_2}}{(c_{X_2})^2\,c_0} = K_{eq,1}, \quad \frac{c_{XY_2}\,c_0}{c_X\,c_{Y_2}} = K_{eq,2}, \quad \frac{(c_{XY})^2}{c_X\,c_{XY_2}} = K_{eq,3}$$
Product:
$$\frac{(c_{XY})^2}{c_{X_2}\,c_{Y_2}} = K_{eq,1}\,K_{eq,2}\,K_{eq,3} = K_{eq}$$
(mass action rule is independent of mechanism)
Dissociation:
Salts, acids, bases and polar molecules readily dissolve in water because the loss in potential energy is more than compensated by the interaction of the charged parts with water molecules (charge-dipole and H-bond) and gain in entropy.
Example: Dissociation of water
$$\mathrm{H_2O} \rightleftharpoons \mathrm{H^+} + \mathrm{OH^-}$$
(proton + hydroxyl)
From conductance measurements in pure water: $c_{\mathrm{H^+}} = c_{\mathrm{OH^-}} = 10^{-7}$ M
Mass action rule:
$$K_w = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = (10^{-7})^2 = 10^{-14}$$
($\Delta G^0 = 32\,kT$)
Adding an acid (e.g. HCl) increases [H$^+$] and hence lowers [OH$^-$]
Adding a base (e.g. NaOH) increases [OH$^-$] and hence lowers [H$^+$]
In chemistry, the amount of protons in a solution is described by its pH
$$pH = -\log_{10}[\mathrm{H^+}]$$
Pure water has pH = 7, which is called neutral pH
Adding acids to water lowers pH. A solution with pH < 7 is called acidic
Adding bases to water raises pH. A solution with pH > 7 is called basic
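Common acidic and basic groups in organic molecules:
Carboxyl group: $-\mathrm{COOH}$ (protonated) $\rightleftharpoons$ $-\mathrm{COO^-} + \mathrm{H^+}$ (deprotonated)
Amine group: $-\mathrm{NH_3^+}$ (protonated) $\rightleftharpoons$ $-\mathrm{NH_2} + \mathrm{H^+}$ (deprotonated)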
Of the 20 amino acids, aspartate and glutamate have acidic side chains while arginine and lysine (~histidine) have basic side chains.
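Probability of protonation of a side chain
$$P_a = \frac{[\mathrm{COOH}]}{[\mathrm{COOH}] + [\mathrm{COO^-}]} = \frac{1}{1 + [\mathrm{COO^-}]/[\mathrm{COOH}]}$$
Equilibrium:
$$\frac{[\mathrm{COO^-}][\mathrm{H^+}]}{[\mathrm{COOH}]} = K_{eq,a} \quad\Rightarrow\quad \frac{[\mathrm{COO^-}]}{[\mathrm{COOH}]} = \frac{K_{eq,a}}{[\mathrm{H^+}]}$$
$$P_a = \frac{1}{1 + K_{eq,a}/[\mathrm{H^+}]}$$
With $K_{eq,a} = 10^{-pK}$ and $[\mathrm{H^+}] = 10^{-pH}$:
$$P_a = \frac{1}{1 + 10^{pH - pK}} = \frac{1}{1 + 10^{x_a}}, \quad x_a = pH - pK$$
When pH = pK, $P_a = 1/2$
Examples (at pH = 7):
Aspartic acid: $K_{eq} = 10^{-3.7}$, $P_a = 1/(1 + 10^{3.3}) \approx 0$ (has charge $-e$)
Arginine: $K_{eq} = 10^{-12.5}$, $P_a = 1/(1 + 10^{-5.5}) \approx 1$ (has charge $+e$)
The average charge on a side chain is determined by $P_a$
Acidic side chain: $q = -e\,(1 - P_a)$
Basic side chain: $q = e\,P_a$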
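The protonation probability $P_a = 1/(1 + 10^{pH - pK})$ and the resulting average charge are easy to tabulate. The sketch below uses the pK values quoted in the examples (3.7 for aspartic acid, 12.5 for arginine); the function names are illustrative, and charges are returned in units of e.

```python
def protonation_probability(ph, pk):
    """P_a = 1 / (1 + 10^(pH - pK))."""
    return 1.0 / (1.0 + 10.0 ** (ph - pk))

def mean_charge(ph, pk, acidic):
    """Average side-chain charge in units of e."""
    p_a = protonation_probability(ph, pk)
    return -(1.0 - p_a) if acidic else p_a

asp_charge = mean_charge(7.0, 3.7, acidic=True)    # aspartic acid at pH 7
arg_charge = mean_charge(7.0, 12.5, acidic=False)  # arginine at pH 7
print(f"Asp: {asp_charge:+.4f} e, Arg: {arg_charge:+.6f} e")
```

Scanning `ph` over a range such as 1 to 12 and summing `mean_charge` over all side chains gives a model titration curve.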
Note that the pH of the solution controls the protonation state of a protein.
In titration experiments, the pH is varied over a wide range, e.g. 1-12.
When pH < pK of all the side chains, all are protonated (max + charge)
As pH increases and goes through the pK of an acidic side chain, $q = 0 \to -e$
Beyond pH = 7, basic side chains start deprotonating, $q = +e \to 0$
For pH > pK of all the side chains, all are deprotonated (max – charge)
Titration curve of ribonuclease. As pH is raised protein loses protons.
Electrophoresis:
As the titration curve indicates, apart from a critical pH value, proteins carry a net charge and hence will move under an applied electric field.
This process is called electrophoresis.
A common application is separation of proteins, which is achieved by setting the pH of the solution at the critical value of the protein we want to separate and applying an electric field.
Varying pH and measuring the electrophoretic mobility, one can determine the critical pH value precisely.
A famous example is Pauling’s finding of the cause of sickle-cell anemia.
Patients carry a defective hemoglobin that differs from the normal one by a single point mutation, Glu → Val. Glu has charge $-e$ (pK = 4.25), Val is neutral.
At pH = 6.9, the two proteins migrate in opposite directions!
Self-assembly of amphiphiles:
How do the cell membranes form?
Amphiphiles: molecules that have both hydrophilic (polar) and hydrophobic (CH$_2$ chain) parts (detergents, lipids)
Examples: sodium dodecyl sulfate (SDS), phosphatidylcholine
When detergents are added to oil-water mixtures, they form a boundary between the two such that the polar head groups face water and hydrophobic tails face oil
Figure: oil-water interface stabilized by detergent; oil-water emulsion
Micelle formation:
When detergent is added to pure water, the molecules form small spherical objects (micelles), just as in the emulsion case. The only difference is that the tails avoid water by facing each other.
Figure: osmotic pressure, P = ckT (McBain, 1944); curves labeled N = 5 and N = 30
Let the number of detergent molecules in a micelle be N, and denote the concentration of micelles by $c_N$ and monomers by $c_1$
The reaction is: (N monomers) $\rightleftharpoons$ (micelle)
Mass action rule (MAR) at equilibrium:
$$\frac{c_N}{(c_1)^N} = K_{eq}$$
$$c_{tot} = c_1 + Nc_N = c_1\left[1 + NK_{eq}(c_1)^{N-1}\right]$$
The experimentally measured quantity is the critical micelle concentration, $c_*$, which is defined as $c_{tot} = c_*$ when
$$c_{1*} = Nc_{N*} = \frac{c_*}{2}$$
Substitute in MAR:
$$\frac{c_*/(2N)}{(c_*/2)^N} = K_{eq} \quad\Rightarrow\quad NK_{eq} = \left(\frac{2}{c_*}\right)^{N-1}$$
Substitute in $c_{tot}$:
$$c_{tot} = c_1\left[1 + \left(\frac{2c_1}{c_*}\right)^{N-1}\right]$$
For $2c_1 \ll c_*$, $c_{tot} \approx c_1$, while for $2c_1 \gg c_*$, $c_{tot} \approx Nc_N$
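The relation $c_{tot} = c_1[1 + (2c_1/c_*)^{N-1}]$ gives $c_{tot}$ as a function of $c_1$, whereas in practice $c_{tot}$ is what is controlled. A minimal sketch that inverts it by bisection is shown below; the function names, the tolerance and the choice N = 30 are illustrative assumptions, and concentrations are in units of $c_*$.

```python
def total_concentration(c1, c_star, n):
    """c_tot = c1 * [1 + (2 c1 / c_star)^(n - 1)]."""
    return c1 * (1.0 + (2.0 * c1 / c_star) ** (n - 1))

def monomer_concentration(c_tot, c_star, n, tol=1e-12):
    """Solve total_concentration(c1) = c_tot for c1 by bisection on [0, c_tot]."""
    low, high = 0.0, c_tot   # c_tot increases monotonically with c1, and c1 <= c_tot
    while high - low > tol * c_tot:
        mid = 0.5 * (low + high)
        if total_concentration(mid, c_star, n) < c_tot:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

c1_at_cmc = monomer_concentration(c_tot=1.0, c_star=1.0, n=30)
c1_above_cmc = monomer_concentration(c_tot=10.0, c_star=1.0, n=30)
print(f"c1 at c_tot = c*: {c1_at_cmc:.4f}; c1 at c_tot = 10 c*: {c1_above_cmc:.4f}")
```

For large N the monomer concentration barely rises once $c_{tot}$ exceeds $c_*$: nearly all added detergent goes into micelles.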
Coarse-grained models of lipid aggregation:
United atom models of lipids
Micelle formation (Klein et al. 2004)
Bilayer formation (Marrink et al. 2001)
(Nelson, chap. 9)
Biological molecules usually have two distinct conformations: random coil form of the polypeptide chain and a folded compact form.
Examples:
• Helix-coil transition in a simple amino acid chain
• Full folding of a protein from random coil to a compact 3D structure
• An extreme example is the condensation of DNA, where the full length of about 1 m is squeezed into a micron size nucleus.
An important parameter in characterizing the elasticity of polymers is the persistence length, which sets the length scale over which the chain bends. The persistence length is typically about 1 nm for flexible polymers. In contrast, it is about 50 nm for double-stranded DNA (a segment length of about 100 nm in freely jointed chain models), which is relatively rigid.
Elasticity model of polymers:
If we model polymers as a continuous elastic object, there are three possible deformations: a) bending, b) stretching, c) twisting (torsion)
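Because the covalent bonds in polymers are quite rigid and the torsional motion is restricted, only the bending deformation is important
$$dE = \frac{1}{2}kTA\,\beta^2\,ds, \quad \beta = \left|\frac{d\hat{t}}{ds}\right| = \frac{d\theta}{ds} = \frac{1}{R}$$
Identifying A with the persistence length $L_p$:
$$E = \frac{1}{2}kTL_p\int_0^{L_{tot}}\beta^2\,ds$$
For a quarter circle of radius R:
$$E = \frac{1}{2}kTL_p\,\frac{1}{R^2}\,\frac{2\pi R}{4} = \frac{\pi L_p}{4R}\,kT$$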
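For an arc of constant curvature 1/R, the bending energy integral reduces to $E/kT = \tfrac{1}{2}L_p\,(1/R^2)\,(R\,\theta)$, with $\theta$ the total bend angle. The sketch below evaluates this; the function name and the numerical values (lengths in nm) are illustrative assumptions only.

```python
import math

def bending_energy_in_kt(persistence_length, radius, bend_angle):
    """E/kT = (1/2) L_p (1/R)^2 * (R * bend_angle) for an arc of constant curvature."""
    arc_length = radius * bend_angle
    return 0.5 * persistence_length * arc_length / radius ** 2

# Hypothetical example: quarter circle whose radius equals the persistence length.
quarter_circle = bending_energy_in_kt(persistence_length=50.0, radius=50.0,
                                      bend_angle=math.pi / 2)
print(f"E = {quarter_circle:.3f} kT")
```

Bending through a right angle over a radius equal to $L_p$ costs $\pi/4$ kT, which is why thermal fluctuations bend the chain appreciably only on scales of order $L_p$ or longer.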
Stretching of DNA (experimental data from DNA of lambda phage)
A force of a few pN is sufficient to fully stretch DNA from a random coil. At
65 pN DNA takes another form, where the backbone is straightened.
Two-state model of DNA stretching (freely jointed chain model in 1D)
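Assume DNA consists of N segments of length $L_s$, which can be oriented in the +z or −z direction. Apply a force f in the z direction to stretch it.
The corresponding potential is U = −fz, where z is the DNA length given by
$$z = L_s\sum_{i=1}^{N}\sigma_i, \quad \sigma_i = \pm1$$
Probability of a particular configuration $[\sigma_1, \dots, \sigma_N]$ is given by the Boltzmann factor
$$P(\sigma_1, \dots, \sigma_N) = \frac{1}{Z}\,e^{fL_s\sum_{i=1}^{N}\sigma_i/kT}$$
where Z is the partition function
$$Z = \sum_{\sigma_1=\pm1}\cdots\sum_{\sigma_N=\pm1}e^{fL_s\sum_{i=1}^{N}\sigma_i/kT}$$
Average DNA extension under a load f
$$\langle z\rangle = \sum_{\sigma_1=\pm1}\cdots\sum_{\sigma_N=\pm1}P(\sigma_1, \dots, \sigma_N)\,z = \frac{1}{Z}\sum_{\sigma_1=\pm1}\cdots\sum_{\sigma_N=\pm1}e^{fL_s\sum_i\sigma_i/kT}\,L_s\sum_{i=1}^{N}\sigma_i$$
$$= kT\frac{d}{df}\ln\left[\sum_{\sigma_1=\pm1}\cdots\sum_{\sigma_N=\pm1}e^{fL_s\sum_i\sigma_i/kT}\right]$$
$$= kT\frac{d}{df}\ln\left[\left(\sum_{\sigma_1=\pm1}e^{fL_s\sigma_1/kT}\right)\cdots\left(\sum_{\sigma_N=\pm1}e^{fL_s\sigma_N/kT}\right)\right]$$
$$= kT\frac{d}{df}\ln\left(e^{fL_s/kT} + e^{-fL_s/kT}\right)^N$$
Taking the derivative with respect to f gives
$$\langle z\rangle = NL_s\,\frac{e^{fL_s/kT} - e^{-fL_s/kT}}{e^{fL_s/kT} + e^{-fL_s/kT}}$$
Introducing $L_{tot} = NL_s$
$$\frac{\langle z\rangle}{L_{tot}} = \tanh\frac{fL_s}{kT}$$
The limiting cases:
1) High force ($f \gg kT/L_s$): $\langle z\rangle \to L_{tot}$
2) Low force ($f \ll kT/L_s$): $\langle z\rangle \approx L_{tot}\,fL_s/kT = f/\kappa$, with spring constant $\kappa = kT/(L_{tot}L_s)$
At low force, a polymer behaves like a spring, obeying Hooke’s law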
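The closed-form result $\langle z\rangle/L_{tot} = \tanh(fL_s/kT)$ can be checked against a direct sum over all $2^N$ configurations of the two-state chain. This is a brute-force sketch for small N; the function name and the test values are illustrative.

```python
import itertools
import math

def mean_extension_bruteforce(n_segments, f_ls_over_kt):
    """<z>/L_tot from an explicit sum over all sigma_i = +/-1 configurations."""
    weighted_sum = 0.0
    partition_sum = 0.0
    for sigma in itertools.product((1, -1), repeat=n_segments):
        total = sum(sigma)                           # z / L_s for this configuration
        weight = math.exp(f_ls_over_kt * total)      # Boltzmann factor e^{f z / kT}
        partition_sum += weight
        weighted_sum += weight * total
    return weighted_sum / (partition_sum * n_segments)

brute = mean_extension_bruteforce(n_segments=8, f_ls_over_kt=0.7)
print(f"brute force: {brute:.6f}, tanh: {math.tanh(0.7):.6f}")
```

Because the segments are independent, the relative extension does not depend on N, which the enumeration confirms.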
Figure: comparison of theory with experimental data from lambda phage (fit values shown: $L_s$ = 35 nm and $L_s$ = 104 nm)
Long-dash curve: 1D cooperative chain model (includes elastic energy)
Short-dash curve: 3D freely jointed chain model
Helix-coil transition (experimental data from an artificial polypeptide)
• At a critical temperature polypeptide makes a transition from coil to helix
• Transition is sharpened with the number of residues (cooperative effects)
Energetics of helix-coil transition
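The free energy change in the transition is given by
$$\Delta G = \Delta E_{bond} - T\Delta S_{tot}$$
$$\Delta E_{bond} = E_{helix} - E_{coil}, \quad \Delta S_{tot} = \Delta S_{bond} + \Delta S_{conf}$$
From experiments:
$$\Delta E_{bond} > 0, \quad \Delta S_{conf} \approx -k\ln(3\times3) < 0, \quad \Delta S_{bond} > 0, \quad \Delta S_{tot} > 0$$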
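Introduce a parameter
$$a = -\frac{\Delta E_{bond} - T\Delta S_{tot}}{2kT}$$
which measures the favourability of extending the helix.
When a vanishes, extending the helix by one unit makes no change in the free energy
$$a = 0 \quad\Rightarrow\quad T_m = \frac{\Delta E_{bond}}{\Delta S_{tot}}$$
$$a = \frac{\Delta E_{bond}}{2k}\left(\frac{1}{T_m} - \frac{1}{T}\right) = \frac{\Delta E_{bond}}{2k}\,\frac{T - T_m}{T\,T_m}$$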
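Using a cooperative 1D freely jointed chain model, one obtains
$$\theta = C_1 + \frac{C_2\sinh a}{\sqrt{\sinh^2 a + e^{-4\gamma}}}$$
($\gamma$: cooperativity parameter; $C_1$, $C_2$: fit constants)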
The curves in the previous figure are obtained by fitting this expression to the data points.
Protein folding:
Primary sequence determines the folded structure
The free energy gain from folding is about 20 kT
Loss of entropy is compensated by H-bond formation and especially hydrophobic interactions (Kauzmann, 1950s).
Changes in the environment can lead to denaturation (unfolding) of proteins. For example, proteins unfold
• at both high (T > 50 °C) and low (T < 20 °C) temperatures
• in nonpolar solvents
• in the presence of small amounts of surfactants
MD simulations of protein unfolding at high temperatures (Daggett et al.)
Unfolding of engrailed homeodomain (folding time at 298 K: ~1 ms)
Unfolding of chymotrypsin inhibitor (Daggett et al.)
Potential energy landscapes for protein folding: a) Flat (i.e. Levinthal) b) Ant trail c) Smooth funnel d) Rugged funnel
Rugged protein folding pathways from lattice calculations (Dill et al.)