Carnegie Mellon University Center for Advanced Process

Chakrabarti Group
Overview of Research and Educational Initiatives
CAPD Meeting
March 11, 2013
Approaches to Molecular Design and Control
Static Optimization
• Molecular Structure/Function Optimization: Enzyme Design (picoseconds, nanometers)
Dynamic Control
• Control of Biochemical Reaction Networks (milliseconds, micrometers)
• Coherent Control of Chemical Reaction Dynamics (femtoseconds, angstroms)
How enzymes work
How to design them?
What makes them optimal for catalysis, and how to improve them?
Problem: hyperastronomical sequence space
Catalytic Mechanisms of Enzymes
DD-peptidase
• General acid/base: Tyr159
• Electrostatic stabilizer: Lys65
• Catalytic nucleophile: Ser62
β-gal
• Catalytic nucleophile: Glu-299
• General acid/base: Glu-200
The physics in the model: sequence optimization requires accurate
energy functions and solvation models
• S-GB continuum solvation
  Ghosh, A., Rapp, C.S. & Friesner, R.A. (1998) J. Phys. Chem. B 102, 10983-10990.
• 10° resolution rotamer library (297 proteins)
  Xiang, Z. & Honig, B. (2001) J. Mol. Biol. 311, 421-430.
• OPLS-AA molecular mechanics force field + Glidescore semiempirical binding affinity scoring function
  Jacobson, M.P., Kaminski, G.A., Rapp, C.S. & Friesner, R.A. (2002) J. Phys. Chem. B 106, 11673-11680.
  Friesner, R.A., Banks, J.L., Murphy, R.B., Halgren, T.A. et al. (2004) J. Med. Chem. 47, 1739-1749.
A model fitness measure for enzyme sequence optimization
\[ J(seq) = \Delta G_{bind}(seq) + \sum_{i=1}^{N-1} \sum_{j>i}^{N} \lambda_{ij} \left[ r_{ij,hbond} - r_{ij}(seq) - s_{ij}^2 \right] \]
• \Delta G_{bind}(seq): enzyme-substrate binding affinity
• s_{ij}: slack variable
• Catalytic constraint: interatomic distances r_{ij} < hbond distance
• Minimize J over sequence space
• Represent the dynamical constraint by requiring that the total energy of the complex be minimized for any sequence
• Omits selection pressure for product release
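A fitness measure of this kind can be evaluated for one candidate sequence as in the minimal Python sketch below. It is not the slide's slack-variable form: it swaps in a simple hinge penalty that is zero when a catalytic distance is within the hydrogen-bond cutoff and grows linearly once it is exceeded. The function name `fitness`, the penalty weights, and the 3.0 Å cutoff are illustrative assumptions, and the binding free energy and distances are assumed to come from an external energy model.

```python
from typing import Dict, Tuple

Pair = Tuple[int, int]  # indices (i, j) of a constrained atom pair


def fitness(dg_bind: float,
            distances: Dict[Pair, float],
            weights: Dict[Pair, float],
            r_hbond: float = 3.0) -> float:
    """Penalized fitness J for one sequence (lower is better).

    dg_bind   : binding free energy in kcal/mol (assumed precomputed)
    distances : interatomic distances r_ij in angstroms (assumed precomputed)
    weights   : hypothetical penalty weights, one per constrained pair
    r_hbond   : assumed hydrogen-bond distance cutoff in angstroms
    """
    penalty = 0.0
    for pair, r_ij in distances.items():
        # Hinge penalty: nonzero only when r_ij exceeds the cutoff.
        violation = max(0.0, r_ij - r_hbond)
        penalty += weights[pair] * violation
    return dg_bind + penalty


# One constrained pair, first satisfied and then violated by 0.5 angstrom.
j_ok = fitness(-10.0, {(0, 1): 2.8}, {(0, 1): 5.0})
j_bad = fitness(-10.0, {(0, 1): 3.5}, {(0, 1): 5.0})
```

With these made-up inputs the satisfied case returns the binding energy unchanged, while the violated case is penalized, so a minimizer over sequences would prefer the first.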
Computational sequence optimization correctly predicts most residues in
ligand-binding sites and enzyme active sites
Streptavidin
Native: –10.04 kcal/mol
CO2⁻ is the covalent attachment site for biomolecules
9 / 10 residues predicted correctly in top 0.5 kcal/mol of sequences
Chakrabarti, R., Klibanov, A.M. and Friesner, R.A. Computational prediction of native protein ligand-binding and enzyme active site sequences. PNAS, 2005.
Computational active site optimization is structurally accurate
to near-crystallographic resolution
[Figure: RMSD to native (Å), axis from 0 to 1.2, for active-site residues Phe120, Asn161, Trp233, Arg285, Thr299, Ser326, Ser62, Lys65 and Tyr159]
From Enzyme Design to Bionetwork Control
• Nature has also devised remarkable catalysts through molecular design / evolution
• Maximizing kcat/Km of a given enzyme does not always maximize the fitness of a network of enzymes and substrates
• More generally, modulate enzyme activities in real time to achieve maximal fitness or selectivity of chemical products
The Polymerase Chain Reaction: An example of
bionetwork control
Nobel Prize in Chemistry 1993; one of the most cited papers in Science (12757 citations in Science alone)
Produce millions of DNA molecules starting from one through temperature cycling
Used every day in every Biochemistry and Molecular Biology lab (Diagnosis, Genome Sequencing, Gene Expression, etc.)
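The "millions from one" claim follows from geometric growth: under ideal doubling, each cycle multiplies the copy number by two. The short Python sketch below works through that arithmetic. The `efficiency` parameter (fraction of templates copied per cycle) and both function names are assumptions added for illustration; real reactions fall short of perfect doubling.

```python
import math


def copies_after(cycles: int, start: int = 1, efficiency: float = 1.0) -> float:
    """Copy number after a number of cycles, each multiplying by (1 + efficiency)."""
    return start * (1.0 + efficiency) ** cycles


def cycles_to_reach(target: float, start: int = 1, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles whose copy number reaches the target."""
    return math.ceil(math.log(target / start) / math.log(1.0 + efficiency))


n_ideal = cycles_to_reach(1e6)                  # ideal doubling
n_lossy = cycles_to_reach(1e6, efficiency=0.8)  # hypothetical 80 % efficiency
```

Under ideal doubling, 20 cycles take a single molecule past one million copies (2^20 = 1,048,576); any efficiency below 1 pushes the required cycle count up.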
How to automate choice of temperature cycling protocols?
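DNA Melting
\[ D \rightleftharpoons S_1 + S_2 \qquad (k_1^m, k_2^m) \]
Primer Annealing
\[ S_1 + P_1 \rightleftharpoons S_1P_1 \qquad (k_1^1, k_2^1) \]
\[ S_2 + P_2 \rightleftharpoons S_2P_2 \qquad (k_1^2, k_2^2) \]
Single Strand – Primer Duplex Extension
\[ SP + E \rightleftharpoons E.SP \qquad (k_e, k_{-e}) \]
\[ E.SP + N \rightleftharpoons [E.SP.N] \xrightarrow{k_{cat}} E.D_1 \qquad (k_n, k_{-n}) \]
\[ E.D_1 + N \rightleftharpoons [E.D_1.N] \xrightarrow{k_{cat}} E.D_2 \qquad (k_n, k_{-n}) \]
\[ \vdots \]
\[ E.D_N \xrightarrow{k'_{cat}} E + DNA \]
DNA Melting Again
\[ S_1 + S_2 \rightleftharpoons DNA \qquad (k_1^{1t}, k_2^{1t}) \]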
3/18/2016
School of Chemical Engineering, Purdue
University
R. Chakrabarti and C.E. Schutt, Chemical PCR: Compositions for enhancing polynucleotide amplification reactions. US Patent 7,772,383, issued 8-10-10.
R. Chakrabarti and C.E. Schutt, Compositions and methods for improving polynucleotide amplification reactions using amides, sulfones and sulfoxides: II. US Patent 7,276,357, issued 10-2-07.
R. Chakrabarti and C.E. Schutt, US Patent 6,949,368, issued 9-27-05.
Optimal Control of DNA Amplification

\[ \min_{T(t)} \left( C_{DNA}(t_f) - C_{DNA}^{max} \right)^2 \]
\[ \text{s.t.} \quad \frac{dx}{dt} = f(x, u) \]
\[ x = \left[ C_{S_1}, C_{S_2}, \ldots, C_{E.D_1}, \ldots, C_{DNA} \right]^{Tr} \]
For an N nucleotide template: 2N + 13 state equations
Typically N ~ 10^3
R. Chakrabarti et al. Optimal Control of Evolutionary Dynamics, Phys. Rev. Lett., 2008
K. Marimuthu and R. Chakrabarti, Optimally Controlled DNA amplification, in preparation
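The structure of this control problem (state equations driven by a temperature profile, plus a terminal cost) can be illustrated with a deliberately tiny Python sketch. It is a toy, not the 2N + 13 state model: it keeps only the melting/reannealing reaction D ⇌ S1 + S2, integrates with explicit Euler, and evaluates the squared terminal deviation for one candidate profile. Every rate parameter below is a made-up placeholder, and the names `rates`, `f`, `simulate` and `objective` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

R_GAS = 8.314  # gas constant, J/(mol K)


def rates(temp_c: float) -> Tuple[float, float]:
    """Placeholder rate constants; the numbers are not fitted to any data."""
    temp_k = temp_c + 273.15
    k_melt = 1.0e13 * math.exp(-1.0e5 / (R_GAS * temp_k))  # 1/s, Arrhenius form
    k_anneal = 1.0e6                                       # 1/(M s), held constant
    return k_melt, k_anneal


def f(x: Sequence[float], temp_c: float) -> List[float]:
    """Right-hand side dx/dt = f(x, u) for x = [C_D, C_S1, C_S2], u = temperature."""
    c_d, c_s1, c_s2 = x
    k_melt, k_anneal = rates(temp_c)
    net = k_melt * c_d - k_anneal * c_s1 * c_s2  # net melting rate
    return [-net, net, net]


def simulate(x0: Sequence[float], profile: Callable[[float], float],
             t_final: float, dt: float = 1e-3) -> List[float]:
    """Explicit Euler integration of the state equations under profile(t)."""
    x = list(x0)
    for n in range(int(round(t_final / dt))):
        dx = f(x, profile(n * dt))
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


def objective(x_final: Sequence[float], c_target: float) -> float:
    """Squared deviation of the final duplex concentration from a target."""
    return (x_final[0] - c_target) ** 2


# Candidate profile: 10 s at 95 C followed by 10 s at 55 C.
x_end = simulate([1e-8, 0.0, 0.0], lambda t: 95.0 if t < 10.0 else 55.0, 20.0)
cost = objective(x_end, 1e-8)
```

An optimizer would wrap `simulate` and `objective` in a search over temperature profiles; the sketch only shows how one profile is scored.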
Optimal control of PCR
[Figure: temperature (°C, 45 to 95) versus time (s, 0 to 140) for Cycle 1 and Cycle 2, with annealing times of 10 s, 12 s and 15 s]
Geometric growth: after 15 cycles, DNA concentrations are
red: 4×10^-10 M
blue: 8×10^-9 M
green: 2×10^-8 M
Chakrabarti Group Educational Initiatives: DecydEd
• DecydEd is an online course consortium with a two-pronged objective:
1. Offer online education in systems engineering to a broader
community of students, researchers, and practitioners around
the world
2. Deliver fully automated real-time decision-making tools which
build upon the course material taught, to users for the first time
• DecydEd envisions broadening awareness of the latest academic research in systems engineering and educating users on how to apply process systems engineering (PSE) tools to industrial applications that have traditionally not been addressed with such methods.
DecydEd (cont’d)
• DecydEd offers fully automated tools, based on the content covered in the
courses, aimed at solving real-world engineering problems in a host of areas
including
1. Systems Biology
2. Molecular Design
3. Financial Engineering
• Target applications include protein engineering, catalyst design, biochemical
reaction engineering
• Funded by PMC Group, Inc
PMC Group Global Operations
Fully integrated group of companies involved in development, manufacture,
marketing and sales of specialty, performance and fine chemicals. Among the
world’s top chemical manufacturers in several of these areas.
DecydEd Courses
The DecydEd User Portal
• The DecydEd User Portal gives registered students a rich experience: simulations, networking with other users (through leading social media platforms), collaboration on homework, lecture viewing, and automatically graded homework exercises
DecydEd Discussion Forum
• DecydEd’s expert panel currently consists of professors from top universities including CMU, the University of Chicago, the University of Toronto and the London School of Economics
• Students can ask questions and get advice from these experts on a wide range of topics while enrolled in the courses.
DecydEd’s Decision Making Tools in Chemical and Biochemical Engineering
• Molecular Design Example: Protein Engineering involves a high-dimensional search over the space of possible functional groups in an active site.
• DecydEd’s automated protein optimization software will enable any molecular biologist to apply computational protein engineering techniques
• Systems Biology Example: DNA sequencing involves the control of a biochemical reaction network through the choice of temperature profiles in the polymerase chain reaction (PCR).
• DecydEd’s automated PCR control software will enable molecular biologists to apply systems biology in lab experiments through the website
• Most practicing molecular biologists are not trained in the above methods and often do not have access to the latest tools
DecydEd Industry Application Example: Computational Enzyme Design
Design Computationally: Zymzyne™ Computational Design Process
• Input information: target chemical, desired raw material, existing synthetic pathways, existing biocatalysts
• 10^30 candidates screened
• System output: ~1000 potential candidates with expected catalytic activity
Refine Experimentally: Zymzyne™ Experimental Optimization
• 500 candidates screened
• Output: Optimized Biocatalyst
Computational Enzyme Design:
Enabling renewable chemical manufacturing
Starches
Plant oils
Biomass
DOE Top Value Added
Renewable Chemicals
1,4-diacids (succinic, fumaric and malic acids)
2,5-furandicarboxylic acid
3-hydroxypropionic acid
aspartic acid
glucaric acid
glutamic acid
itaconic acid
levulinic acid
3-hydroxybutyrolactone
glycerol
sorbitol
xylitol/arabinitol
Specialty chemicals
Polymers
Enzyme Design Models
• Protein structure: Loop, Sidechain
  New algorithms for side chain optimization
• Substrate binding: Glidescore, Pose sampling
• Reactive chemistry: Calculating mutant enzyme reaction rates
• QM sequence refinement
  for QM/MM refinement of enzyme design
  speeding up mutant TS searches
• Classical Sequence Optimization (fixed ligand)
• Active site reshaping
  scores desired loop against other low-energy excitations
• Classical Sequence Optimization (free ligand)
  Hierarchical pose screening
  Locates global seq/struct optima for a given active site/ligand combination
  Estimates “designability” of active site (fixed backbone)
DecydEd Molecular Design Decision-Making
Example of screening focused
library of sequence variants
3 permissible mutations identified by
modeling at a target position
3 positions subject to mutagenesis
4^3 mutation combinations = 64 sequence variants (3 mutations or wild type at each of 3 positions)
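The count of 64 comes from four choices (wild type or one of three permitted mutations) at each of three positions. The Python sketch below enumerates such a focused library. The position labels and residue letters are made-up placeholders, not values from the slides.

```python
from itertools import product

# Hypothetical wild-type residues at three mutagenized positions.
wild_type = {"pos1": "S", "pos2": "K", "pos3": "Y"}

# Hypothetical permitted mutations (three per position) suggested by modeling.
permitted = {
    "pos1": ["A", "T", "C"],
    "pos2": ["R", "H", "Q"],
    "pos3": ["F", "W", "H"],
}

positions = sorted(wild_type)

# Each position may keep its wild-type residue or take one permitted mutation.
options = [[wild_type[p]] + permitted[p] for p in positions]

# The Cartesian product over positions yields every sequence variant.
variants = ["".join(combo) for combo in product(*options)]
```

Here `variants` holds all 4 × 4 × 4 combinations, including the unmutated wild type `"SKY"`, which is the set a synthetic gene library would need to cover.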
Synthetic gene assembly and variant
library construction via DNA synthesis
[Figure: bar charts of values (0 to 0.7) by amino acid (D, A, F, R, S, N, E, Y, H, I, L, K, N, G, T, W, V, C)]
Biological selection of variant library
New enzymes: improved catalytic turnover, altered substrate selectivity
DecydEd Systems Biology Models
Reaction
\[ S_1 + S_2 \rightleftharpoons D \qquad (k_f, k_r) \]
Equilibrium Information
\[ k_f / k_r = K = \exp\left( -\frac{\Delta G}{RT} \right) \]
ΔG: from the Nearest Neighbor Model
Relaxation Time
\[ \tau = \frac{1}{k_r + k_f \left( C_{S_1}^{eq} + C_{S_2}^{eq} \right)} \]
τ: relaxation time (Theoretical/Experimental), similar to the time constant in process control
Solve the above equations to obtain the rate constants
K. Marimuthu and R. Chakrabarti, Sequence-Dependent Modeling of DNA Hybridization Kinetics:
Deterministic and Stochastic Theory, in preparation
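The equilibrium and relaxation-time relations give two equations in the two unknowns k_f and k_r. Substituting k_f = K k_r into 1/τ = k_r + k_f (C_S1,eq + C_S2,eq) gives k_r = 1 / (τ (1 + K (C_S1,eq + C_S2,eq))). The Python sketch below carries out that algebra. The function name and the numerical inputs are illustrative assumptions; ΔG is taken in kcal/mol and concentrations in mol/L.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def rate_constants(delta_g: float, temp_k: float, tau: float,
                   c_s1_eq: float, c_s2_eq: float):
    """Return (k_f, k_r) from the equilibrium constant and relaxation time.

    delta_g : hybridization free energy in kcal/mol (e.g. from a
              nearest-neighbor model; assumed given)
    tau     : relaxation time in seconds (assumed given)
    """
    big_k = math.exp(-delta_g / (R_KCAL * temp_k))          # K = k_f / k_r
    k_r = 1.0 / (tau * (1.0 + big_k * (c_s1_eq + c_s2_eq)))  # 1/s
    k_f = big_k * k_r                                        # 1/(M s)
    return k_f, k_r


# Hypothetical inputs: dG = -12 kcal/mol at 330 K, tau = 2 s,
# equilibrium strand concentrations of 1e-8 M each.
k_f, k_r = rate_constants(-12.0, 330.0, 2.0, 1e-8, 1e-8)
```

Substituting the returned values back into the two relations recovers the input K and τ, which is a quick consistency check on the algebra.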
DNA Amplification Control Problem and Cancer Diagnostics
Wild Type
DNA
Mutated
DNA
DecydEd Systems Biology Decision-Making Example
Feed the PCR State
Equations
Objective Function
(noncompetitive,
competitive)
DecydEd launched its business platform, called The Academic Financial Trading Platform
(AFTP) in November 2012, with engineering to follow in Summer 2013
The DecydEd Backend Technology
• The DecydEd backend collects the latest simulation, optimization and estimation algorithms from the world’s top research centers
• The DecydEd Model API is an application programming interface (API) that supports integrating a continuous influx of models with optimization and estimation algorithms.
• Instructors from both academia and industry can contribute models built using standard modeling packages (e.g. AIMMS, GAMS) for use by DecydEd students
• The backend employs MPI-based parallel computing that scales to large numbers of users through on-demand deployment of cloud instances
• PMC Group plans to integrate open source mathematical programming and dynamic optimization libraries/solvers such as IPOPT and GLPK with the DecydEd backend
• “f”: linear objective function
• Energy constraint
• Can only have 1 rotamer at each position
• No “impossibles” allowed
• Nonlinear constraint term
Possible collaborations to identify the global optimum of the fitness measure (with pairwise decomposability assumptions and a reduced energy model)
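Under a pairwise decomposability assumption, the energy splits into single-position terms plus position-pair terms, and the search picks exactly one rotamer per position while excluding forbidden ("impossible") pairs. The Python sketch below finds the global optimum by brute-force enumeration. That is a stand-in for illustration on tiny instances, not the mathematical programming formulation on the slide, and the energy tables and function name are made up.

```python
import math
from itertools import product


def best_assignment(self_e, pair_e, forbidden=frozenset()):
    """Exhaustively find the lowest-energy choice of one rotamer per position.

    self_e    : self_e[i][r] is the energy of rotamer r at position i
    pair_e    : pair_e[(i, j)][(r, s)] is the interaction energy for i < j
                (missing entries count as 0.0)
    forbidden : set of ((i, r), (j, s)) rotamer pairs that may not co-occur
    """
    n = len(self_e)
    best, best_energy = None, math.inf
    # product(...) enforces "exactly one rotamer at each position".
    for choice in product(*(range(len(e)) for e in self_e)):
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
        if any(((i, choice[i]), (j, choice[j])) in forbidden for i, j in pairs):
            continue  # skip assignments containing an "impossible" pair
        energy = sum(self_e[i][choice[i]] for i in range(n))
        energy += sum(pair_e.get((i, j), {}).get((choice[i], choice[j]), 0.0)
                      for i, j in pairs)
        if energy < best_energy:
            best, best_energy = choice, energy
    return best, best_energy


# Two positions with two rotamers each; all energies are made-up numbers.
self_e = [[0.0, 1.0], [0.5, 0.0]]
pair_e = {(0, 1): {(0, 1): 2.0, (1, 0): -3.0}}

free_best = best_assignment(self_e, pair_e)
constrained_best = best_assignment(self_e, pair_e, forbidden={((0, 1), (1, 0))})
```

Enumeration grows exponentially with the number of positions, which is why a formulation that an optimization solver can exploit matters at realistic sizes.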