talkUPP04 - Duke University

Molecular Computations Using Self-Assembled DNA
Nanostructures and Autonomous Motors
John H. Reif
Department of Computer Science, Duke University
DNA Nanostructures:
DNA tiles: composed of a few strands of DNA that self-assemble (via
DNA annealing) into a roughly rectangular shape.
DNA TX tile Nanostructures:
• 3 double-stranded DNA helices with Holliday junctions
• Approx 20 Angstroms wide
[Figure: base sequences and strand numbering of the TAO 35 and TAE 44 tiles]
Strand topology traces of TX tiles.
• ‘T’ denotes three DNA helices
• ‘A’ denotes anti-parallel crossovers
(strand changes direction of propagation at crossover points)
• ‘O’ and ‘E’ denote an odd and an even number, respectively, of
helical half-turns between adjacent crossover points.
DNA Hybridization and Ligation Operations.
Hybridization of sticky single-strand DNA segments.
Ligation: If the sticky single-strand segments that anneal abut
double-stranded segments of DNA, an enzymatic reaction known as
ligation can be used to concatenate these segments.
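The hybridization condition can be illustrated with a minimal sketch: two sticky ends, both written 5' to 3', can anneal when one is the exact reverse complement of the other. The function names are illustrative only and do not come from the talk.

```python
# Watson-Crick pairing table for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def can_hybridize(sticky_a: str, sticky_b: str) -> bool:
    """True if two single-stranded sticky ends (both written 5'->3')
    are exact Watson-Crick complements of each other."""
    return sticky_a.upper() == reverse_complement(sticky_b)


print(can_hybridize("GGTAG", "CTACC"))  # True
print(can_hybridize("GGTAG", "GGTAG"))  # False
```

Real annealing also depends on temperature and salt conditions, which this exact-match check ignores.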
Unique Sticky Ends on DNA tiles. Input layers can be assembled via
unique sticky-ends at each tile joint thereby requiring one tile type for each
position in the input layer.
Tiling self-assembly: proceeds by the selective hybridization of the pads
of distinct tiles, which allows tiles to compose together to form a
controlled tiling lattice (these pads determine the form of the tiling
that self-assembles).
Technical Challenge: Optimal Design of strands composing tiles.
2D DNA Self-Assembled Tilings:
Rendering Simple Banded Images
B* Tiles with Loops
Atomic Force Microscope Image
Bands Generated by B* Tiles
with Attached Beads
Major Goals of Duke DNA Nanotechnology
(Reif’s Group)
1) Patterned DNA Arrays.
• Set of specific tiles which form patterns.
• Assembly around scaffold strands.
• Molecular fabric.
2) Computation via DNA Self-Assembly
• Reporter strand output (requires ligation).
• Microscopic readout (via AFM, TEM, SEM, etc.).
3) Applications of DNA-Based Assemblies.
• Molecular and nano-scale electronics.
• Molecular motors and actuators.
4) Software for Design and Simulation of DNA Assemblies.
Background Literature on DNA Self-Assembled Tiling Lattices.
First Paper:
[Winfree and Seeman,98] The first experimental demonstration of self-assembly
of DNA to construct 2D lattices consisting of up to a hundred thousand DNA
tiles.
Recent Review papers:
• [Reif, LaBean, Seeman, 2000]
Challenges and Applications for Self-Assembled DNA Nanostructures.
• [J. H. Reif, IEEE Computer and Scientific Engineering Magazine, 2002]
DNA Lattices: A Method for Molecular Scale Patterning and
Computation.
• [LaBean, Yan, Park, Feng, Yin, Li, Ahn, Liu, Guan, and Reif, Information
Sciences, 2004] Overview of New Structures for DNA-Based Nanofabrication
and Computation.
Reif Group’s Papers on DNA Self-Assembled Tiling Lattices.
• [LaBean, Winfree, Reif, & Seeman, J. Am. Chem. Soc. 2000] Improved DNA
nanostructures: TX tiles that have multiple DNA strands running through them.
• [Mao, LaBean, Reif, Seeman, Nature 2000] Experimentally demonstrated for the first time
a computation using self-assembled DNA lattices of TX tiles that self-assembled around
input strands running through the tiles.
• [Yan, Feng, LaBean, and Reif, JACS, 2003] DNA Nanotubes, Parallel Molecular
Computation of Pair-Wise XOR Using DNA String Tile.
• [Yan, LaBean, Feng, and Reif, PNAS, 2003] Directed Nucleation Assembly of Barcode
Patterned DNA Lattices.
• [Yan, Park, Finkelstein, Reif, & LaBean, Science, 2003] DNA-Templated Self-Assembly of
Protein Arrays and Highly Conductive Nanowires.
• [Feng, Park, Reif, and Yan, Angewandte Chemie, 2003] A Two State DNA Lattice Actuated
by DNA Motors.
• [Li, Park, Reif, LaBean, Yan, JACS 2004] DNA Templated Self-Assembly of Protein and
Nanoparticle Linear Arrays.
• [Liu, Reif, LaBean, PNAS 2004] Improved DNA nanostructures: DNA nanotubes
self-assembled from triple-crossover tiles as templates for conductive nanowires.
Molecular Pattern Formation using Scaffold
Strands for Directed Nucleation:
• Multiple tiles of an input layer can be assembled around a single, long DNA strand
we refer to as a scaffold strand (shown as black lines in the figures).
• Examples of arrangements of scaffold strands (figure panels A, B, C):
– (A) Diagonal TAO layer, which partially defines binding slots for tiles of the next
successive layer.
– (B) Horizontal layer of alternating TAE and DAE tiles.
– (C) Crenellated horizontal layer, which could be composed of TAE or DAE tiles.
Structures in B and C completely define the binding slots for tiles on the next layers.
Directed Nucleation Assembly of Barcode
Patterned DNA Lattices (PNAS 2003)
Hao Yan, Thomas H. LaBean, Liping Feng, John H. Reif
Aperiodic patterned DNA lattice (Barcode Lattice) by
directed nucleation self-assembly of DNA tiles around a
scaffold DNA strand.
A first step toward implementation of a visual readout system
capable of converting information encoded on a 1-D DNA strand
into a 2-D form readable by advanced microscopic techniques.
TileSoft: Sequence Optimization Software
for Designing DNA Secondary Structures
P. Yin*, B. Guo*, C. Belmore*, W. Palmeri*, E. Winfree†,
T. H. LaBean* and J. H. Reif*
* Department of Computer Science, Duke University
† Department of Computer Science, Caltech
Abstract
DNA is a crucial construction material for molecular scale objects with nano-scale features.
Diverse synthetic DNA objects hold great potential for applications such as nano-fabrication,
nano-robotics, nano-computing, and nano-electronics. The construction of DNA objects is
generally carried out via self-assembly. During self-assembly, DNA strands are guided by their
sequence information into secondary structures to maximize Watson-Crick pairing of their bases
and thus minimize the free energy of the resultant structures. A crucial computational problem in
constructing DNA objects is the design of DNA sequences that can correctly assemble into
desired DNA secondary structures. However, existing software packages provide only
unintuitive text-line interfaces and generally require the user to step through the entire sequence
selection process, which can be time-consuming and tedious. In contrast, TileSoft, described
here, delivers the following features:
• Its graphical user interface gives the molecular architect the ability to define DNA
secondary structure and accompanying design constraints directly on the interface, as
well as the ability to view the optimized sequence information pictorially.
• Its fully automatic optimization module relieves the user of the drudgery of manually
dictating the sequence selection process, and its evolutionary algorithm produces
satisfactory results efficiently.
• Its graphical user interface and its optimization module are smoothly integrated from
the user's perspective, while being well separated in terms of software
architecture, making each amenable to future improvements without negatively affecting
the other.
Motivation: Designing DNA tiles
• DNA as nano-construction material
• Self-assembly as bottom-up nano-construction method
• DNA lattices made of DNA tiles, i.e. smaller DNA
secondary nanostructure units
• Design process:
1. Template design
2. Sequence selection (optimization)
[Figure panels: DX lattice; Rhombus lattice; TX lattice; 4 x 4 lattice; Barcode lattice;
tile template to be optimized]
Prior work vs. TileSoft
• DNA word design software:
– Produce a pool of DNA sequences such that each sequence is of maximal difference from
others
• Sequin:
– Generate DNA sequences that uniquely assemble into desired secondary structures
– Text line interface for inputting template and displaying optimized sequences
– Sequence optimization process only semi-automated
• TileSoft: Available at http://www.cs.duke.edu/~py/dnaTileSoft/.
– Generate DNA sequences that uniquely assemble into desired secondary structures
– Graphical interface for inputting template and displaying optimized structure (GUI
Module)
– Sequence optimization process fully automated (Optimization Module)
– GUI Module and Optimization Module smoothly integrated for end users, yet well
separated in software architecture
GUI: Default window
GUI: Define Crossover
• The user can define crossovers between helices by clicking, in sequence, the
two bases to be connected in 5' to 3' order.
GUI: Set 5’ end; set 3’ end
• By setting the 5' end and 3' end of a DNA strand, the user specifies the length of
the strand, and the unused segment of the strand is deleted automatically
(shown in gray).
GUI: Edit base
• The user can directly input the base values for a strand by typing. Typing more
than one character edits consecutive bases in the 5' to 3' direction along the strand.
GUI: Set non-Watson-Crick base pairing
• The user can define the subsequences that are not required to be Watson-Crick
base paired by clicking on the starting and ending bases of the subsequences.
GUI: Set and show EQ constraint
• Set EQ constraint: Clicking on two bases in one strand defines the starting and
ending points of the first sub-sequence, and a click on another base delineates the
second sub-sequence with the same length and direction as the first one.
• Show EQ constraint: A small window is brought up that contains multiple buttons,
with each representing a set of equal sub-sequences. When one of these buttons is
clicked, the corresponding sub-sequences will be highlighted in purple.
Optimization module
• The optimization module employs an evolutionary algorithm to find the best solution to
the optimization objective function.
• Evolutionary algorithm:
– An evolutionary algorithm maintains a population of DNA sequences, which are
generated randomly during initialization. During selection, the fittest DNA
sequences are chosen for reproduction, based on their score according to the
objective function. These individuals are used to generate new individuals via
mutations and crossovers, and the newly produced individuals are reinserted into
the population. The process is repeated until some termination condition is met.
• Objective function:
– The objective function consists of two weighted factors: the count of unwanted
complementary subsequences, i.e. spurious matches (as in the sequence-symmetry
minimization algorithm used in Sequin), and the count of to-be-avoided
subsequences, e.g. long AT runs.
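A minimal sketch of such an evolutionary loop is given here for a single strand that is meant to stay unstructured. It is not TileSoft's implementation: the window length `k`, the run length, the weights, the population settings, and all function names are assumptions chosen for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
BASES = "ACGT"


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def spurious_matches(seq, k=4):
    """Count pairs of length-k windows that are reverse complements of each
    other (an assumed stand-in for the spurious-match count)."""
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    count = 0
    for i, w in enumerate(windows):
        rc = reverse_complement(w)
        count += sum(1 for other in windows[i + 1:] if other == rc)
    return count


def avoided_subsequences(seq, run=4):
    """Count windows of `run` bases made only of A/T (long AT runs)."""
    return sum(1 for i in range(len(seq) - run + 1)
               if set(seq[i:i + run]) <= {"A", "T"})


def objective(seq, w_match=1.0, w_avoid=1.0):
    """Weighted sum of the two penalty counts; lower is better."""
    return w_match * spurious_matches(seq) + w_avoid * avoided_subsequences(seq)


def mutate(seq, rng):
    """Replace one randomly chosen base."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(BASES) + seq[i + 1:]


def crossover(a, b, rng):
    """Single-point crossover of two equal-length sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(length=30, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    # Initialization: a random population of sequences.
    population = ["".join(rng.choice(BASES) for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=objective)
        if objective(population[0]) == 0:        # termination condition
            break
        parents = population[:pop_size // 2]     # selection of the fittest
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children          # reinsertion
    return min(population, key=objective)


best = evolve()
print(best, objective(best))
```

Keeping the parents in the population means the best score never gets worse from one generation to the next. A real tile design would also exclude the intended Watson-Crick regions from the spurious-match count.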
Future work
• GUI:
– Geometrically more flexible structures
– Make the number of helices and number of bases per helix user specifiable
• Optimization:
– Multiple parallel traces of optimization process with different starting points
– A pre-optimized library
– Incorporate parameters such as hybridization temperature and software modules
such as BIND
– A new heuristic that performs optimization based on existing pre-optimized duplex
libraries
• Other:
– Curvature analyzer (S. H. Park, Duke Physics)
– Make the software more robust
4 x 4 Tile
A Lattice of 4 x 4 Tiles
Uncorrugated 4 x 4 tile
Software for Designing and Simulating
DNA Nanorobotical Devices
John Reif, Sudheer Sahu, Peng Yin
Department of Computer Science, Duke University
DNA-Based Nano-Engineering: DNA and its Enzymes as
the Engines of Creation at the Molecular Scale
John H. Reif
Department of Computer Science, Duke University
Motivation-Device I-Device II-Device III-Conclusion
Motivation
DNA-based nanorobotic devices:
• Rotation (Mao et al 99)
• Open/close (Yurke et al 00)
• Open/close (Simmel et al 01)
• Open/close (Simmel et al 02)
• Rotation (Yan et al 02)
• Extension/contraction (Li et al 02)
• Extension/contraction (Alberti et al 03)
• Extension/contraction (Feng et al 03)
Motivation
• DNA nanorobotics: rotation, open/close, and extension/contraction
mediated by environmental changes
• Autonomous, unidirectional motion along an extended linear track:
kinesin (R. Cross Lab)
• Synthetic unidirectional DNA walker that moves autonomously
along a linear route over a macroscopic structure?
(Recent work: non-autonomous DNA walking device by Seeman’s group,
autonomous DNA tweezer by Mao’s group)
Structural Components
• Design Part
• Sequence Optimization Part
• Simulation Part
I/O Specification for Design
• Input:
– Symbolic relation between various parts of
DNA sequences
• Output:
– Suitable restriction enzymes, sequences and
optimal values of experimental conditions
Design
Sequence Optimization
Simulation
Design Part
• 3-parameter model for restriction enzymes
– r,d,l
Design Part
(continued….)
• All restriction enzymes can be expressed via these
parameters.
• The cleaving actions on the DNA strands in the
device are translated into these parameters, and as
output we get a list of all restriction enzymes that
can be used for the purpose.
Example
• r=6, d=16, l=14
• Restriction enzymes that can be used
– Acu I
– Bpm I
– BpuE I
– Bsg I
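A lookup of this kind can be sketched as a filter over an enzyme table. The slides do not define r, d and l in the surviving text, so the sketch assumes r is the length of the recognition site and d and l are the distances from the site to the cut on the top and bottom strands, which is consistent with the example values. The table is hand-entered for illustration and should be checked against a current supplier catalog.

```python
from typing import List, NamedTuple


class Enzyme(NamedTuple):
    name: str
    site: str  # recognition sequence, 5'->3'
    d: int     # assumed: bases from the site to the cut on the top strand
    l: int     # assumed: bases from the site to the cut on the bottom strand

    @property
    def r(self) -> int:
        """Assumed meaning of r: length of the recognition site."""
        return len(self.site)


# Small illustrative table in the usual "SITE(d/l)" notation.
ENZYMES = [
    Enzyme("Acu I", "CTGAAG", 16, 14),
    Enzyme("Bpm I", "CTGGAG", 16, 14),
    Enzyme("BpuE I", "CTTGAG", 16, 14),
    Enzyme("Bsg I", "GTGCAG", 16, 14),
    Enzyme("Fok I", "GGATG", 9, 13),
    Enzyme("Bsa I", "GGTCTC", 1, 5),
]


def find_enzymes(r: int, d: int, l: int) -> List[str]:
    """Return the names of all enzymes whose parameters equal (r, d, l)."""
    return [e.name for e in ENZYMES if (e.r, e.d, e.l) == (r, d, l)]


print(find_enzymes(6, 16, 14))  # ['Acu I', 'Bpm I', 'BpuE I', 'Bsg I']
```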
I/O for Sequence Optimization
• Input
– Topologies of system in various conformations
– Template sequences and constraints on them
• Output
– Exact optimized sequences for these template
sequences
Procedure
• Find out the constraints for WC matches, for the given conformation.
• Assign bases randomly to degenerate bases.
• Calculate SCORE on the basis of:
– Count of bad oligos like GGG, GGGG, TTTT, AAAA
– Counts of complementary match regions (with and without one
mismatch) at intra-molecular, inter-molecular intra-complex and
inter-complex regions.
• The goal is to find an optimal structure that minimizes the score.
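The SCORE computation can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the window length `k` and the weights are hypothetical, all intra- and inter-molecular window pairs are pooled into one count, and intended Watson-Crick regions are not excluded.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
BAD_OLIGOS = ("GGG", "GGGG", "TTTT", "AAAA")


def revcomp(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def count_bad_oligos(strand):
    """Count overlapping occurrences of every bad oligo in one strand."""
    return sum(
        1
        for oligo in BAD_OLIGOS
        for i in range(len(strand) - len(oligo) + 1)
        if strand[i:i + len(oligo)] == oligo
    )


def windows(strands, k):
    """Yield every length-k window of every strand."""
    for strand in strands:
        for i in range(len(strand) - k + 1):
            yield strand[i:i + k]


def count_complementary(strands, k=5):
    """Return (exact, one_mismatch): the number of window pairs that are
    complementary with zero or with exactly one mismatch."""
    exact = near = 0
    for w1, w2 in combinations(windows(strands, k), 2):
        mismatches = sum(a != b for a, b in zip(w1, revcomp(w2)))
        if mismatches == 0:
            exact += 1
        elif mismatches == 1:
            near += 1
    return exact, near


def score(strands, w_bad=1.0, w_exact=1.0, w_near=0.5):
    """Weighted penalty to be minimized (the weights are hypothetical)."""
    bad = sum(count_bad_oligos(s) for s in strands)
    exact, near = count_complementary(strands)
    return w_bad * bad + w_exact * exact + w_near * near


print(score(["AAAAA", "TTTTT"]))  # 5.0
```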
Issues
• We can mutate one base at a time and evaluate the new
structure.
– Local minima.
• An evolutionary algorithm can be used for that with a
fitness function similar to the score function defined
earlier.
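The one-base-at-a-time strategy can be sketched as a simple hill climb. The toy score below counts only bad oligos and is an assumption made to keep the example short. Because only non-worsening single-base moves are accepted, a search of this kind can stall in a local minimum when the score is more rugged, which is the issue raised on this slide.

```python
import random

BASES = "ACGT"
BAD_OLIGOS = ("GGG", "TTTT", "AAAA")


def score(seq):
    """Toy score: number of bad-oligo occurrences (lower is better)."""
    return sum(
        seq[i:i + len(oligo)] == oligo
        for oligo in BAD_OLIGOS
        for i in range(len(seq) - len(oligo) + 1)
    )


def hill_climb(seq, steps=500, seed=0):
    """Mutate one base at a time and keep the change only if the score
    does not get worse."""
    rng = random.Random(seed)
    best, best_score = seq, score(seq)
    for _ in range(steps):
        i = rng.randrange(len(best))
        candidate = best[:i] + rng.choice(BASES) + best[i + 1:]
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


start = "GGGGAAAATTTT"
result, result_score = hill_climb(start)
print(score(start), "->", result_score)
```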
Dynamic case
• Number of conformations small
– Easy to extend this method (just add a few more constraints)
• Number of conformations large
– Difficult problem
– Consider one (or more) basic conformation(s) and the local changes in it
(them), and apply the method to this group as a whole
• C0 is the initial conformation
• Ci is the local change in conformation in going from Ci-1 to Ci
• Consider the group C0, C1, C2, …, Cn for the optimization as a whole.
I/O for Simulation
• Input
– Initial conformation of the system
– Various external factors like
• temperature
• Na+, Mg++ concentrations
• Output
– Graphical display of the simulation
Simulation Software
• DNA molecules represented as
– A string representing the sequence of 1st strand from 5’
to 3’ end.
– Another string representing the sequence of 2nd strand
from 3’ to 5’ end.
– Offset between the two sequences.
• With each DNA molecule we associate the
numerical value of its concentration.
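One way to hold this representation is a small record type, sketched below. The class and method names are hypothetical, and the sign convention for the offset (the column of the first base of the second strand relative to the first base of the first strand) is an assumption, since the slides do not state it.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass
class DNAMolecule:
    top: str              # 1st strand, written 5' -> 3'
    bottom: str           # 2nd strand, written 3' -> 5'
    offset: int           # assumed: column of bottom[0] relative to top[0]
    concentration: float  # units are left to the caller

    def left_overhang(self) -> str:
        """Unpaired bases at the left end, from whichever strand sticks out."""
        if self.offset > 0:
            return self.top[:self.offset]
        return self.bottom[:-self.offset]

    def is_paired(self) -> bool:
        """True if every overlapping column is a Watson-Crick pair."""
        start = max(0, self.offset)
        end = min(len(self.top), self.offset + len(self.bottom))
        return all(
            COMPLEMENT[self.top[i]] == self.bottom[i - self.offset]
            for i in range(start, end)
        )


molecule = DNAMolecule(top="GATCACGT", bottom="TGCA", offset=4, concentration=1.0)
print(molecule.left_overhang(), molecule.is_paired())  # GATC True
```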
Assumptions
• Discrete time events.
• Restriction requires an exact match of the
recognition sites.
• For simplification, we assume that ligation
requires an exact complement of the sticky
end.
Ligation Events
• For every DNA molecule, calculate the probability
of it ligating with any other molecule, based upon
the concentration of the two molecules (only if
they have complementary sticky ends).
• Allow that ligation to occur with that probability.
Restriction Events
• In every molecule, find out if there is any
restriction site, then depending upon the
activity of the restriction enzyme, find the
probability of that restriction.
• Allow that restriction to occur with that
probability.
Simulation Steps
1. Start with the initial configuration of the system.
2. Calculate the probability that the next event will be a
ligation or a restriction.
3. Calculate the probabilities of various ligations and
restrictions.
4. Perform those events with the calculated probabilities.
5. Update the configuration of the system and the
concentrations of the molecules and go to step 2.
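These steps can be sketched as a discrete-event loop. This is a simplified illustration, not the simulator described in the talk: species are plain labels with integer copy numbers instead of concentrations, the possible ligations and restrictions are listed by the caller, and `k_lig` and `k_cut` are hypothetical activity parameters.

```python
import random


def simulate(counts, ligations, restrictions, k_lig=1.0, k_cut=1.0,
             max_events=1000, seed=0):
    """counts: dict mapping species -> copy number.
    ligations: list of (a, b, product) for complementary sticky ends.
    restrictions: list of (substrate, left, right) for each cut."""
    rng = random.Random(seed)
    counts = dict(counts)                      # step 1: initial configuration
    for _ in range(max_events):
        # Steps 2-3: weight every possible ligation and restriction event.
        events, weights = [], []
        for a, b, product in ligations:
            events.append(("ligate", (a, b), (product,)))
            weights.append(k_lig * counts.get(a, 0) * counts.get(b, 0))
        for substrate, left, right in restrictions:
            events.append(("cut", (substrate,), (left, right)))
            weights.append(k_cut * counts.get(substrate, 0))
        if sum(weights) == 0:
            break                              # no event is possible any more
        # Step 4: pick one event with probability proportional to its weight.
        _, consumed, produced = rng.choices(events, weights=weights)[0]
        # Step 5: update the configuration, then loop back to step 2.
        for species in consumed:
            counts[species] -= 1
        for species in produced:
            counts[species] = counts.get(species, 0) + 1
    return counts


final = simulate({"X": 5, "Y": 3}, ligations=[("X", "Y", "XY")], restrictions=[])
print(final)  # {'X': 2, 'Y': 0, 'XY': 3}
```

Weighting a ligation by the product of the two copy numbers mirrors the slide's statement that its probability depends on the concentrations of both molecules.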
Example Design of DNA Walker
[Figure labels: Walker, Anchorage, Track; positions A*, B, C, D, A;
restriction enzymes PflM I and BstAP I]
DNA Walker: Operation
• Valid hybridization:
A* + B = A + B* => A*B
B* + C = B + C* => B*C
C* + D = C + D* => C*D
D* + A = D + A* => D*A
• Valid cut:
A*B => A + B*
B*C => B + C*
C*D => C + D*
D*A => D + A*
[Figure labels: Walker, Anchorage, Track; positions A*, B, C, D, A]
• Successive animation frames apply the valid hybridization and valid cut rules
listed above; the walker position is marked by *:
– A*B, C, D, A (A* hybridizes with B)
– PflM I cuts A*B
– A, B*, C, D, A
– A, B*C, D, A (B* hybridizes with C)
– BstAP I cuts B*C
– A, B, C*, D, A
– A, B, C*D, A (C* hybridizes with D)
– A, B, C, D*, A
– A, B, C, D*A (D* hybridizes with A)
– A, B, C, D, A*
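The walker's progress under the hybridization and cut rules can be traced with a small state machine. The rule tables are copied from the slides; the function name and the list-based track representation are illustrative assumptions, and the sketch ignores enzyme kinetics.

```python
# Valid hybridization: (walker end, next anchorage) -> duplex.
HYBRIDIZE = {
    ("A*", "B"): "A*B",
    ("B*", "C"): "B*C",
    ("C*", "D"): "C*D",
    ("D*", "A"): "D*A",
}

# Valid cut: duplex -> (anchorage left behind, walker on the next anchorage).
CUT = {
    "A*B": ("A", "B*"),
    "B*C": ("B", "C*"),
    "C*D": ("C", "D*"),
    "D*A": ("D", "A*"),
}


def walk(track):
    """Return every intermediate track state; the walker ('*') starts at
    position 0 and moves one anchorage to the right per cycle."""
    track = list(track)
    history = [list(track)]
    pos = 0
    while pos + 1 < len(track):
        duplex = HYBRIDIZE.get((track[pos], track[pos + 1]))
        if duplex is None:
            break                                   # no valid hybridization
        # Hybridization joins the walker's anchorage to the next one.
        history.append(track[:pos] + [duplex] + track[pos + 2:])
        # The cut releases the old anchorage and moves the walker forward.
        track[pos], track[pos + 1] = CUT[duplex]
        history.append(list(track))
        pos += 1
    return history


for state in walk(["A*", "B", "C", "D", "A"]):
    print(", ".join(state))
```

Each cycle alternates one hybridization with one cut, so the walker can only move forward along the track: the anchorage it leaves behind no longer carries the walker end needed for a valid hybridization.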
DNA Walker: Experimental Design
Autonomous Motion of the Walker
DNA Turing Machine: Structure
Turing machine
Transitional rules: Rule molecules
Turing head: Head molecules
Data tape: Symbol molecules
Autonomous universal DNA Turing machine: 2 states, 5 colors