Avogadro Scale Engineering Physics of Information Technology MIT – Spring 2006 PART II

Avogadro Scale Engineering
‘COMPLEXITY’
Physics of Information Technology
MIT – Spring 2006
PART II
jacobson@media.mit.edu
Homework
• I] Nanotech Design:
Find an error function for which it is optimal to divide a
logic area A into more than one redundant sub-Areas.
• II] Design Life:
(a) Design a biological system which self replicates with
error correction (either genome copy redundancy with
majority voting or error correcting coding). Assume the
copying of each nucleotide is consumptive of one unit of
energy. Show the tradeoff between energy consumption
and copy fidelity.
(b) Comment on the choice biology has taken: 64 three-nucleotide
codons coding for 20 amino acids. Why has biology chosen this
encoding? What metric does it optimize? Could one build a
biological system with 256 four-nucleotide codons?
Questions: jacobson@media.mit.edu
Scaling Properties of Redundant Logic (to first order)
Probability of correct functionality: $p[A] \sim eA$ (small $A$), with $e$ a constant coefficient
Single block, Area = A:
$P_1 = p[A] = eA$
Two redundant blocks, Area = 2*A/2:
$P_2 = 2p[A/2](1 - p[A/2]) + p[A/2]^2 = eA - (eA)^2/4$
Conclusion: $P_1 > P_2$
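The comparison above can be checked numerically. This is a minimal sketch, assuming the slide's first-order model p[A] = eA with an arbitrary illustrative constant e = 0.1, and reading P2 as the probability that at least one of the two half-area blocks works. The function names are hypothetical.

```python
def p_linear(area, e=0.1):
    """First-order model from the slide: p[A] = e*A (small A). e is an assumed value."""
    return e * area


def p_single(area, p=p_linear):
    """P1: one block occupying the whole area A."""
    return p(area)


def p_split(area, p=p_linear):
    """P2: two redundant blocks of area A/2 each; at least one must work."""
    h = p(area / 2)
    return 2 * h * (1 - h) + h ** 2


if __name__ == "__main__":
    for area in (0.5, 1.0, 2.0):
        # With a linear p, the second column is always smaller than the first.
        print(area, p_single(area), p_split(area))
```

Passing a different function as `p` is a quick way to explore other error functions, which is what Homework I asks for.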
Designing Life
Redundancy: Fault Tolerant, Error Correcting
Other Coding (e.g. Parity): Fault Tolerant, Error Correcting
Designing Life
I] Fault Tolerant Redundancy
Gene1 Gene2 Gene3 Gene1 Gene2 Gene3
1. Replicate Linearly with Proofreading and Error Correction
Fold to 3D Functionality
Error Rate: 1:10^6
100 steps per second
template-dependent 5'-3' primer extension
3'-5' proofreading exonuclease
5'-3' error-correcting exonuclease
Beese et al. (1993), Science, 260, 352-355.
http://www.biochem.ucl.ac.uk/bsm/xtal/teach/repl/klenow.html
MutS Repair System
Approach 1b] Redundant Genomes
Deinococcus radiodurans
(3.2 Mb, 4-10 Copies of Genome )
[Nature Biotechnology 18, 85-90
(January 2000)]
D. radiodurans: 1.7 Million Rads (17 kGy) – 200 DS breaks
E. coli: 25 Thousand Rads – 2 or 3 DS breaks
Uniformed Services University of the Health
http://www.ornl.gov/hgmis/publicat/microbial/image3.html
Combining Error Correcting Polymerase and
Error Correcting Codes One Can Replicate a
Genome of Arbitrary Complexity
Basic Idea: M strands of N Bases
Result: By carrying out a consensus vote one requires only
$M \sim \ln N$
strands to replicate with error below some epsilon, such that the global replication error is:
$P_E < \epsilon$
[Figure: M (for $P_E < \epsilon$) plotted against N (Genome Length); N from 100 to 500, M from 10 to 30.]
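The logarithmic scaling of M with N can be explored with a short calculation. This is a sketch under stated assumptions, not the model behind the slide's plot: each base in each strand is copied wrongly with independent probability q, a per-base majority vote is taken over an odd number M of strands, and (pessimistically) all wrong copies are assumed to agree with each other. The values q = 0.1 and eps = 0.01 and all function names are illustrative.

```python
from math import comb, expm1, log1p


def vote_error(q, m):
    """Probability that more than half of m copies are wrong at one base."""
    return sum(comb(m, k) * q ** k * (1 - q) ** (m - k)
               for k in range(m // 2 + 1, m + 1))


def global_error(q, m, n):
    """Probability that at least one of n voted bases is wrong.

    Computed as 1 - (1 - vote_error)**n via expm1/log1p for accuracy
    when vote_error is tiny.
    """
    return -expm1(n * log1p(-vote_error(q, m)))


def strands_needed(q, n, eps, m_max=199):
    """Smallest odd m whose global error is below eps."""
    for m in range(1, m_max + 1, 2):
        if global_error(q, m, n) < eps:
            return m
    raise ValueError("no m <= m_max reaches the target error")


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000, 1_000_000):
        # Each 10x increase in genome length adds a roughly constant number of strands.
        print(n, strands_needed(q=0.1, n=n, eps=0.01))
```

Because the per-base vote error falls off exponentially in M, holding the global error fixed only needs M to grow like ln N.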
II] Coding
mRNA
Ribosome
Amino Acid
4 Base Parity Genetic Code
Let A=0, U,T=1, G=2, C=3
Use 3+1 base code
XYZ -> XYZP, where P = Sum(X+Y+Z, mod 4)
Leu: UUA -> UUAG
http://schultz.scripps.edu/Research/UnnaturalAAIncorporation/research.html
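The 3+1 parity scheme is small enough to write out directly. This is a minimal sketch using the slide's base values (A=0, U,T=1, G=2, C=3); the function names are hypothetical, and the check base is emitted in the RNA alphabet as an assumption.

```python
BASE_VALUE = {"A": 0, "U": 1, "T": 1, "G": 2, "C": 3}
VALUE_BASE = {0: "A", 1: "U", 2: "G", 3: "C"}  # assumed: parity base written as RNA


def parity_value(codon):
    """Sum of the base values of a 3-base codon, mod 4."""
    return sum(BASE_VALUE[b] for b in codon) % 4


def encode(codon):
    """Append the parity base to a 3-base codon."""
    if len(codon) != 3:
        raise ValueError("codon must have exactly 3 bases")
    return codon + VALUE_BASE[parity_value(codon)]


def is_valid(word):
    """True if the 4th base matches the parity of the first three."""
    return len(word) == 4 and BASE_VALUE[word[3]] == parity_value(word[:3])


if __name__ == "__main__":
    print(encode("UUA"))      # the slide's Leu example
    print(is_valid("UUAG"))   # intact word
    print(is_valid("UCAG"))   # one base substituted
```

A single parity base detects any single-base substitution (other than T for U, which share a value) but cannot locate or correct it.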
Error Correction in Biological Systems
Fault Tolerant Translation Codes (Hecht):
NTN encodes 5 different nonpolar residues
(Met, Leu, Ile, Val and Phe)
NAN encodes 6 different polar residues
(Lys, His, Glu, Gln, Asp and Asn)
Local Error Correction:
Ribozyme: 1:10^3
Error Correcting Polymerase: 1:10^8 fidelity
DNA Repair Systems:
MutS System
Recombination - retrieval - post replication repair
Thymine Dimer bypass.
Many others…
E. coli Retrieval system - Lewin
Biology Employs Error Correcting Fabrication + Error Correcting Codes
Physics of Information Technology
MIT – Spring 2006
4/10
1] Von Neumann / McCulloch / Winograd / Cowan
Threshold Theorem and Fault Tolerant Chips
2] Simple Proofs in CMOS Scaling and Fault Tolerance
3] Fault Tolerant Self Replicating Systems
4] Fault Tolerant Codes in Biology
4/24
1] Introduction of the concept of Fabricational Complexity
2] Examples, numbers and mechanisms from native biology: error correcting polymerase and comparison to best current chemical synthesis using protection group (~feedforward) chemistry.
3] Examples from our error correcting de novo DNA synthesis (with hopefully a demo from our DNA synth simulator)
4] Error correcting chip synthesis
5] Saul's self replicating system with and without error correction
Fabricational Complexity
•Total Complexity
•Complexity Per Unit Volume
•Complexity Per Unit Time*Energy
•Complexity Per unit Cost
$F_{fab} = \ln(W) / [a^3 t_{fab} E_{fab}]$
$F_{fab} = \ln(M)^{-1} / [a^3 t_{fab} E_{fab}]$
Fabricational Complexity
Total Complexity Accessible to a Fabrication Process with
Error p per step and m types of parts:
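$F_{FAB} = \sum_{n=1}^{\infty} p^n \ln m^n$
[Figure: plot against n, horizontal axis from 0 to 1000, vertical axis from 10 to 70.]
[Diagram: tree of growing part sequences (A; A G, A T, A C; A G T, A G C, …) with yields p, p^2, p^3 at successive steps.]
Complexity Per Unit Cost:
For given complexity n*:
$f_{FAB} = p^{n^*} \ln m / C$
Where C is cost per step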
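The relation between total complexity and complexity per unit cost can be sketched numerically. The summand used here is an assumption: the sum is read as yield p^n times complexity ln(m^n) = n ln m of an n-step product, which is the reading consistent with f_FAB = p^n* ln m / C once the n* steps each cost C. The value p = 0.9 and the function names are illustrative only.

```python
from math import log


def term(p, m, n):
    """Assumed summand: yield p**n times complexity ln(m**n) of an n-step product."""
    return p ** n * n * log(m)


def total_complexity(p, m):
    """Closed form of sum_{n>=1} n * p**n * ln(m), valid for 0 <= p < 1."""
    return log(m) * p / (1 - p) ** 2


def per_cost(p, m, n, cost_per_step=1.0):
    """f_FAB = p**n * ln(m) / C for a product of complexity n."""
    return p ** n * log(m) / cost_per_step


if __name__ == "__main__":
    p, m = 0.9, 4
    partial = sum(term(p, m, n) for n in range(1, 2000))
    print(partial, total_complexity(p, m))           # partial sum vs closed form
    print(term(p, m, 10) / 10, per_cost(p, m, 10))   # term / (n*C) equals f_FAB
```

Under this reading the summand peaks near n = -1/ln(p), so a lower per-step error moves the most productive product size to larger n.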
Fabricational Complexity
Non Error Correcting:
$f_{FAB} = p^{n^*} \ln m / C$
A G T C
Triply Error Correcting:
$f_{FAB3} = [3p^2(1 - p) + p^3]^{n^*} \ln m / 3C$
A G T C
A G T C
A G T C
[Figures: f_FAB3, f_FAB plotted against n (panels labeled P = 0.9 and P = 0.85) and against p from 0.86 to 0.98 (panel labeled n = 300).]
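The two per-cost expressions can be compared directly, since ln m and C cancel in their ratio. This is a minimal sketch with hypothetical function names; it assumes the triple scheme succeeds at a step when at least two of three copies are correct and costs 3C per step, as in the formulas above.

```python
from math import log, sqrt


def f_fab(p, m, n, c=1.0):
    """Non error correcting: p**n * ln(m) / C."""
    return p ** n * log(m) / c


def f_fab3(p, m, n, c=1.0):
    """Triply error correcting with majority vote, at three times the cost."""
    p_vote = 3 * p ** 2 * (1 - p) + p ** 3
    return p_vote ** n * log(m) / (3 * c)


def ratio(p, n):
    """f_FAB3 / f_FAB = (3p - 2p**2)**n / 3; m and C cancel."""
    return (3 * p - 2 * p ** 2) ** n / 3


def break_even(n):
    """Both p values where ratio == 1: roots of 2p**2 - 3p + 3**(1/n) = 0.

    Returns None when triple redundancy never pays for this n.
    """
    disc = 9 - 8 * 3 ** (1 / n)
    if disc < 0:
        return None
    r = sqrt(disc)
    return (3 - r) / 4, (3 + r) / 4


if __name__ == "__main__":
    for n in (5, 50, 300):
        print(n, break_even(n))
    print(ratio(0.9, 50), ratio(0.85, 50))
```

Redundancy wins only for p between the two roots: below the lower root voting makes matters worse, and above the upper root the product is already reliable enough that tripling the cost is not repaid.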
Resources for Exponential Scaling
Resources which increase the complexity of a
system exponentially with a linear addition of
resources
1] Quantum Phase Space
2] Error Correcting Fabrication
3] Fault Tolerant Hardware Architectures
4] Fault Tolerant Software or Codes
Fabricational Complexity
| Metric | Genome (Natural) | Gene Chip (Chemical Parallel Synthesis) | Semiconductor Chip | High Speed Offset Web | TFT | DVD-6 | Liquid Embossing |
|---|---|---|---|---|---|---|---|
| Design Rule Smallest Dimension (microns) | 0.0003 | 0.0003 | 0.1 | 10 | 2 | 0.25 | 0.2 |
| Number of Types of Elements | 4 | 4 | 8 | 6 | 8 | 2 | 4 |
| Area of SOA Artifact (Sq. Microns) | NA | 7.E+08 | 7.E+10 | 2.E+12 | 1.E+12 | 1.E+10 | 8.E+09 |
| Volume of SOA Artifact (Cubic Microns) | 6.E+01 | 5.E+06 | 7.E+09 | 2.E+12 | 1.E+11 | 7.E+12 | 8.E+08 |
| Number of Elements in SOA Artifact | 3.E+09 | 7.E+04 | 7.E+12 | 2.E+10 | 3.E+11 | 2.E+11 | 2.E+11 |
| Volume Per Element (Cubic Microns) | 2.E-08 | 8.E+01 | 1.E-03 | 1.E+02 | 4.E-01 | 4.E+01 | 4.E-03 |
| Fabrication Time (Seconds) | 4.E+03 | 2.E+04 | 9.E+04 | 1.E-01 | 7.E+02 | 3 | 6.E+01 |
| Time Per Element (Seconds) | 1.E-06 | 3.E+02 | 1.E-08 | 7.E-12 | 2.E-09 | 2.E-11 | 3.E-10 |
| Fabrication Cost for SOA Artifact ($) | 1.E-07 | 1.E+02 | 1.E+02 | 1.E-01 | 2.E+03 | 3.E-02 | 2.E-01 |
| Cost Per Element | 3.E-17 | 2.E-03 | 2.E-11 | 6.E-12 | 6.E-09 | 2.E-13 | 1.E-12 |
| Complexity | 4.E+09 | 9.E+04 | 2.E+13 | 4.E+10 | 6.E+11 | 1.E+11 | 3.E+11 |
| Complexity Per Unit Volume of SOA (um^3) | 7.E+07 | 2.E-02 | 2.E+03 | 2.E-02 | 5.E+00 | 2.E-02 | 3.E+02 |
| Complexity Per Unit Time | 1.E+06 | 6.E+00 | 2.E+08 | 3.E+11 | 9.E+08 | 4.E+10 | 5.E+09 |
| Complexity Per Unit Cost | 4.E+16 | 9.E+02 | 1.E+11 | 3.E+11 | 3.E+08 | 4.E+12 | 1.E+12 |
| Cost Per Area | NA | 2.E-07 | 2.E-09 | 6.E-14 | 2.E-09 | 3.E-12 | 3.E-11 |
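The derived rows of the comparison appear to follow from complexity = (number of elements) × ln(number of element types), divided by volume, time, or cost. That relation is an inference from the numbers rather than something the slide states, and it reproduces the Genome (Natural) entries to the one significant figure shown. A sketch with a hypothetical helper:

```python
from math import log


def derived_metrics(n_elements, n_types, volume_um3, fab_time_s, cost_usd):
    """Complexity and its per-volume, per-time, and per-cost densities.

    Assumes complexity = n_elements * ln(n_types), inferred from the table values.
    """
    complexity = n_elements * log(n_types)
    return {
        "complexity": complexity,
        "per_volume": complexity / volume_um3,
        "per_time": complexity / fab_time_s,
        "per_cost": complexity / cost_usd,
    }


if __name__ == "__main__":
    # Inputs taken from the Genome (Natural) entries.
    genome = derived_metrics(n_elements=3e9, n_types=4, volume_um3=6e1,
                             fab_time_s=4e3, cost_usd=1e-7)
    for name, value in genome.items():
        print(f"{name}: {value:.0e}")
```

The same helper can be fed the other processes' entries to see where the rounded table values agree with this relation.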
…Can we use this map as a guide towards future
directions in fabrication?
1. Replicate Linearly with Proofreading and Error Correction
Fold to 3D Functionality
Error Rate: 1:10^6
100 steps per second
template-dependent 5'-3' primer extension
3'-5' proofreading exonuclease
5'-3' error-correcting exonuclease
Beese et al. (1993), Science, 260, 352-355.
http://www.biochem.ucl.ac.uk/bsm/xtal/teach/repl/klenow.html
DNA Synthesis
Caruthers Synthesis
Error Rate: 1:10^2
300 seconds per step
http://www.med.upenn.edu/naf/services/catalog99.pdf
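The gap between the two quoted error rates compounds quickly with product length. This is a rough sketch, assuming errors are independent per step so the error-free fraction of an n-step product is (1 - error rate)^n, and using the slide figures (1:10^6 at 100 steps per second for polymerase, 1:10^2 at 300 seconds per step for Caruthers synthesis). The dictionary and function names are hypothetical.

```python
def error_free_fraction(error_rate, n_steps):
    """Fraction of n-step products with no errors, assuming independent errors."""
    return (1 - error_rate) ** n_steps


def build_time_s(n_steps, seconds_per_step):
    """Serial time to add n_steps units."""
    return n_steps * seconds_per_step


POLYMERASE = {"error_rate": 1e-6, "seconds_per_step": 1 / 100}
CARUTHERS = {"error_rate": 1e-2, "seconds_per_step": 300}

if __name__ == "__main__":
    for n in (100, 1000):
        for name, proc in (("polymerase", POLYMERASE), ("Caruthers", CARUTHERS)):
            print(n, name,
                  error_free_fraction(proc["error_rate"], n),
                  build_time_s(n, proc["seconds_per_step"]))
```

At a 1:10^2 error rate only about a third of 100-step products come out error free, which is the motivation for adding error removal after chemical synthesis.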
Avogadro Scale Engineering
Molecular Machine (Jacobson) Group – MIT - May, 2005
Gene Level Error Removal
Nucleic Acids Research 2004 32(20):e162
Error Rate 1:10^4
In Vitro Error Correction Yields >10x Reduction in Errors
Nucleic Acids Research 2004 32(20):e162
Error Reduction: GFP Gene synthesis
Nucleic Acids Research 2004 32(20):e162
Autonomous self replicating machines from
random building blocks
HOMEWORK – DUE 5/1/06
1] Consider biological cells which are able to copy their genome using appropriate
pieces of molecular machinery (e.g. polymerase). Assume that the probability of
correctly copying each nucleotide is p = 0.999. Calculate the Total Fabricational
Complexity accessible to this system, assuming that there are 4 types of
nucleotides (i.e. A, G, C, T). Now assume that we have created a new type of cell
whose genome has six different types of nucleotides (i.e. A, G, C, T, X, Y). If we
wish to keep the Total Fabricational Complexity the same, what must the
probability per nucleotide addition, p, now be?
2] Consider now the fabricational complexity per unit cost f. Calculate the threshold
probability p for which it is advantageous to use a redundant error correction scheme
(such as triple redundancy) with majority voting rather than no error correction. Into
which regime does biology fall?