Nanotechnology: Spatial Computing Using Molecular Electronics
Mihai Budiu
joint work with Seth Copen Goldstein, Dan Rosewater
SSS April 20, 2001

Intersection of Three Areas
• Reconfigurable computing
• Nanotechnology
• Computer architecture

Prophecies, A Risky Endeavor
There is no reason anyone would want a computer in their home. --- Ken Olsen
I think there is a world market for maybe five computers. --- T. J. Watson
There is not the slightest indication that nuclear energy will ever be obtainable. --- Albert Einstein
640K ought to be enough for everybody. --- Bill Gates
I will propose this semester. --- Anonymous

Moore’s Law
[Figure]

Moore’s Second Law
[Figure: plant cost and mask cost (x $1000) per generation]

Our Proposal
Nanotechnology
+ cheap
+ high-density
+ low-power
– unreliable
Reconfigurable Computing
+ defect tolerant
+ high performance
– low density
Computer architecture
+ vast body of knowledge
– expensive
– high-power

Paradigm Shift
Executable: complex fixed chip + program
Configuration: dense, regular structure + configuration

Outline
• Introduction
• Reconfigurable computing
• Nanotechnology
• Nano-architecture proposal
• Preliminary results
• Conclusions and future work

Reconfigurable Computing
• Back to ENIAC-style computing
• Synthesize one machine to solve one problem

Island-Style RC Architecture
[Figure labels: interconnection network; universal gates and/or storage elements; programmable switches]

Main RC Ingredient: RAM Cell
Universal gate = RAM: the inputs a0 and a1 address a RAM whose contents are 0, 0, 0, 1, so the data output is a0 & a1.
Switch controlled by a 1-bit RAM cell: the stored control bit decides whether "data in" passes through.

Place and Route

    /* Reverse the bit order of a 64-bit word. */
    unsigned long long reverse(unsigned long long x)
    {
        unsigned long long r = 0;
        int k;
        for (k = 0; k < 64; k++) {
            r = (r << 1) | (x & 1);  /* shift first, then append low bit of x */
            x = x >> 1;
        }
        return r;
    }

    int func(int* a,int *b)
    {
        int j,sum=0;
        for (j=0; *a>0; j++)
            sum+=reverse(*b ...

Kernel Speedup Using PipeRench
[Figure: speedup per kernel, in times over a 300 MHz UltraSparc-II, log scale from 1 to 1000. Bar values as extracted (kernel labels not recoverable): 189.7, 63.3, 57.1, 42.4, 26.0, 15.5, 11.3, 29.0, 12.0]

Defect Tolerance
Despite having >70% of the chips defective, Teramac works flawlessly.
Compilation has two phases:
• defect detection through self-testing
• placement for defect-avoidance

Nanotechnology

Predicted Features
• Low power: 10^10 gates use less than 2 W (compare to 3 x 10^7 transistors using 100 W in CMOS)
• Low cost (nanocents/gate)
• Small size (10^5 factor area gain)
[Figure: a nano-RAM cell; in yellow, a CMOS RAM cell]

Nano-wires
• carbon nanotubes, Si, metal
• >2 nm diameter, up to mm length
• excellent electrical properties
[Figure: a carbon nanotube: one molecule]

Nano-switch
[Figure]

Nano-switch Between Nano-wires
[Figure]

Self-assembly
[Figure]

No Complex Irregular Structures
[Figure]

No Three-Terminal Devices
[Figure]

Diode-resistor Logic
[Figure: AND gate with inputs A and B, supply VDD, and output A*B. Panels: nano-implementation; electrical equivalent]

Nanoscale Latches
Provide:
• signal restoration (amplification)
• clocking (synchronization)
• memory
[Figure labels: D, clock, data out]

High Defect Rate
[Figure]

The nanoBlock (3-in to 3-out Logic)
[Figure labels: CMOS, inputs, outputs, +Vdd, Gnd, clk]

Interconnecting nanoBlocks
[Figure label: switch block]

Global View
[Figure]

Many Clusters = nanoFabric
[Figure labels: cluster, control, long-lines]

Compilation
1. Program (the reverse() example from the Place and Route slide)
2. Split-phase Abstract Machines: computations & local storage; unknown-latency ops.
3. Configurations placed independently
4. Placement on chip

A Limit Study of Performance
A graph of the whole program execution.
[Figure legend: basic block; control-flow transfer; memory write; memory read; memory word]

Area (10^6 units/cm^2 available)
[Figure: memory area and code area in units (axis from 0 to 250000) for each benchmark: 099.go, 129.compress, 130.li, 132.ijpeg, adpcm_d, adpcm_e, epic_e, g721_Q_d, g721_Q_e, gsm_d, gsm_e, jpeg_d, jpeg_e, mpeg2_d]

Typical Program Graph (g721_e)
[Figure labels: memory reads; control flow transfer; 100% code cluster; 100% memory cluster]

Typical Program Graph (g721_e)
[Figure labels: memory reads; control flow transfer; code; memcpy; memory]

Program Graph After Inlining memcpy
[Figure label: memcpy]

Application Slowdown
[Figure: times slower than native for each benchmark listed under Area, with two series: 1 clock/square and 5 clocks/square]

How Time Is Spent
• No caches: reads expensive
• No speculation
[Figure: percent of time per benchmark, split into idle, execution, control flow, and register traffic]

Future Work
• Better nano-devices
• More accurate hardware models in simulations
• Compilation technology

Conclusions
• Electronic nanotechnology promises to transcend the limitations of CMOS
• Nanofabrics are very well suited to reconfigurable computation
• 10^9-gate designs can be managed through hierarchies of abstract machines
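
As a rough illustration of three ideas from the slides (the RAM cell used as a universal gate, the switch controlled by a 1-bit RAM cell, and placement for defect avoidance), the following C sketch models a small array of 2-input lookup-table cells. Everything named here (lut_cell, lut_eval, switch_pass, place_avoiding_defects) is hypothetical and chosen for illustration only. The defect flags are assumed to come from a self-test pass that is not shown, and the first-fit placement is a stand-in, not the algorithm used by Teramac or by any nanoFabric compiler.

```c
#include <stdbool.h>
#include <stddef.h>

/* A 2-input universal gate modeled as a 4-entry, 1-bit-wide RAM.
   Bit i of `table` is the output for inputs with address i = 2*a1 + a0. */
typedef struct {
    unsigned char table;  /* only the low 4 bits are used */
    bool defective;       /* assumed to be set by a self-test pass (not shown) */
} lut_cell;

/* Evaluate the gate: the two inputs form the RAM address. */
static int lut_eval(const lut_cell *c, int a0, int a1)
{
    unsigned addr = ((a1 & 1u) << 1) | (a0 & 1u);
    return (c->table >> addr) & 1;
}

/* Programmable switch: a 1-bit RAM cell decides whether data passes.
   Returns -1 to stand for "not driven" when the switch is open. */
static int switch_pass(int control_bit, int data_in)
{
    return control_bit ? (data_in & 1) : -1;
}

/* First-fit defect-avoiding placement (an illustrative stand-in):
   program each truth table into the next non-defective cell.
   placement[g] receives the cell index chosen for gate g.
   Returns the number of gates actually placed. */
static size_t place_avoiding_defects(lut_cell *cells, size_t n_cells,
                                     const unsigned char *tables,
                                     size_t n_gates, size_t *placement)
{
    size_t g = 0;
    for (size_t c = 0; c < n_cells && g < n_gates; c++) {
        if (cells[c].defective)
            continue;               /* skip cells the self-test rejected */
        cells[c].table = tables[g] & 0xF;
        placement[g] = c;
        g++;
    }
    return g;
}
```

With the truth table 0x8 (RAM contents 0, 0, 0, 1) a cell behaves as the AND gate from the RAM-cell slide, and 0x6 gives XOR. Changing the function means rewriting the stored bits, not the hardware, which is also what lets placement route around defective cells.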