Self precharging pipeline

Paper review:
High Speed Dynamic Asynchronous
Pipeline: Self Precharging Style
Name : Chi-Chuan Chuang
Date : 2013/03/20
Outline
• Introduction
• Dual-rail asynchronous pipelines
– PS0 pipeline
– Lookahead pipelines
• Self precharging pipeline
• Timing constraints
• Simulation results
• Conclusion
• Comments
Introduction
• The functional blocks of an asynchronous circuit
communicate through a handshake protocol
• Asynchronous pipeline’s advantages
– No global clock distribution problem
– No clock skew
– Lower power consumption
– Automatic adaptation to the operating environment
Introduction (cont.)
• Asynchronous pipelines come in two types
– Single-rail topology
– Dual-rail topology
• Single-rail topology
– Less area and wiring load
– But always incurs the worst-case delay plus additional
timing margins
• Dual-rail topology
– More robust, data-dependent completion detection
– Low throughput
Dual-rail asynchronous pipelines
• Williams’ PS0 pipeline
• Lookahead pipelines
– LP3/1 pipeline
– LP2/2 pipeline
– LP2/1 pipeline
• Enhanced lookahead pipelines
– Enhanced LP3/1 pipeline
– Enhanced LP2/1 pipeline
Dual-rail asynchronous pipelines (cont.)
PS0 pipeline
• Tcycle = 3Teval + 2Tcd + Tprech , Tfl = Teval
π“πœπ
precharge
π“πœπ
precharge
π“π©π«πžπœπ‘
π“πžπ―πšπ₯
π“πžπ―πšπ₯
π“πžπ―πšπ₯
LP3/1 pipeline
• Tcycle = 3Teval + Tcd + TNANDB
π“πœπ +𝐓𝐍𝐀𝐍𝐃𝐁
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
π“πžπ―πšπ₯
LP2/2 pipeline
• Tcycle = 2Teval + 2Tcd (wrong!)
πŸπ“πœπ
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
Asymmetric C element (aC)
LP2/1 pipeline
• Tcycle = 2Teval + Tcd + TNANDB (wrong!)
π“πœπ + 𝐓𝐍𝐀𝐍𝐃𝐁
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
Enhanced lookahead pipelines
• LP3/1 and LP2/1 pipelines suffer from a higher
wiring load and a larger number of inter-stage
control signals
• They are difficult to interface with the
environment
• Enhanced lookahead pipelines reduce the wiring
load but increase the cycle time
Enhanced LP3/1 pipeline
• Tcycle = 3Teval + Tcd + 2TNANDB
π“πœπ + πŸπ“ππ€ππƒπ
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
π“πžπ―πšπ₯
Enhanced LP2/1 pipeline
• Tcycle = 2Teval + Tcd + 2TNANDB
π“πœπ + πŸπ“ππ€ππƒπ
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
Comparison
• Reduced inter-stage control signals
• Communication with the environment
Self precharging pipeline
• Has all properties of a dual rail pipeline
• Each pipeline stage consists of
– A functional block with domino gates
– A completion detector
– A special asymmetric C-element
• Each completion detector is moved to just after
the previous functional block
– Inter-stage wiring load is reduced
Self precharging pipeline (cont.)
• The completion detector's done signal precharges
both the special aC and the functional block;
this is called self precharging
Special asymmetric C element
• This completion detector has less area,
delay, and power consumption
Special asymmetric C element (cont.)
• Has two inputs, from the CDs of the
current stage (N) and the next stage (N+1)
• Its functionality
– When N+1 is 1 => output is 1
– When N+1 is 0 and N is 1 => output is 0
– Hold the previous value otherwise
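The three rules above can be sketched as a small behavioral model. This is only an illustration of the truth table on this slide, not the circuit from the paper; the function name `special_ac` and the argument names are made up for the example.

```python
def special_ac(cd_n: int, cd_next: int, prev_out: int) -> int:
    """Behavioral model of the special aC (hypothetical helper).

    cd_n     : done signal from the CD of the current stage N
    cd_next  : done signal from the CD of the next stage N+1
    prev_out : output value held from the previous step
    """
    if cd_next == 1:
        return 1          # N+1 is 1 => output is 1
    if cd_n == 1:
        return 0          # N+1 is 0 and N is 1 => output is 0
    return prev_out       # both inputs 0 => hold the previous value


# Walk through an arbitrary input sequence, starting with output = 1.
out = 1
trace = []
for cd_n, cd_next in [(0, 0), (1, 0), (0, 0), (0, 1), (1, 1)]:
    out = special_ac(cd_n, cd_next, out)
    trace.append(out)
print(trace)
```

The input (N, N+1) = (0, 0) is the only state-holding case, which is what makes the element asymmetric: the N+1 input alone can set the output, while the N input can only clear it when N+1 is low.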
Self precharging pipeline (cont.)
• Tcycle = 2Teval + Tcd + TaC , Tfl = Teval
π“πœπ + π“πšπ‚
precharge
π“πžπ―πšπ₯
π“πžπ―πšπ₯
Timing constraints
• Assume all stages, completion detectors, and
aCs are similar
• Three timing constraints
– Input hold-time: Thold ≤ Tcd + TaC + Tprech
– Precharge signal width: Tprech ≤ Teval + 2TINV
– Doesn’t have the safe takeover timing constraint
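The two inequalities can be checked mechanically for a given set of delays. The sketch below takes them exactly as written on this slide; the function name `check_sp_constraints` and all numeric values are assumptions for illustration.

```python
def check_sp_constraints(t_hold, t_cd, t_ac, t_prech, t_eval, t_inv):
    """Return which of the two slide constraints hold (hypothetical helper)."""
    return {
        # Input hold-time: Thold <= Tcd + TaC + Tprech
        "input_hold": t_hold <= t_cd + t_ac + t_prech,
        # Precharge signal width: Tprech <= Teval + 2*TINV
        "precharge_width": t_prech <= t_eval + 2 * t_inv,
    }


# Hypothetical delays in picoseconds (assumed, not from the paper).
ok = check_sp_constraints(t_hold=150, t_cd=80, t_ac=50,
                          t_prech=90, t_eval=100, t_inv=15)
slow_precharge = check_sp_constraints(t_hold=150, t_cd=80, t_ac=50,
                                      t_prech=140, t_eval=100, t_inv=15)
print(ok)
print(slow_precharge)
```

With the second set of values, precharge takes longer than Teval + 2TINV, so only the precharge-width check fails.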
Simulation results
• Layout in UMC 90 nm process, 1.2 V supply,
300 K temperature, normal process corner
Simulation results (cont.)
• Power and area
– Enhanced LP3/1 has the highest power
consumption and area, followed by PS0
– LP2/2 has the lowest power consumption and area
– Enhanced LP2/1 and SP have almost the same area and
power consumption (SP is slightly higher)
Conclusion
• Self precharging protocol
• CDs are placed just after the previous stage
• aC removes the safe takeover timing constraint of
the LP family, which makes the pipeline simpler to design
• High throughput (2.227G data items/s)
• Area and power consumption are comparable
with LP2/1 pipeline
• Low latency, high robustness, low power,
avoidance of explicit latches, etc. compared with
synchronous counterparts
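As a quick sanity check on the throughput figure, the sketch below converts it to a cycle time. It assumes steady-state operation with one data item completed per cycle, so that throughput = 1 / Tcycle; that assumption is not stated on the slides.

```python
# Reported throughput from the conclusion slide.
throughput_items_per_s = 2.227e9

# Assumption: one data item per cycle in steady state.
cycle_s = 1.0 / throughput_items_per_s
cycle_ps = cycle_s * 1e12

print(f"Tcycle is roughly {cycle_ps:.1f} ps")
```

Under that assumption the reported throughput corresponds to a cycle time of about 449 ps.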
Comments
• Tcycle of LP2/1 and LP2/2 are wrong in [14]
• Should perhaps include the power consumption
and area results
• Describe the timing constraints in more detail
Thanks for your attention