Discussion States

OCIECE Winter 2002
High-Speed Low-power VLSI
New Single-Clock CMOS Latches and Flipflops with Improved
Speed and Power Savings [2]
&
A New True-Single-Phase-Clocked Double-Edge-Triggered
Flip-Flop for Low-Power VLSI Designs [7]
Discussion States
• Introduction.
• Motivation for TSPC and DET Flip-Flops.
• New techniques for high-speed TSPC and single-clocked Flip-Flops and latches.
• A new technique for TSPC Dual-edge-clocked Flip-Flop.
• A new approach for Power Consumption comparative analysis of Single Edge Triggered / Dual Edge Triggered Flip-Flops.
• Results and comparisons
• Conclusions
• Questions
Static vs. Dynamic Latches
• A Static Latch:
  • A cross-coupled inverter pair forms a bistable element. The bistable states store binary data for as long as the supply voltage is present [10].
  • One or more additional signals (the clock) switch the path between the input and output of the bistable element between transparent and non-transparent.
[Figure: latch symbol with D, Q and Clk terminals]
• A Dynamic Latch:
  • Charge is stored temporarily on the parasitic capacitances of the circuit and is refreshed periodically. The stored charge holds the binary data [10].
  • A clock signal switches the path between the input and output of the circuit between transparent, partially transparent and non-transparent.
Static, Semi-static and Dynamic Flip-Flops
• Static (non-precharged) Flip-Flop: a cascaded pair of static latches clocked in a complementary style.
• Semistatic Flip-Flop: a cascaded pair of one static and one dynamic latch clocked in a complementary style.
• Fully dynamic Flip-Flop: a cascaded pair of dynamic latches clocked in a complementary style.
• Differential Flip-Flop: a cascaded pair of static, dynamic or mixed differential latches clocked in a complementary style. It operates on differential inputs and outputs.
[Figure: flip-flop symbols; single-ended (D, Q, Clk) and differential (complementary D and Q pairs, Clk)]
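The complementary clocking described above can be illustrated with a small behavioural model. This is only a sketch in Python, not a circuit-level description: the class names `Latch` and `MasterSlaveFF` are hypothetical, and the model ignores delays, charge leakage and the static/dynamic distinction.

```python
class Latch:
    """Level-sensitive latch: transparent while clk == active_level, otherwise holds."""

    def __init__(self, active_level: int):
        self.active_level = active_level
        self.q = 0  # stored value

    def update(self, d: int, clk: int) -> int:
        if clk == self.active_level:
            self.q = d  # transparent phase: output follows input
        return self.q   # opaque phase: output keeps the stored value


class MasterSlaveFF:
    """Two latches clocked in a complementary style (positive-edge behaviour)."""

    def __init__(self):
        self.master = Latch(active_level=0)  # transparent while clk is low
        self.slave = Latch(active_level=1)   # transparent while clk is high

    def update(self, d: int, clk: int) -> int:
        m = self.master.update(d, clk)
        return self.slave.update(m, clk)


ff = MasterSlaveFF()
# (d, clk) pairs: data is captured only when clk goes from 0 to 1.
stimulus = [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]
outputs = [ff.update(d, clk) for d, clk in stimulus]
print(outputs)
```

The output changes only at the two steps where the clock rises; a change of D while the clock is high (third step) does not reach Q.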
Power Consumption in the Clocking System of a Synchronous Pipelined Circuit
[Figure: clock generator driven by Ckext and producing Ckint; global wiring (Cgw) and local wiring (Clw.1 ... Clw.n) feed the clock inputs (Cg.1 ... Cg.n) of the flip-flops; data passes through logic blocks CL1 ... CLn between flip-flops; Cj at the clock generator output]
• Total power consumption of the clocking system (Pck) can be expressed as:
$$P_{ck} = P_{ckgen} + P_{gw} + P_{lw} + P_{g} + P_{reg}$$
$$P_{ck} = \left(1 + \frac{1}{k} + \frac{1}{k^2} + \dots + \frac{1}{k^n}\right) V_s^2 \, f \, \left[C_j + C_{gw} + C_{lw} + C_g\right] + P_{reg} \qquad \text{(eqn 1)}$$
where
  Pgw : power consumption of the global wiring capacitances
  Plw : power consumption of the local wiring capacitances
  Pg : power consumption due to the total clocked gate capacitance
  Preg : power consumption in the flip-flops
  k : tapering factor of the clock generator (clock buffer)
  Cg : total gate capacitance of the clocked transistors of the clocking system
  Cj : junction capacitance of the source-drain regions of the output node of the clock generator
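The bracketed series in eqn 1 is the overhead of the tapered buffer chain. The sketch below evaluates it and the full expression; the function names are hypothetical, and the tapering factor k = 10 is an assumption chosen because it makes the series approach the factor 1.11 that appears in eqn 2.

```python
def taper_sum(k: float, n: int) -> float:
    """Return 1 + 1/k + 1/k**2 + ... + 1/k**n, the buffer-chain factor in eqn 1."""
    return sum(k ** -i for i in range(n + 1))


def clock_power_eqn1(k, n, vs, f, cj, cgw, clw, cg, preg):
    """Eqn 1: Pck = taper_sum * Vs^2 * f * (Cj + Cgw + Clw + Cg) + Preg."""
    return taper_sum(k, n) * vs ** 2 * f * (cj + cgw + clw + cg) + preg


# Assumed tapering factor k = 10: the factor settles near 1.11 after two stages.
for n in (0, 1, 2, 3):
    print(n, round(taper_sum(10.0, n), 4))
```

With k = 10 the buffers add roughly 11% on top of the power needed to drive the clock load itself.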
Power Consumption in the Clocking System of a Synchronous Pipelined Circuit (cont’d)
$$P_{ck} = 1.11 \, V_s^2 \, f \, \left[C_j + C_{gw} + C_{lw} + m \, C_{gf}\right] + m \, P_{gf} \qquad \text{(eqn 2)}$$
where
  m : total number of flip-flops in the clocking system
  Cgf : total clocked gate capacitance per flip-flop
  Pgf : power consumption of one flip-flop @ frequency f
Therefore, the power consumption of the clocking system can be minimized by:
1. Reducing Cgw and Clw, e.g. using True Single Phase Clocking (TSPC) techniques.
2. Reducing the number of clocked transistors per flip-flop (the Cgf term), e.g. using Single-Transistor-Clocked flip-flops (STC).
3. Utilizing both edges of the clock, e.g. using Dual Edge Triggered flip-flops.
4. Using minimum transistor sizes and careful layout design to reduce Pgf.
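The effect of the first two measures can be checked directly against eqn 2. The sketch below is illustrative only: the function name `pck_eqn2` is hypothetical and every number in `base` is an assumed value, not a figure from the slides.

```python
def pck_eqn2(vs, f, c_wire, m, cgf, pgf):
    """Eqn 2 with c_wire = Cj + Cgw + Clw lumped into one term."""
    return 1.11 * vs ** 2 * f * (c_wire + m * cgf) + m * pgf


# Assumed baseline: 5 V, 50 MHz, 16 pF of wiring and junction capacitance,
# 2616 flip-flops, 6.88 fF clocked gate capacitance and 30 uW per flip-flop.
base = dict(vs=5.0, f=50e6, c_wire=16e-12, m=2616, cgf=6.88e-15, pgf=30e-6)

p_base = pck_eqn2(**base)
p_less_wire = pck_eqn2(**{**base, "c_wire": 8e-12})        # measure 1: less wiring
p_fewer_clocked = pck_eqn2(**{**base, "cgf": 3.44e-15})    # measure 2: fewer clocked transistors

for name, p in [("baseline", p_base),
                ("half wiring", p_less_wire),
                ("half Cgf", p_fewer_clocked)]:
    print(f"{name:12s} {p * 1e3:7.2f} mW  ({p / p_base:.2f} of baseline)")
```

Because Pgf is multiplied by m, the flip-flop term can dominate the total, which is why measure 4 matters as much as the wiring and gate-capacitance terms.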
TSPC Basic stages
[Figure: the four basic TSPC stages, labelled PP, PN, SP and SN, each with clock input F]
• Basic (non-differential) TSPC edge-triggered flip-flops are:
  • Precharged version (dynamic): PP+SP+PN+SN
  • Non-precharged version (static): SP+SP+SN+SN
• Low-power bottlenecks:
  1. Clocked transistors, with a = 1.
  2. Precharged nodes, with a = 0.5.
• Speed bottlenecks: P-blocks that provide complementary inputs to the n-blocks (a delay of 2 stages + Inv.).
Precharged Single stage TSPC Full-latch
(PN+SN) + (PP+SP+INV)
Advantages:
1. Both high and low inputs are fully latched.
2. Isolated output node (infinite impedance).
3. The precharge state does not appear at the output at the start of the evaluation.
4. No need for a stable input.
• NMOS transistor sizing is important in order not to load the precharged node.
[Figure: three schematics with input In, outputs D and clock F: (PN+SN) + (PP+SP+INV) with the PN, SN, PP and SP stages marked; (PN + SN) + FL(P) + INV; (PN/SN) + FL(P) + INV]
Nonprecharged Single stage TSPC Full-latch
• N-latch to become PSN.
• Charge sharing compensation.
• Only 3 clocked transistors in the PSLT.
• Essential need for a stable input.
[Figure: three schematics with input In, outputs D and clock F: (SN + SN) + (SP + SP + INV); (PSN+SN) + FL(P) + INV; PSLT(N) + FL(P) + INV]
Non-Differential semistatic flip-flops
• Conflict-free full latch.
• Transistors in the dotted box should be kept to minimum size to minimize the load.
[Figure: two schematics with input In, outputs D and clock F: PN + SN + SFL(P) + INV, and a simplified version]
CVSL and Single-Transistor-Clocked TSPC dynamic and static differential latches
• Availability of complementary outputs.
• STC charge sharing problems:
  • Cannot be recovered in DSTC1.
  • Can be recovered in SSTC1.
[Figure: n-latch and p-latch schematics with complementary inputs In, complementary outputs Out and clock F: DSTC1(P), DSTC1(N), SSTC1(P), SSTC1(N)]
High speed differential Flip-Flops
• An opportunity to remove the speed bottleneck using the following principle:
  • “The master does not have to be a full-latch and it is enough to have only one isolated output state as long as it is identical to the nontransparent input state of the slave.” [2]
[Figure: Dynamic Flip-Flop ( (SP + SP) + DSTC1(N) ) and Semistatic Flip-Flop ( (SP + SP) + SSTC1(N) ), with inputs In, outputs D and clock F]
A high-speed dynamic differential Flip-Flop (cont’d)
• The P-block is too fast, so the n-transistors need to be minimized to give enough setup time.
• SP stages are merged to remove more P-transistors.
• A correspondence showed an input-glitch-related problem:
  • Capacitive coupling (x-D) and charge sharing (bj - D) lead to a poor logic-level zero output.
[Figure: positive edge-triggered dynamic flip-flop ( (SP + SP) + DSTC1(N) ), and DSTC2(P) followed by DSTC1(N) with nodes x and bj marked]
A high-speed dynamic differential Flip-Flop (cont’d)
Whenever a pipeline is ended with a DSTC2(p), a termination stage is needed
to fully latch the output [2].
[Figure: DSTC1(N) followed by DSTC2(P), with inputs In, outputs D and clock F]
The DSTC2(P) stage is only used for p-termination of a pipeline.
A high-speed fully static differential Flip-Flop
• DSTC2 p-latch to be converted to SSTC2(P):
  • A minimum-size inverter and two minimum-size n-transistors are added to keep the low output from floating to high.
[Figure: SSTC2(P) followed by SSTC1(N), with inputs In, outputs D and clock F]
Performance Comparison [2]
Conventional Dual Edge Triggered (DET) Flip-Flops
[Fig 1: conventional DET flip-flop schematic (input D, clock F, outputs Q)]
[Fig 2: conventional DET flip-flop schematic (input D, clock F, output Q)]
[Fig 3: conventional DET flip-flop schematic]

FEATURES OF DIFFERENTIAL EDGE-TRIGGERED FLIP-FLOPS
F/Fs    s/d   f     Nt      Nc   Pcg.n   Pw.n   Pg.n
SET     s     1     9/11    4    >1      >1     >1
Fig 1   d     1/2   18/20   8    >1      1      >1
Fig 2   d     1/2   20/22   6    >1      1      >1
Fig 3   d     1/2   12/14   4    1       1      1
Fig 4   d     1/2   16/18   4    1       1      1
A new TSPC Dual Edge Triggered (DET)
Flip-Flop
• The TSPC DET leads to less Pw.
• Fewer clocked transistors => less Cgf.
• Fewer speed-bottleneck devices.
• No PP/PN stages, which leads to less activity in the internal nodes.
[Fig 4: proposed TSPC DET flip-flop schematic (input D, clock F)]
Dual Edge Triggered (DET) Flip-Flop Speed calculations
[Figure: two cascaded DETFFs (Data, Clock, Q, load CL) with a timing diagram showing tsu1, tsu2, tcq1 and tcq2 around the two clock edges]
• Setup time: tsu = max (tsu1, tsu2)
• Clk-to-output delay: tcq = max (tcq1, tcq2)
• The speed figure of the DET FF: t = tsu + tcq
• Max. toggle frequency of the DETFF (@25MHz) = 1 / (2 * speed figure) = 500MHz

SPEED/POWER PERFORMANCE OF EACH EDGE-TRIGGERED FLIP-FLOP
F/Fs    tcq     tsu     t       Cgf      Pgf     Cff(fF)   Cff/Cgf
SET     0.36n   0.34n   0.70n   6.88f    32.46   25.97     3.77
Fig 1   0.43n   0.34n   0.77n   13.76f   26.02   41.63     3.03
Fig 2   0.70n   0.30n   1.00n   10.32f   29.92   47.86     4.64
Fig 3   0.53n   0.33n   0.86n   6.88f    125.4   200.6     29.16
Fig 4   0.66n   0.43n   1.09n   6.88f    23.38   37.41     5.44
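The speed figure and toggle-frequency formulas can be applied to the tcq and tsu columns of the table. The sketch below uses hypothetical helper names; the only inputs are the table values in nanoseconds.

```python
# (tcq, tsu) in nanoseconds, taken from the speed/power performance table.
timing_ns = {
    "SET":   (0.36, 0.34),
    "Fig 1": (0.43, 0.34),
    "Fig 2": (0.70, 0.30),
    "Fig 3": (0.53, 0.33),
    "Fig 4": (0.66, 0.43),
}


def speed_figure(tcq: float, tsu: float) -> float:
    """Speed figure t = tsu + tcq, in the same unit as the inputs."""
    return tsu + tcq


def max_toggle_mhz(t_ns: float) -> float:
    """Max toggle frequency 1 / (2 * t), returned in MHz for t given in ns."""
    return 1e3 / (2.0 * t_ns)


for name, (tcq, tsu) in timing_ns.items():
    t = speed_figure(tcq, tsu)
    print(f"{name:6s} t = {t:.2f} ns, max toggle = {max_toggle_mhz(t):.0f} MHz")
```

The sums reproduce the t column of the table. A speed figure of 1.00 ns corresponds to exactly 500 MHz under this formula; slower entries give proportionally lower values.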
Power consumption in the clocking system associated with SET/DET Flip-Flops
• Power consumption per flip-flop @ frequency f:
$$P_{gf} = C_{ff} \, V_{dd}^2 \, f \qquad \text{(eqn 3)}$$
where Cff = equivalent capacitance of one flip-flop.
$$P_{ck} = 1.11 \, V_s^2 \, f \, \left[C_j + C_{gw} + C_{lw} + m \, C_{gf}\right] + m \, C_{ff} \, V_{dd}^2 \, f$$
$$P_{ck} = V_{dd}^2 \, f \, \left(C_{clk} + C_{reg}\right), \quad \text{assuming } V_s = V_{dd}$$
$$P_{ck} = V_{dd}^2 \, f \, C_{tot}$$
with Cclk = 1.11 . [ Cj + Cgw + Clw + m . Cgf ], Creg = m . Cff and Ctot = Cclk + Creg.
• A pipelined FIR macro (No. of FFs = 2616) was implemented, and a comparative analysis showed a power reduction of 36% compared to the SETFF implementation.

A TYPICAL POWER CONSUMPTION EXAMPLE OF DIFFERENT CLOCKING SYSTEMS
F/Fs    f (MHz)   Cg   Cclk   Creg    Ctot     Pck (W)   Pck.n (norm.)   P.t (fJ)
SET     50        18   37.7   67.9    105.6p   0.132     1.00            35.3
Fig 1   25        36   57.7   108.9   166.6p   0.104     0.79            30.6
Fig 2   25        27   47.7   125.9   172.9p   0.108     0.82            41.3
Fig 3   25        18   37.7   524.9   562.6p   0.352     2.67            115.7
Fig 4   25        1    .      .       135.6p   0.085     0.64            35.4
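The Pck and normalised columns follow from Pck = Vdd^2 . f . Ctot. The sketch below recomputes them from the f and Ctot columns. The supply voltage Vdd = 5 V is an assumption: it is not stated on the slide, but it reproduces the tabulated Pck values. The names `rows` and `pck` are hypothetical.

```python
VDD = 5.0  # volts; assumed, chosen because it matches the Pck column

# (clock frequency in Hz, Ctot in farads) from the table.
rows = {
    "SET":   (50e6, 105.6e-12),
    "Fig 1": (25e6, 166.6e-12),
    "Fig 2": (25e6, 172.9e-12),
    "Fig 3": (25e6, 562.6e-12),
    "Fig 4": (25e6, 135.6e-12),
}


def pck(f_hz: float, ctot_f: float, vdd: float = VDD) -> float:
    """Clocking-system power Pck = Vdd^2 * f * Ctot, in watts."""
    return vdd ** 2 * f_hz * ctot_f


reference = pck(*rows["SET"])
normalised = {name: pck(f, c) / reference for name, (f, c) in rows.items()}

for name, (f, c) in rows.items():
    print(f"{name:6s} Pck = {pck(f, c):.3f} W, normalised = {normalised[name]:.2f}")

saving = 1.0 - normalised["Fig 4"]
print(f"Fig 4 saving relative to SET: {saving:.0%}")
```

Under this assumption the Fig 4 flip-flop comes out at about 0.64 of the SET power, which is the 36% reduction quoted for the FIR macro.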
A new approach for Power Consumption
analysis of SET/DET Flip-Flops [5]
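[Figure: SET and DET flip-flop schematics with data input D, clock Clk, internal nodes X, Y, Z and output Q]
In the absence of glitches:
$$P_D = \tfrac{1}{2} \, f_D \, V_{DD}^2 \sum_j C_j \, a_j$$
where a_j = transition probability at node j and C_j = the node capacitance.
Power consumption of the clock nodes:
$$P_{ck} = f_{ck} \, V_{DD}^2 \, C_k$$
For equal data rate:
$$P_{SET} = f \, V_{DD}^2 \, C_{k,SET} + \tfrac{1}{2} \, f \, V_{DD}^2 \sum_j a_{j,SET} \, C_{j,SET}$$
$$P_{DET} = \tfrac{1}{2} \, f \, V_{DD}^2 \, C_{k,DET} + \tfrac{1}{2} \, f \, V_{DD}^2 \sum_j a_{j,DET} \, C_{j,DET}$$
Writing the data-dependent sums in terms of the input transition probability a_D and an equivalent capacitance C_a:
$$P_{SET} = f \, V_{DD}^2 \, C_{k,SET} + a_D \, \tfrac{1}{2} \, f \, V_{DD}^2 \, C_{a,SET}$$
$$P_{DET} = \tfrac{1}{2} \, f \, V_{DD}^2 \, C_{k,DET} + a_D \, \tfrac{1}{2} \, f \, V_{DD}^2 \, C_{a,DET}$$
$$\frac{P_{DET}}{P_{SET}} = \frac{C_{k,DET} + a_D \, C_{a,DET}}{2 \, C_{k,SET} + a_D \, C_{a,SET}}$$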
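The ratio PDET/PSET shows how the DET advantage shrinks as the data activity a_D rises. The sketch below evaluates it; the function name is hypothetical and all capacitance values are assumed for illustration (arbitrary units), not taken from the slides.

```python
def det_over_set(ck_det, ca_det, ck_set, ca_set, a_d):
    """PDET / PSET = (Ck,DET + aD * Ca,DET) / (2 * Ck,SET + aD * Ca,SET)."""
    return (ck_det + a_d * ca_det) / (2.0 * ck_set + a_d * ca_set)


# Assumed capacitances: equal clock load, larger data-dependent load for DET.
CK_DET, CA_DET = 10.0, 30.0
CK_SET, CA_SET = 10.0, 20.0

for a_d in (0.0, 0.25, 0.5, 1.0):
    ratio = det_over_set(CK_DET, CA_DET, CK_SET, CA_SET, a_d)
    print(f"aD = {a_d:4.2f}  PDET/PSET = {ratio:.3f}")
```

With equal clock capacitance the ratio starts at one half for an idle data input, because the DET clock runs at half the frequency. It rises towards one as a_D grows whenever the DET flip-flop has the larger data-dependent capacitance.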
A new approach for Power Consumption
analysis of SET/DET Flip-Flops [5]
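Input Glitch Analysis:
[Figure: SET and DET flip-flop schematics with data input D, clock Clk, internal nodes X, Y, Z and output Q]
$$P_{glitch,SET} = f \, V_{DD}^2 \, C_{k,SET} + b \, f \, V_{DD}^2 \, C_{b,SET}$$
where b = average number of 0 -> 1 -> 0 transitions between two active clock edges, and, per one glitch transition,
$$C_{b,SET} = C_{D,SET} + \tfrac{1}{2} \, C_{X,SET}$$
$$P_{glitch,DET} = \tfrac{1}{2} \, f \, V_{DD}^2 \, C_{k,DET} + b \, f \, V_{DD}^2 \, C_{b,DET}$$
where, per one glitch transition,
$$C_{b,DET} = C_{D,DET} + C_{X,DET}$$
$$\frac{P_{glitch,DET}}{P_{glitch,SET}} = \frac{C_{k,DET} + 2 \, b \, C_{b,DET}}{2 \, C_{k,SET} + 2 \, b \, C_{b,SET}}$$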
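The glitch ratio can be explored the same way. In the sketch below the helper names are hypothetical and the capacitance values are assumed (arbitrary units); only the formulas come from the slide.

```python
def cb_set(cd, cx):
    """Glitch capacitance of the SET flip-flop: Cb,SET = CD,SET + CX,SET / 2."""
    return cd + 0.5 * cx


def cb_det(cd, cx):
    """Glitch capacitance of the DET flip-flop: Cb,DET = CD,DET + CX,DET."""
    return cd + cx


def glitch_ratio(ck_det, cbd, ck_set, cbs, b):
    """Pglitch,DET / Pglitch,SET = (Ck,DET + 2b*Cb,DET) / (2*Ck,SET + 2b*Cb,SET)."""
    return (ck_det + 2.0 * b * cbd) / (2.0 * ck_set + 2.0 * b * cbs)


# Assumed values: identical Ck, CD and CX for both flip-flops.
CK, CD, CX = 10.0, 5.0, 10.0
cbs, cbd = cb_set(CD, CX), cb_det(CD, CX)

for b in (0.0, 0.5, 1.0, 2.0):
    print(f"b = {b:3.1f}  ratio = {glitch_ratio(CK, cbd, CK, cbs, b):.3f}")
```

Even with identical node capacitances, the internal node X counts fully in the DET case and only half in the SET case, so the ratio climbs from one half towards and beyond one as the glitch rate b increases.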
Conclusions [2],[7]
• DET flip-flops are more sensitive to signal glitches than SET flip-flops.
• In applications with a low transition probability of the input signals and reduced glitching, the power savings from using DET instead of SET flip-flops can be significant (up to 36% with the newly proposed TSPC flip-flop).
• The proposed DETFF has a highly competitive power-delay product performance. [12]
• With the trend of increasing clock frequency, it will be increasingly difficult to control both edges of the clock in the clock distribution system.
• SSTC1, SSTC2, DSTC1 and DSTC2 based flip-flops showed delay improvements by factors of 2.2 and 2.4 in the semistatic and fully static styles, respectively.
• PDP is reduced by factors of 3.4 and 6.5 (for a = 0.25) in the semistatic and fully static styles, respectively.
Conclusions [2],[7] (cont’d)
• In the newly proposed high-speed flip-flops, the logic-related transistors are purely n-type in both n-latches and p-latches, which gives this approach its speed advantage in flip-flop design.
References
[1] J. Yuan and C. Svensson, “High-Speed CMOS Circuit Technique,” IEEE J. Solid-State Circuits, vol. 24, no. 1, pp. 62-70, Feb. 1989.
[2] J. Yuan and C. Svensson, “New Single-Clock CMOS Latches and Flipflops with Improved Speed and Power Savings,” IEEE J. Solid-State Circuits, vol. 32, no. 1, pp. 62-69, Jan. 1997.
[3] W. M. Chung and M. Sachdev, “A Comparative Analysis of Dual Edge Triggered Flip-Flops,”
[4] R. P. Llopis and M. Sachdev, “Low Power, Testable Dual Edge Triggered Flip-Flops,” International Symposium on Low Power Electronics and Design, 1996, pp. 341-5.
[5] A. G. M. Strollo, E. Napoli, and C. Cimino, “Analysis of Power Dissipation in Double Edge-Triggered Flip-Flops,” IEEE Trans. on VLSI Systems, vol. 8, no. 5, Oct. 2000.
[6] S. M. M. Mishra, S. S. Rofail and K. S. Yeo, “Design of High Performance Double Edge-Triggered Flip-Flops,” IEE Proc. Circuits Devices Syst., vol. 147, no. 5, Oct. 2000.
[7] J. S. Wang, “A New True-Single-Phase-Clocked Double-Edge-Triggered Flip-Flop for Low-Power VLSI Designs,” IEEE Int’l Symposium on Circuits and Systems, pp. 1896-1899, June 9-12, 1997.
[8] G. M. Blair, Comments on “New Single-Clock CMOS Latches and Flip-flops with Improved Speed and Power Savings,” IEEE J. Solid-State Circuits, vol. 32, no. 10, pp. 1610-1611, Oct. 1997.
[9] V. Stojanovic and V. G. Oklobdzija, “Comparative Analysis of Master-Slave Latches and Flip-Flops for High-Performance and Low-Power Systems,” IEEE J. Solid-State Circuits, vol. 34, no. 4, pp. 536-548, April 1999.
[10] M. Afghani and J. Yuan, “Double-edged-triggered D-Flip-Flops for high speed CMOS circuits,” IEEE J. Solid-State Circuits, vol. SC-26, no. 8, pp. 1168-1170, Aug. 1991.
[11] Jan M. Rabaey, “Digital Integrated Circuits: A Design Perspective”, 1996.
Questions?
The Bistability principle [10]
• The width of the trigger pulse needs to be larger than the total propagation delay around the circuit loop. [10]
[Figure: two voltage-transfer plots of the cross-coupled inverter pair, with axes Vi1 = Vo2 and Vi2 = Vo1]