Power Optimized Memory Organization Using Multi-Bit-Flip-Flop Approach and Enhanced Ring Counter

advertisement
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013
Power Optimized Memory Organization Using
Multi-Bit-Flip-Flop Approach and Enhanced Ring
Counter
1
2
1
kiran Kumar.G
M.Venkara Rao
M.Tech, VLSI&E.S Dept, Amrita Sai Institute of science & Tech, Paritala
2
Assist.Prof, Amrita Sai Institute of science & Tech, Paritala
ABSTRACT
significant portion of their circuits [1]–[3]. Such
Power reduction has become a vital design goal for
serial access memory is needed in temporary
sophisticated design applications, whether mobile
storage of signals that are being processed, e.g.,
or not. dropping power consumption in design
delay of one line of video signals, delay of signals
enables better, cheaper products to be designed and
within a fast Fourier transform (FFT) architectures
power-related chip failures to be minimized.
[4], and delay of signals in a delay correlator [2].
Researchers have shown that multi-bit flip-flop is
Currently, most circuits adopt static random access
an effective method for clock power consumption
memory (SRAM) plus some control/addressing
reduction. The underlying idea behind multi-bit
logic to implement delay buffers. For smaller-
flip-flop method is to eliminate total inverter
length delay buffers, shift register can be used
number by sharing the inverters in the flip-flops.
instead. The former approach is convenient since
In this paper, we will review multi-bit flip-flop
SRAM compilers are readily available and they are
concepts, and introduce the benefits of using multi-
optimized to generate memory modules with low
bit flip-flops in our design. Then, we will show
power consumption and high operation speed with
how to implement multi-bit flip-flop methodology
a compact cell size. The latter approach is also
by XILINX Design Compiler. Experimental results
convenient since shift register can be easily
indicate that multi-bit flip-flop is very effective and
synthesized, though it may consume much power
efficient method in lower-power designs.
due to unnecessary data movement. Besides, once
KEYWORDS:
more smaller flip-flops are replaced by larger
Multi-bit-flip-flop,
C-elements,
Ring counter, clock gating, Merging.
multi-bit flip-flops, device variations in the
INTRODUCTION
corresponding circuit can be effectively reduced.
Portable multimedia and communication devices
As CMOS technology progresses, the driving
have experienced explosive growth recently.
capability of an inverter-based clock buffer
Longer battery life is one of the crucial factors in
increases significantly. The driving capability of a
the widespread success of these products. As such,
clock buffer can be evaluated by the number of
low-power circuit design for multimedia and
minimum-sized inverters that it can drive on a
wireless communication applications has become
given rising or falling time. Fig. 1 shows the
very important. In many such products, delay
maximum number of minimum-sized inverters that
buffers (line buffers, delay lines) make up a
can be driven by a clock buffer in different
ISSN: 2231-5381
http://www.ijettjournal.org
Page 377
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013
processes. Because of this phenomenon, several
like 65nm and beyond, the minimum size of clock
flip-flops can share a common clock buffer to
drivers can drive more than one flip-flop. Merging
avoid unnecessary power waste. However, the
single-bit flip-flops into one multi-bit flip-flop can
locations of some flip-flops would be changed after
avoid duplicate inverters, and lower the total clock
this replacement, and thus the wire lengths of nets
dynamic power consumption. The total area
connecting pins to a flip-flop are also changed. To
contributing to flip-flops can be reduced as well.
avoid violating the timing constraints, we restrict
By using multi-bit flip-flop to implement ASIC
that the wire lengths of nets connecting pins to a
design, users can enjoy the following benefits:
flip-flop cannot be longer than specified values
l
after this process. Besides, to guarantee that a new
sequential banked components
flipflop can be placed within the desired region, we
l
also need to consider the area capacity of the
transistors and optimized transistor-level layout
region.
l
Lower power consumption by the clock in
Smaller area and delay, due to shared
Reduced clock skew in sequential gates
MultiBit Flip-Flop Concept
In this section, we will introduce multi-bit flip-flop
conception. Before that, we will review single-bit
flip-flop. Figure 2 shows an example of single-bit
flip-flop. A single-bit flip-flop has two latches
(Master latch and slave latch). The latches need
“Clk” and “Clk’ ” signal to perform operations,
such as Figure2 shows. In order to have better
delay from Clk-> Q, we will regenerate “Clk” from
“Clk’ ”. Hence we will have two inverters in the
clock path.
Figure 3: An example of merging two 1-bit flipflops into one 2-bit flip-flop.
Figure 4 shows an example of dual-bit flip-flop
cell. It has two data input pins, two data output
pins, one clock pin and reset pin. Use dual-bit flipflop can get the benefits of lower power
consumption then single-bit, and almost no other
additional costs to pay. Figure 5 shows the true
table of dual-bit flip-flop cell. We could find that
when CK is positive edge, the value of Q1 will
Figure 2: Single-Bit Flip-Flop
pass to D1, and the value of Q2 will pass to D2. Or
Figure 3 shows an example of merging two 1-bit
Q1 and Q2 will keep original value.
flip-flops into one 2-bit flip-flop. Each 1-bit flipflop contains two inverters, master-latch and slavelatch. Due to the manufacturing rules, inverters in
flip-flops tend to be oversized. As the process
technology advances into smaller geometry nodes
ISSN: 2231-5381
http://www.ijettjournal.org
Page 378
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013
source code. This is the best method if you know
the design’s layout and can determine where
multibit cells might have the most impact. For
example, if the data path and control logic are well
separated
or
if
you
have
done
early
floorplanning. The second methodology directs
multibit library cell inference from an already
Figure 4: A dual-bit flip-flop cell.
mapped design. This method is most useful after
you complete an initial floorplan or placement and
determine which areas can benefit from the use of
multibit cells. Design Compiler supports only
multi-bit
cell
library
that
have
identical
functionality for each bit. The multibit library cell
interfaces must be either fully parallel or fully
global. For example, if you want to infer a 4-bit
banked flip-flop with an asynchronous reset, the
Figure 5: The true table of dual-bit flip-flop cell.
rest signal must be either different for each bit or
3. MultiBit Flip-Flop Methodology
shared among all 4 bits. Design Compiler cannot
In the section, we will introduce that how to use
infer a multibit register if the first and second bits
Design Compiler and Low power multi-bit flip-
share one asynchronous reset but the third and
flop to implement ASIC design. Using Multi-Bit
fourth bits share another reset. In that case, Design
Flip-flop
Compiler does not infer a multi-bit flip-flop and it
is
an
effective
and
efficient
implementation methodology to reduce the power
will use 4 single-bit flip-flops instead
consumption by merging single-bit flip-flop
The Multi-Bit Flip-FlopMethodology Provided
The criteria of using multi-bit flip-flop
Multi-bit flip-flop cells are capable of decreasing
by Design Compiler
the power consumption because they have shared
Design Compiler can synthesize the following to
inverter inside the flip-flop. Meanwhile, they can
multi-bit library cells:
minimize clock skew at the same time. To obtain
these benefits, the ASIC design must meet the

Flip-flops
following requirements. The single-bit flip-flops

Latches
we want to replace with multi-bit flip-flop must

Master-slave circuits
have same clock condition and same set/reset

Multiplexers
condition.

Three-state circuits
hdlin_infer_multibit
When
you
as
set
the
default_all,
variable
Design
Compiler will use multi-bit flip-flop to replace bus
Design Compiler provides two methodologies for
type single-bit flip-flops. For non-bus condition,
mapping logic to multibit library cells. (You can
your must use create_multibit to identify the multi-
use either methodology or a combination of the
bit flip-flop candidates.
two.) The first directs cell inference from the HDL
ISSN: 2231-5381
http://www.ijettjournal.org
Page 379
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013
Memory organization between each node:
The C-element
is
an essential
element
in
asynchronous circuits for handshaking.
Dev
ice
1
Gate
d
drive
Mem
ory
Gated
driver
tree
Devi
ce 2
GATED DRIVER TREE:
To save area, the memory module of a delay buffer
is often in the form of an SRAM array with
Rin
g
In the proposed cou
memory organization, several
input/output data bus as in [6]. Special read/write
power reduction techniques are adopted. Mainly,
memory cells, only two words will be activated:
these circuit techniques are designed with a view to
one is written by the input data and the other is
decreasing the loading on high fan-out nets, e.g.,
read to the output. Driving the input signal all the
clock and read/write ports.
way to all memory cells seems to be a waste of
circuitry, such as a sense amplifier, is needed for
fast and low-power operations. However, of all the
power. The same can be said for the read circuitry
RING COUNTER:
of the output port. In light of the previous gatedclock tree technique, we shall apply the same idea
to the input driving/output sensing
This ring counter proposed to replace the R–S flipflop by a C-element and to use tree-structured
clock drivers with gating so as to greatly reduce the
loading on active clock drivers. Additionally, DET
flip-flops are used to reduce the clock rate to half
and thus also reduce the power consumption on the
clock signal. The proposed ring counter with
hierarchical clock gating and thecontrol logic is
circuitry in the memory module of the delay buffer.
The memory words are also grouped into blocks.
Each memory block associates with one DET flipflop block in the proposed ring counter and one
DET flip-flop output addresses a corresponding
memory word for read-out and at the same time
addresses the word that was read one-clock earlier
for write-in.
shown in above figure. Each block contains one Celement to control the delivery of the local clock
signal “CLK ”to the DET flip-flops, and only the
“CKE signals along the path passing the global
clock source to the local clock signal are active.
The “gate” signal (CKE) can also be derived from
the output of the DET flip-flops in the ring counter.
ISSN: 2231-5381
http://www.ijettjournal.org
Page 380
International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013
RESULTS:
Large Scale Integr. (VLSI) Syst., vol. 3, no. 1, pp.
49– 58, Mar. 1995.
[4] J. F. Tabor, “Noise reduction using low weight
and constant weight coding techniques,” M.Sc.
thesis, Artif. Intell. Lab., MIT, Cambridge, MA,
1990.
[5] Y. Benezeth, P. Jodoin, B. Emile, H. Laurent,
and C. Rosenberger, “Review and evaluation of
commonly-implemented background subtraction
algorithms,” in IEEE International Conference on
Pattern Recognition (ICPR), pp. 1–4, December
2008
[6] M. R. Stan and W. P. Burleson, “Coding a
terminated bus for low power,” in Proc. 5th
GLSVLSI, 1995, pp. 70–73.
CONCLUSION: Using Multi-Bit Flip-flop is an
[7] H. Zhang, V. George, and J. M. Rabaey, “Low-
effective
swing on-chip signaling techniques: Effectiveness
and
efficient
implementation
methodology to reduce the power consumption by
and
merging single-bit flip-flop. In this paper, we have
ScaleIntegr. (VLSI) Syst., vol. 8, no. 3, pp. 264–
implemented
272, Jun. 2000.
Compiler
design
and
with
Faraday’s
XILINX
IEEE
Trans.
Very
Large
flip-flop.
[8] D. Parks and S. Fels, “Evaluation of
Experimental results indicate that multi-bit flip-
background subtraction algorithms with post-
flop is very effective and efficient method in
processing,” in IEEE International Conference on
lower-power
this
Advanced Video and Signal Based Surveillance,
methodology to implement real ASIC project in the
(Santa Fe (New Mexico, USA)), pp. 192–199,
future.
September 2008.
designs.
We
multi-bit
Design
robustness,”
will
use
REFERENCES:
[1] W. Eberle et al., “80-Mb/s QPSK and 72-Mb/s
64-QAM flexible and scalable digital OFDM
transceiver ASICs for wireless local area networks
in the 5-GHz band,” IEEE J. Solid-State Circuits,
vol. 36, no. 11, pp. 1829–1838, Nov. 2001.
[2] M. L. Liou, P. H. Lin, C. J. Jan, S. C. Lin, and
T. D. Chiueh, “Design of an OFDM baseband
receiver with space diversity,” IEE Proc.Commun.,
vol. 153, no. 6, pp. 894–900, Dec. 2006.
[3] M. R. Stan and W. P. Burleson, “Bus-invert
coding for low-power I/O,” IEEE Trans. Very
ISSN: 2231-5381
http://www.ijettjournal.org
Page 381
Download