International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013 Power Optimized Memory Organization Using Multi-Bit-Flip-Flop Approach and Enhanced Ring Counter 1 2 1 kiran Kumar.G M.Venkara Rao M.Tech, VLSI&E.S Dept, Amrita Sai Institute of science & Tech, Paritala 2 Assist.Prof, Amrita Sai Institute of science & Tech, Paritala ABSTRACT significant portion of their circuits [1]–[3]. Such Power reduction has become a vital design goal for serial access memory is needed in temporary sophisticated design applications, whether mobile storage of signals that are being processed, e.g., or not. dropping power consumption in design delay of one line of video signals, delay of signals enables better, cheaper products to be designed and within a fast Fourier transform (FFT) architectures power-related chip failures to be minimized. [4], and delay of signals in a delay correlator [2]. Researchers have shown that multi-bit flip-flop is Currently, most circuits adopt static random access an effective method for clock power consumption memory (SRAM) plus some control/addressing reduction. The underlying idea behind multi-bit logic to implement delay buffers. For smaller- flip-flop method is to eliminate total inverter length delay buffers, shift register can be used number by sharing the inverters in the flip-flops. instead. The former approach is convenient since In this paper, we will review multi-bit flip-flop SRAM compilers are readily available and they are concepts, and introduce the benefits of using multi- optimized to generate memory modules with low bit flip-flops in our design. Then, we will show power consumption and high operation speed with how to implement multi-bit flip-flop methodology a compact cell size. The latter approach is also by XILINX Design Compiler. Experimental results convenient since shift register can be easily indicate that multi-bit flip-flop is very effective and synthesized, though it may consume much power efficient method in lower-power designs. due to unnecessary data movement. Besides, once KEYWORDS: more smaller flip-flops are replaced by larger Multi-bit-flip-flop, C-elements, Ring counter, clock gating, Merging. multi-bit flip-flops, device variations in the INTRODUCTION corresponding circuit can be effectively reduced. Portable multimedia and communication devices As CMOS technology progresses, the driving have experienced explosive growth recently. capability of an inverter-based clock buffer Longer battery life is one of the crucial factors in increases significantly. The driving capability of a the widespread success of these products. As such, clock buffer can be evaluated by the number of low-power circuit design for multimedia and minimum-sized inverters that it can drive on a wireless communication applications has become given rising or falling time. Fig. 1 shows the very important. In many such products, delay maximum number of minimum-sized inverters that buffers (line buffers, delay lines) make up a can be driven by a clock buffer in different ISSN: 2231-5381 http://www.ijettjournal.org Page 377 International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013 processes. Because of this phenomenon, several like 65nm and beyond, the minimum size of clock flip-flops can share a common clock buffer to drivers can drive more than one flip-flop. Merging avoid unnecessary power waste. However, the single-bit flip-flops into one multi-bit flip-flop can locations of some flip-flops would be changed after avoid duplicate inverters, and lower the total clock this replacement, and thus the wire lengths of nets dynamic power consumption. The total area connecting pins to a flip-flop are also changed. To contributing to flip-flops can be reduced as well. avoid violating the timing constraints, we restrict By using multi-bit flip-flop to implement ASIC that the wire lengths of nets connecting pins to a design, users can enjoy the following benefits: flip-flop cannot be longer than specified values l after this process. Besides, to guarantee that a new sequential banked components flipflop can be placed within the desired region, we l also need to consider the area capacity of the transistors and optimized transistor-level layout region. l Lower power consumption by the clock in Smaller area and delay, due to shared Reduced clock skew in sequential gates MultiBit Flip-Flop Concept In this section, we will introduce multi-bit flip-flop conception. Before that, we will review single-bit flip-flop. Figure 2 shows an example of single-bit flip-flop. A single-bit flip-flop has two latches (Master latch and slave latch). The latches need “Clk” and “Clk’ ” signal to perform operations, such as Figure2 shows. In order to have better delay from Clk-> Q, we will regenerate “Clk” from “Clk’ ”. Hence we will have two inverters in the clock path. Figure 3: An example of merging two 1-bit flipflops into one 2-bit flip-flop. Figure 4 shows an example of dual-bit flip-flop cell. It has two data input pins, two data output pins, one clock pin and reset pin. Use dual-bit flipflop can get the benefits of lower power consumption then single-bit, and almost no other additional costs to pay. Figure 5 shows the true table of dual-bit flip-flop cell. We could find that when CK is positive edge, the value of Q1 will Figure 2: Single-Bit Flip-Flop pass to D1, and the value of Q2 will pass to D2. Or Figure 3 shows an example of merging two 1-bit Q1 and Q2 will keep original value. flip-flops into one 2-bit flip-flop. Each 1-bit flipflop contains two inverters, master-latch and slavelatch. Due to the manufacturing rules, inverters in flip-flops tend to be oversized. As the process technology advances into smaller geometry nodes ISSN: 2231-5381 http://www.ijettjournal.org Page 378 International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013 source code. This is the best method if you know the design’s layout and can determine where multibit cells might have the most impact. For example, if the data path and control logic are well separated or if you have done early floorplanning. The second methodology directs multibit library cell inference from an already Figure 4: A dual-bit flip-flop cell. mapped design. This method is most useful after you complete an initial floorplan or placement and determine which areas can benefit from the use of multibit cells. Design Compiler supports only multi-bit cell library that have identical functionality for each bit. The multibit library cell interfaces must be either fully parallel or fully global. For example, if you want to infer a 4-bit banked flip-flop with an asynchronous reset, the Figure 5: The true table of dual-bit flip-flop cell. rest signal must be either different for each bit or 3. MultiBit Flip-Flop Methodology shared among all 4 bits. Design Compiler cannot In the section, we will introduce that how to use infer a multibit register if the first and second bits Design Compiler and Low power multi-bit flip- share one asynchronous reset but the third and flop to implement ASIC design. Using Multi-Bit fourth bits share another reset. In that case, Design Flip-flop Compiler does not infer a multi-bit flip-flop and it is an effective and efficient implementation methodology to reduce the power will use 4 single-bit flip-flops instead consumption by merging single-bit flip-flop The Multi-Bit Flip-FlopMethodology Provided The criteria of using multi-bit flip-flop Multi-bit flip-flop cells are capable of decreasing by Design Compiler the power consumption because they have shared Design Compiler can synthesize the following to inverter inside the flip-flop. Meanwhile, they can multi-bit library cells: minimize clock skew at the same time. To obtain these benefits, the ASIC design must meet the Flip-flops following requirements. The single-bit flip-flops Latches we want to replace with multi-bit flip-flop must Master-slave circuits have same clock condition and same set/reset Multiplexers condition. Three-state circuits hdlin_infer_multibit When you as set the default_all, variable Design Compiler will use multi-bit flip-flop to replace bus Design Compiler provides two methodologies for type single-bit flip-flops. For non-bus condition, mapping logic to multibit library cells. (You can your must use create_multibit to identify the multi- use either methodology or a combination of the bit flip-flop candidates. two.) The first directs cell inference from the HDL ISSN: 2231-5381 http://www.ijettjournal.org Page 379 International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013 Memory organization between each node: The C-element is an essential element in asynchronous circuits for handshaking. Dev ice 1 Gate d drive Mem ory Gated driver tree Devi ce 2 GATED DRIVER TREE: To save area, the memory module of a delay buffer is often in the form of an SRAM array with Rin g In the proposed cou memory organization, several input/output data bus as in [6]. Special read/write power reduction techniques are adopted. Mainly, memory cells, only two words will be activated: these circuit techniques are designed with a view to one is written by the input data and the other is decreasing the loading on high fan-out nets, e.g., read to the output. Driving the input signal all the clock and read/write ports. way to all memory cells seems to be a waste of circuitry, such as a sense amplifier, is needed for fast and low-power operations. However, of all the power. The same can be said for the read circuitry RING COUNTER: of the output port. In light of the previous gatedclock tree technique, we shall apply the same idea to the input driving/output sensing This ring counter proposed to replace the R–S flipflop by a C-element and to use tree-structured clock drivers with gating so as to greatly reduce the loading on active clock drivers. Additionally, DET flip-flops are used to reduce the clock rate to half and thus also reduce the power consumption on the clock signal. The proposed ring counter with hierarchical clock gating and thecontrol logic is circuitry in the memory module of the delay buffer. The memory words are also grouped into blocks. Each memory block associates with one DET flipflop block in the proposed ring counter and one DET flip-flop output addresses a corresponding memory word for read-out and at the same time addresses the word that was read one-clock earlier for write-in. shown in above figure. Each block contains one Celement to control the delivery of the local clock signal “CLK ”to the DET flip-flops, and only the “CKE signals along the path passing the global clock source to the local clock signal are active. The “gate” signal (CKE) can also be derived from the output of the DET flip-flops in the ring counter. ISSN: 2231-5381 http://www.ijettjournal.org Page 380 International Journal of Engineering Trends and Technology (IJETT) – Volume 5 Number 7- Nov 2013 RESULTS: Large Scale Integr. (VLSI) Syst., vol. 3, no. 1, pp. 49– 58, Mar. 1995. [4] J. F. Tabor, “Noise reduction using low weight and constant weight coding techniques,” M.Sc. thesis, Artif. Intell. Lab., MIT, Cambridge, MA, 1990. [5] Y. Benezeth, P. Jodoin, B. Emile, H. Laurent, and C. Rosenberger, “Review and evaluation of commonly-implemented background subtraction algorithms,” in IEEE International Conference on Pattern Recognition (ICPR), pp. 1–4, December 2008 [6] M. R. Stan and W. P. Burleson, “Coding a terminated bus for low power,” in Proc. 5th GLSVLSI, 1995, pp. 70–73. CONCLUSION: Using Multi-Bit Flip-flop is an [7] H. Zhang, V. George, and J. M. Rabaey, “Low- effective swing on-chip signaling techniques: Effectiveness and efficient implementation methodology to reduce the power consumption by and merging single-bit flip-flop. In this paper, we have ScaleIntegr. (VLSI) Syst., vol. 8, no. 3, pp. 264– implemented 272, Jun. 2000. Compiler design and with Faraday’s XILINX IEEE Trans. Very Large flip-flop. [8] D. Parks and S. Fels, “Evaluation of Experimental results indicate that multi-bit flip- background subtraction algorithms with post- flop is very effective and efficient method in processing,” in IEEE International Conference on lower-power this Advanced Video and Signal Based Surveillance, methodology to implement real ASIC project in the (Santa Fe (New Mexico, USA)), pp. 192–199, future. September 2008. designs. We multi-bit Design robustness,” will use REFERENCES: [1] W. Eberle et al., “80-Mb/s QPSK and 72-Mb/s 64-QAM flexible and scalable digital OFDM transceiver ASICs for wireless local area networks in the 5-GHz band,” IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1829–1838, Nov. 2001. [2] M. L. Liou, P. H. Lin, C. J. Jan, S. C. Lin, and T. D. Chiueh, “Design of an OFDM baseband receiver with space diversity,” IEE Proc.Commun., vol. 153, no. 6, pp. 894–900, Dec. 2006. [3] M. R. Stan and W. P. Burleson, “Bus-invert coding for low-power I/O,” IEEE Trans. Very ISSN: 2231-5381 http://www.ijettjournal.org Page 381