Design and Simulation of an 8-bit Memory-less Pipelined Full Adder Ayswarya Kulasekaran, Advisor :Dr.Xingguo Xiong Department of Electrical Engineering, University of Bridgeport, Bridgeport, CT 06604 Abstract Pipelining can effectively cut short the critical path of a VLSI circuit and improve its throughput. As a result, it has been widely used in modern VLSI design, especially for microprocessors and other high-performance computing applications. However, intermediate buffer stages are needed between two neighboring pipeline stages to temporarily hold the data when it is being moved from one stage to another. These pipeline registers increase the hardware implementation cost and degrade the system performance due to their internal delays. In order to overcome this hardware overhead, a novel memory-less pipeline architecture has been reported. Such memory-less pipeline architecture ingeniously replaces the intermediate buffer stages with a special clocking phase called the memory phase in addition to the pre-charge and evaluation phase that exist in the clocking scheme. By introducing a memory clock phase, it eliminates the need for intermediate buffer stages. This greatly reduces the hardware overhead in pipeline structure and improves the performance of the circuit. In this poster, we applied this memory-less pipeline architecture to the implementation of an 8bit NORA full adder. The 8-bit memory-less pipelined NORA full adder is designed and simulated in PSPICE. In order for comparison, a traditional pipelined NORA full adder without memory-less architecture is also designed. Simulation results verify the correct function of the memory-less NORA full adder. Compared to the traditional NORA full adder design, memory-less architecture leads to less hardware implementation cost and better performance. Memory-less Circuit Design Pipeline architecture divides the signal flow into separate stages, hence reducing the critical path length of a VLSI circuit, and leading to effective improvements in its throughput. As a result, it has been widely used in modern VLSI design, especially for microprocessors and other high-performance computing applications. Pipelining can also be used in NORA (NO-Race logic) CMOS dynamic logic based on C2MOS latches. An example NORA CMOS dynamic pipelined circuit is shown in Figure 1. The first stage is a Φ’ block and the second stage is a Φ block, with a C2MOS latch pipeline register in between. When CLK’=1, the first block is in pre-charge phase and the second block is in evaluation phase. While CLK’=0, the first block is in evaluation phase and the second block is in pre-charge phase. A pipeline Figure 1: NORA CMOS pipeline circuit register is inserted in between to ensure that signal can pass only one stage in each clock cycle. The data from the output of the previous stage will be temporarily latched in the pipeline register, and applied to next stage in next clock cycle. That is, the pipeline register (C2MOS latch) acts as an intermediate buffer between the pipelined stages of the circuit. However, these pipeline registers increase the hardware implementation cost and degrade the system performance due to their internal delays. In [1], a novel memory-less architecture is introduced to eliminate the need of pipeline registers, but maintain the same pipeline data path by introducing speciallydesigned clock signals. This greatly reduces the hardware overhead, and leads to improved circuit speed. Memory-less pipelined architecture (shown in Figure 3) effectively eliminates the pipeline registers. This is accomplished by applying special clocking scheme different from the one applied to the original circuit in Figure 1. The new clocking scheme which is used to eradicate the pipeline registers is depicted in Figure 4. Compared to the clock signal applied to original NORA pipeline structure (Figure 2), the new clocking scheme has an additional phase named the Memory phase apart from the pre-charging and evaluation phases respectively [1]. This memory phase plays the role of an intermediate buffer stage. That is, it holds the computed output of one pipelined stage and passes it on to the other stage down the pipeline at the specified clock interval. Figure 6. PSPICE schematic design of original 8-bit NORA CMOS full adder circuit 2. Memory-less Pipeline Architecture Circuit Design The memory-less pipelined architecture has some significant modification over the original circuit. It has been designed with the constraints of reducing the transistor count and avoiding the use of buffer registers. Further, the even blocks should be designed differently from the odd blocks in the carry propagation stages because they take inverted inputs. As a result, Demorgan’s rule is used to derive the design for even blocks. An example PSPICE schematic design of even block for carry-out propagation Figure 8: Demorgan circuit design stage is shown in Figure 8. The circuit in the even block is designed by applying Demorgan’s rule to the circuit in the odd block. This is done because the output produced by the carry unit in the odd block will not be inverted. As a result we need to design a circuit that could possibly accept inverted inputs directly. The data flow in the memory-less pipeline structure is controlled by four clock signals. Each clock signal consists of three phases: Precharge, Evaluate and Memory phase respectively. Each clock signal is delayed by 1/3 of its cycle time. During the pre-charge phase the PMOS transistor is active, during the evaluation phase the NMOS transistor is active. Finally during the memory phase both PMOS and NMOS are inactive leading to no switching activity in the circuit. The purpose of the memory phase is to hold the value computed during the evaluation phase of previous stage and simultaneously pass it on to the next stage as input and to the sum circuitry within the block before any change in the clock signal could occur. The PSPICE schematic design and power simulation of the 8-bit memory-less full adder is shown in Figure 9 and 10 separately. Based on simulation, the average power of memory-less pipelined adder circuit for the same given input pattern sequence (t=0~2400ns) is found to be: Pmemory-less(T)= 109.54 µW. Figure 9: PSPICE schematic design of 8-bit memory-less pipelined full adder Figure 2: The clock signal applied to original circuit Figure 3: Memory-less pipelined architecture [1] Figure 4: Clock signals applied to memory-less circuit [1] Results and Discussion Figure 7. PSPICE power simulation curve of original 8-bit NORA CMOS full adder circuit Fig 10: PSPICE power simulation of 8-bit memory-less pipeline full adder A comparison of the power consumption between original NORA CMOS pipelined full adder and memory-less pipelined full adder is shown in Table 1. From the result, it shows that the Memory-less pipeline architecture leads to a significant power saving (33%) compared to the original circuit for the given input pattern sequence. Furthermore, memory-less pipeline architecture leads to 20.8% transistor count saving compared to NORA CMOS pipeline full adder design. Table 1. Power/Energy comparison of original and memory-less pipeline architectures Comparison Item Original NORA CMOS adder Memory-less full adder 1. Original Circuit A non-pipelined dynamic NP-CMOS full adder design is shown in Figure 5 (only 2 bit slices are shown here). The alternating even and odd carry stages are realized using NMOS and PMOS networks respectively. The original pipelined NORA CMOS full adder is constructed by adding pipeline registers (C2MOS latches) between neighboring stages. VDD VDD VDD VDD f f PSPICE schematic design of the S1 f f original 8-bit pipelined NORA CMOS full Ci1 B B1 A1 1 adder circuit is shown in Figure 6. We A1 A1 B1 Ci1 selected 24 random input patterns (each A1 B1 f C f pattern lasts for 100ns) for power i2 f f simulation. To ensure fair comparison, VDD the same input pattern sequence will VDD VDD f also be used for the memory-less f f B0 design. The PSPICE power simulation C i1 A0 A 0 B0 Ci0 A0 curve of the original pipelined NORA A0 B0 B0 Ci0 CMOS full adder circuit for the given S0 f f f input sequence is shown in Figure 7. As Ci0 shown in the figure, the average power Carry Path of original 8-bit pipelined NORA CMOS Figure 5: Dynamic NP-CMOS full adder full adder circuit for given input pattern design (only 2 bits are shown) sequence (t=0~2400ns) is found to be: PNORA(T)= 253.21 µW Average Power (t=0~2400ns) Energy Consumption (t=0~2400ns) Transistor Count Saving (memoryless compared to original design) 253.21 µW 109.54 µW 56.7% 6.077x10-10 J 2.629x10-10 J 56.7% 192 152 20.8% Conclusions and Future Work In this poster, the design and simulation of a memory-less 8-bit pipelined full adder circuit is proposed. Memory-less pipeline circuits utilize specially-designed clock scheme to eliminate the need of pipeline registers. As a result, this leads to effective power savings and reduced hardware cost. Simulation results demonstrate that compared to the original NORA pipelined design with pipeline registers, memory-less pipeline full adder shows significant power savings of 56.7% for the given input pattern sequence. It also provides a transistor count savings of 21%. In the future, we will further look into the ways to reduce the hardware overhead in memory-less pipeline architecture, so that even more power saving can be achieved. Reference [1] Themistoklis Haniotakis, Zaher Wwda,Yiorgos Tsiatouhas, “Memory-less pipeline dynamic circuit design technique", Proceedings of 2010 IEEE Annual symposium on VLSI.