Content Addressable Memory Using XNOR CAM Cell

International Journal of Applied Engineering Research ISSN 0973-4562 Volume 11, Number 1 (2016) pp 28-32
© Research India Publications. http://www.ripublication.com
S. Vijayalakshmi (1), B. Elango (2), V. Nagarajan (3)
Department of Electronics and Communication Engineering,
Adhiparasakthi Engineering College, Melmaruvathur, Tamilnadu, India.
E-mail: sankarviji7@gmail.com (1), beim.18@gmail.com (2), nagarajanece31@rediffmail.com (3)
Abstract
Content Addressable Memory (CAM) is a special type of computer memory, also called associative memory, associative storage, or associative array, that is frequently used in very high speed searching applications such as databases, associative computing, lookup tables, and networking. A CAM is a functional memory that holds a large amount of stored data: the input search data is compared with the stored data, and the address of the matching data is returned. Compared with other hardware and software search systems, CAM is faster because it has single-clock-cycle throughput. In this paper, we propose an 8-bit XNOR CAM cell, in place of the XOR CAM cell, to reduce power consumption. The proposed architecture is based on a sparse clustered network using BCAM. During a search, this design avoids unnecessary parallel comparisons.
Keywords: Content Addressable Memory, Associative Memory, Sparse Clustered Networks.
Introduction
A content addressable memory is a memory that is accessed by its contents instead of by an address. To find a match, a search data word is compared in parallel against all of the stored data [1]. When a search data word is given as the input, the matching data word, if one exists, appears at the output within a single clock cycle. CAM is used for fast look-up operations in many applications, such as translation look-aside buffers (TLBs) [2-3], network routers [4-5], database accelerators, image processing, parametric curve extraction [6], Hough transformation [7], Huffman coding/decoding [8], virus detection [9], Lempel–Ziv compression [10], and image coding [11].
Reducing power consumption is now one of the major challenges in high-capacity CAMs. Power consumption is proportional to the memory size, so applications that need a larger CAM also consume more power. CAM power consumption has two main components: the search-line power consumption and the match-line power consumption. Previous efforts to reduce CAM power have focused on the match line, either by reducing the voltage swing directly on the match lines or by reducing it indirectly using current-based techniques [12]. In recent research, reducing power consumption without sacrificing area or speed is an important theme in large-capacity CAMs. There are two types of CAM, binary CAM (BCAM) and ternary CAM (TCAM): a BCAM stores 0s and 1s, whereas a TCAM stores 0s, 1s, and don't-care states. In this paper, we use BCAM to reduce complexity. Recently, associative memories based on sparse clustered networks (SCNs) have been introduced [13-15]. The proposed architecture contains an SCN-based CAM array.
The rest of this paper is organized as follows. Section II covers the CAM basics. Section III describes the conventional design using the XOR 9T CAM cell. Section IV describes the proposed design using the XNOR 4T CAM cell. Section V discusses the SCN-based XNOR CAM array. Section VI presents the simulation results. Throughout this paper, the simulations are carried out in Cadence Virtuoso using GPDK 180-nm CMOS technology. Finally, Section VII gives the conclusions and future work.
Basics of CAM
In a conventional CAM array, a static random access memory (SRAM) block simply stores the data words for reference. Each entry has an associated code: when the input data word matches a stored code, the code points to the location of the corresponding data word in the SRAM block. So, whenever data must be searched for in the SRAM block, it is sufficient to search for its corresponding code. The code can be shorter than the stored SRAM data, so it requires fewer bit comparisons.
For example, consider a classic CAM array of eight entries whose bit lengths vary (the first entry may consist of two bits, the second of 10 bits, and the eighth of five bits). The memory holds the search data and applies the input bits to the differential search lines (SLs), so the search data can easily be compared with all CAM entries. Every word in the CAM is attached to a common match line (ML) shared among its constituent bits. The match line indicates whether or not the input bits match the stored word. Because the match lines are generally highly capacitive, a sense amplifier is used on each match line to improve search performance.
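The search behaviour described above can be illustrated with a small behavioral model. This is only a software sketch, not the circuit: the function name `cam_search`, the four 4-bit example words, and the choice to return the lowest matching address (as a priority encoder would) are assumptions made for illustration.

```python
from typing import List, Optional


def cam_search(entries: List[str], search_word: str) -> Optional[int]:
    """Return the address of the first stored word equal to search_word."""
    # One match line per stored word; in hardware they all evaluate in parallel.
    match_lines = [stored == search_word for stored in entries]
    # Assumed priority encoding: report the lowest address whose line is high.
    for address, matched in enumerate(match_lines):
        if matched:
            return address
    return None  # no match line stayed high


stored_words = ["0101", "1100", "0011", "1111"]
print(cam_search(stored_words, "0011"))  # 2
print(cam_search(stored_words, "1000"))  # None
```

A software loop visits the entries one after another, whereas the CAM evaluates every match line in the same cycle; the model only reproduces the input-to-address behaviour.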
The classic BCAM cell integrates a 6T SRAM cell with comparator circuitry. Either an XNOR (NAND-type) structure or an XOR (NOR-type) structure is used for the comparator circuitry [1]. In this paper, we use the XNOR CAM structure because it consumes less power than the XOR CAM structure.
Conventional CAM 9T XOR Cell Design
Fig. 3.1 shows the CAM 9T XOR circuit design. The NOR 9T cell compares the stored bit with the search data on the corresponding search line using five comparison transistors. These transistors are all typically minimum-size to maintain high cell density, and they implement the pull-down path of a dynamic XNOR logic gate whose inputs are the search line and the stored bit. Each pair of transistors forms a pull-down path from the match line. A mismatch between the search line and the stored bit activates at least one of the pull-down paths, connecting the match line to ground.
Figure 3.1: CAM 9T XOR schematic design in Cadence tool
A match between the search line and the stored bit disables both pull-down paths, disconnecting the match line from ground. To form a CAM word, many cells are connected in parallel by connecting the match line of each cell to the match line of its adjacent cell. The NOR nature of the cell then becomes clear: the pull-down paths are connected in parallel, resembling the pull-down path of a CMOS NOR logic gate. There is a match condition on a match line only if every cell in the word matches the stored data.
To form a CAM NOR match line, all the CAM NOR cells are connected in parallel. The classical NOR search cycle, shown in Fig. 3.2, is divided into three phases: search-line pre-charge, match-line pre-charge, and match-line evaluation. First, the search lines are pre-charged low, which disables the pull-down paths in each CAM cell and separates the match line from ground. Then, with the pull-down paths disconnected, the pre-charge transistors pre-charge the match lines high. Finally, the search lines are driven to the search word values, which triggers the match-line evaluation phase. In the case of a match, the match-line voltage stays high because there is no discharge path to ground; in the case of a mismatch, there is at least one path to ground and the match line discharges. The match-line sense amplifier senses the voltage on the match line and generates the corresponding full-rail match output [15].
Figure 3.2: CAM XOR array schematic design in Cadence tool
Proposed CAM 4T XNOR Cell Design
The NAND 4T cell compares the stored bit with the search data on the corresponding search line using four comparison transistors. These transistors are all typically minimum-size to maintain high cell density. A match condition for the entire word occurs only if every cell in the word is in the match condition.
The proposed 4T CAM schematic design is shown in Fig. 4.1. It consists of four NMOS transistors: NM2 and NM3 are connected in series and act as the storage elements that hold the data, while the two additional transistors, NM0 and NM1, are connected in parallel and are used to write the data.
When NM0 and NM1 are ON, the data is passed to NM2 and NM3, where it can be read through the nodes. The match operation is performed on the match line, which is connected to the output of the XOR-type comparison formed by NM2 and NM3. If logic 1 appears at the output of the match line, which is set by the pre-charge transistor, the input data and the stored data match.
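The match-line behaviour of the two cell types can be summarised in a short behavioral model. This is a logic-level sketch only, with no transistor sizing or timing: the function names and the 8-bit example word are hypothetical, and the NAND-type word is modelled simply as a series chain that conducts when every cell matches.

```python
from typing import List


def xnor(stored_bit: int, search_bit: int) -> int:
    """Return 1 when the two bits are equal, 0 otherwise."""
    return 1 - (stored_bit ^ search_bit)


def nor_match_line(stored: List[int], search: List[int]) -> int:
    """NOR-type word: parallel pull-down paths on a pre-charged match line."""
    match_line = 1  # match-line pre-charge phase: the line starts high
    for stored_bit, search_bit in zip(stored, search):
        if stored_bit ^ search_bit:  # a mismatch activates a pull-down path
            match_line = 0           # the match line discharges to ground
    return match_line


def nand_match_line(stored: List[int], search: List[int]) -> int:
    """NAND-type word: the series chain conducts only if every cell matches."""
    chain_conducts = 1
    for stored_bit, search_bit in zip(stored, search):
        chain_conducts &= xnor(stored_bit, search_bit)
    return chain_conducts


stored_word = [1, 0, 1, 1, 0, 0, 1, 0]  # an 8-bit stored word
one_bit_off = [1, 0, 1, 1, 0, 0, 1, 1]  # differs in the last bit only

print(nor_match_line(stored_word, stored_word))   # 1: match
print(nor_match_line(stored_word, one_bit_off))   # 0: mismatch
print(nand_match_line(stored_word, stored_word))  # 1: match
print(nand_match_line(stored_word, one_bit_off))  # 0: mismatch
```

Both models report the same word-level result; the circuits differ in how the match line reaches that result, which is where the power difference between the two structures comes from.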
If logic 0 appears at the output of the match line, there is no match between the input data and the stored data, and the match line is discharged through the transistor placed between the match line and ground.
Figure 4.1: Proposed CAM 4T XNOR schematic design in Cadence tool
To form a CAM NOR match line, all the CAM NOR cells are connected in parallel. The classical NOR search cycle is shown in Fig. 4.2. The operation of the CAM XNOR array is similar to that of the CAM XOR array, except that the CAM XNOR 4T cell is used instead of the CAM XOR 9T cell.
Figure 4.2: Proposed CAM XNOR array schematic design in Cadence tool
The performance of the proposed design is compared with existing designs using post-layout simulations. The technology used is the Cadence Virtuoso GPDK 180-nm CMOS process. The transistors of the pulse generator logic are designed for a pulse width of 120 ps, since the pulse width is crucial to the accuracy of data capture and to the power consumption.
SCN Based Array
Associative memories store data and return it when presented with partially known inputs. Sparse clustered networks (SCNs) are recently introduced binary-weighted associative memories that drastically improve data storage and data retrieval capability.
In network training, the binary-valued connections of the SCN-based classifier capture the relation between the input code and the matching outputs. During the training process, the connection values are set and stored in a memory unit so that they can later be used to return the address of the target data.
In network update, when an update request is given in SCN-CAM, retraining the entire SCN-based classifier with all the entries is not required [1]. Fig. 5.1 shows the proposed SCN-based CAM array schematic design. It has 9 inputs and four enable outputs. This technique is more effective than a fully parallel comparison process. Combining the BCAM and SCN techniques reduces complexity. Because of the SRAM cell, the clock rate and the power are lower. CAM provides high speed in searching applications. SCNs are also used for set implementation and data mining.
Figure 5.1: Proposed SCN-based CAM array schematic design in Cadence tool
Simulation Results
The simulation results of the above designs are shown in Figs. 6(a) to 6(c). The simulation window displays the outputs, and the power consumption figures are obtained by power analysis. The output observed in the simulation depends on the input sequences applied at the inputs. The outputs of the three designs, the CAM XOR cell, the CAM XNOR cell, and the SCN-based CAM cell, are obtained using the Cadence tool in 180 nm. Fig. 6(a) shows the output of the CAM XOR cell, Fig. 6(b) shows the output of the CAM XNOR cell, and Fig. 6(c) shows the output of the SCN-based CAM cell.
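The idea behind the SCN-based array, training binary connections and then using them to enable only the relevant part of the CAM, can be sketched in software. This is a simplified illustration in the spirit of the classifier of [1], not the schematic itself: the split of the 9 inputs into three 3-bit clusters, the function names, and the example entries are all assumptions.

```python
from typing import List, Set, Tuple

NUM_CLUSTERS = 3      # assumed: the 9 input bits form 3 clusters
BITS_PER_CLUSTER = 3  # assumed: 3 bits per cluster
NUM_BLOCKS = 4        # four enable outputs, one per CAM sub-block

Connection = Tuple[int, int, int]  # (cluster, neuron, block)


def cluster_indices(tag: str) -> List[int]:
    """Split a 9-bit tag into clusters; each cluster selects one neuron."""
    return [int(tag[i * BITS_PER_CLUSTER:(i + 1) * BITS_PER_CLUSTER], 2)
            for i in range(NUM_CLUSTERS)]


def train(entries: List[Tuple[str, int]]) -> Set[Connection]:
    """Set a binary connection from each active neuron to the entry's block."""
    connections: Set[Connection] = set()
    for tag, block in entries:
        for cluster, neuron in enumerate(cluster_indices(tag)):
            connections.add((cluster, neuron, block))
    return connections


def enables(connections: Set[Connection], tag: str) -> List[int]:
    """Enable a block only if every cluster has a connection to it."""
    active = cluster_indices(tag)
    return [int(all((c, n, b) in connections for c, n in enumerate(active)))
            for b in range(NUM_BLOCKS)]


entries = [("101001110", 0), ("000111000", 1),
           ("111000101", 2), ("010101010", 3)]
connections = train(entries)
print(enables(connections, "000111000"))  # [0, 1, 0, 0]
print(enables(connections, "000000000"))  # [0, 0, 0, 0]
```

Only the enabled sub-blocks would then be compared against the search word. With more stored entries, such a classifier can enable a block that holds no match, so the final comparison inside the enabled blocks is still needed.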
The inputs are generated through buffers to simulate signal rise and fall time delays. An inverter is included as a data input buffer to facilitate direct output driving from the input source.
The input of the CAM XNOR array and of the SCN-based CAM is loaded with a bit, which is also placed at the output. The operating condition used in the simulations is 500 MHz/1.0 V.
Figure 6(a): Simulation output of the CAM XOR cell using Cadence tool
The existing XOR CAM array design and the proposed XNOR CAM array design are compared in the following table, which lists the significant parameters: number of transistors, number of nodes, and average power. The proposed design is better than the existing design in all of these aspects.
Table: Comparison of average power consumption
Parameters            Existing CAM Array    Proposed CAM Array
Cell Type             XOR                   XNOR
CAM Type              NOR                   NAND
Supply Voltage        1.8 V                 1.8 V
Technology            180 nm                180 nm
No. of Transistors    732                   566
No. of Nodes          781                   581
Avg. Power (μW)       3.51                  2.01
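The relative savings implied by the table can be worked out directly. The short calculation below uses only the table values; the helper name `reduction_percent` and the dictionary keys are chosen here for illustration.

```python
existing = {"transistors": 732, "nodes": 781, "avg_power_uW": 3.51}
proposed = {"transistors": 566, "nodes": 581, "avg_power_uW": 2.01}


def reduction_percent(old: float, new: float) -> float:
    """Relative reduction of the proposed value with respect to the existing one."""
    return 100.0 * (old - new) / old


for name in existing:
    saving = reduction_percent(existing[name], proposed[name])
    print(f"{name}: {saving:.1f}% lower")
# transistors: 22.7% lower
# nodes: 25.6% lower
# avg_power_uW: 42.7% lower
```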
Figure 6(b): Power consumed by the CAM XNOR cell using Cadence tool in 180 nm
With the 120 ps pulse width, the pulse generators function properly in all process corners. As for the latch structures, every CAM cell is individually optimized for power.
Conclusion
The content addressable memory (CAM) is designed and implemented as the existing CAM XOR array and the SCN-based CAM XOR array. The proposed CAM XNOR array and SCN-based CAM XNOR array reduce the average power consumption and the transistor count, and the proposed design consumes much less power than the other two designs.
The reduced transistor count also reduces the delay, which in turn contributes to lower overall power consumption. The circuit therefore performs well compared with other CAM designs. The performance of the proposed design has been evaluated using Cadence in GPDK 180-nm CMOS process technology. In future work, the function will be designed and implemented using a CAM matrix, in order to analyze and further reduce the average power consumption of the CAM design and to achieve a high-performance CAM design.
Figure 6(c): Simulation output of the SCN-based CAM cell using Cadence tool in 180 nm
References
[1] H. Jarollahi, N. Onizawa, and W. J. Gross, 2015, “Algorithm and architecture for a low-power content-addressable memory based on sparse clustered networks,” IEEE Trans. VLSI Syst., 23(4).
[2] A. Agarwal et al., 2011, “A 128×128 b high-speed wide-and match-line content addressable memory in 32 nm CMOS,” in Proc. ESSCIRC, pp. 83–86.
[3] Y.-J. Chang and M.-F. Lan, 2007, “Two new techniques integrated for energy efficient TLB design,” IEEE Trans. VLSI Syst., 15(1), pp. 13–23.
[4] H. Chao, 2002, “Next generation routers,” Proc. IEEE, 90(9), pp. 1518–1558.
[5] N.-F. Huang, W.-E. Chen, J.-Y. Luo, and J.-M. Chen, 2001, “Design of multifield IPv6 packet classifiers using ternary CAMs,” in Proc. IEEE Global Telecommun. Conf., vol. 3, pp. 1877–1881.
[6] M. Meribout, T. Ogura, and M. Nakanishi, 2000, “On using the CAM concept for parametric curve extraction,” IEEE Trans. Image Process., 9(12), pp. 2126–2130.
[7] M. Nakanishi and T. Ogura, 1996, “A real-time CAM-based Hough transform algorithm and its performance evaluation,” in Proc. 13th International Conference on Pattern Recognition, vol. 2, pp. 516–521.
[8] L.-Y. Liu, J.-F. Wang, R.-J. Wang, and J.-Y. Lee, 1994, “CAM-based VLSI architectures for dynamic Huffman coding,” IEEE Trans. Consum. Electron., 40(3), pp. 282–289.
[9] C.-C. Wang, C.-J. Cheng, T.-F. Chen, and J.-S. Wang, 2009, “An adaptively dividable dual-port BiTCAM for virus-detection processors in mobile devices,” IEEE J. of Solid-State Circuits, 44(5), pp. 1571–1581.
[10] B. Wei, R. Tarver, J.-S. Kim, and K. Ng, 1993, “A single chip Lempel–Ziv data compressor,” in Proc. IEEE ISCAS, pp. 1953–1955.
[11] S. Panchanathan and M. Goldberg, 1991, “A content-addressable memory architecture for image coding using vector quantization,” IEEE Trans. Signal Process., 39(9), pp. 2066–2078.
[12] K. Pagiamtzis and A. Sheikholeslami, 2004, “A low-power content addressable memory (CAM) using pipelined hierarchical search scheme,” IEEE J. of Solid-State Circuits, 39(9), pp. 1512–1519.
[13] V. Gripon and C. Berrou, 2012, “Nearly-optimal associative memories based on distributed constant weight codes,” in Proc. ITA Workshop, pp. 269–273.
[14] H. Jarollahi, N. Onizawa, V. Gripon, and W. J. Gross, 2012, “Architecture and implementation of an associative memory using sparse clustered networks,” in Proc. IEEE ISCAS, Seoul, South Korea, pp. 2901–2904.
[15] K. Pagiamtzis and A. Sheikholeslami, 2006, “Content-addressable memory (CAM) circuits and architectures: A tutorial and survey,” IEEE J. of Solid-State Circuits, 41(3), pp. 712–727.