Using APEX 20KE CAM for Fast Search Applications
Technical Brief 56
August 1999, ver. 1
M-TB-056-01

Altera Corporation
101 Innovation Drive
San Jose, CA 95134
(408) 544-7000
http://www.altera.com
https://websupport.altera.com

APEX™ 20KE embedded system blocks (ESBs) support content-addressable memory (CAM), a parallel processing memory that accelerates applications requiring fast searches through databases, lists, or patterns. CAM is memory technology developed from RAM. Most other memory devices store and retrieve data by addressing specific memory locations. However, in CAM, the system supplies the data and receives the data’s address, as shown in Figure 1.

Figure 1. APEX Integrated CAM vs. RAM

  Both memories hold the same contents:

    Address   Data
    0         A7
    1         B3
    2         2C
    3         4F

  CAM: When using CAM, the system supplies the data, and CAM returns the address. Data 2C in; address 2 out, with the match flag high.
  RAM: When using RAM, the system supplies the address, and RAM returns the data. Address 2 in; data 2C out.

APEX 20KE CAM offers an ideal solution for high-performance applications such as data compression, network switches, Internet protocol filters, and peripheral component interconnect (PCI) functions. In addition to performance advantages, integrated CAM in APEX 20KE devices provides flexible CAM sizes. Figure 2 illustrates typical applications using various CAM block sizes.

Figure 2.
APEX CAM Offers Flexible Block Sizes

  [Chart of CAM width versus CAM depth comparing APEX 20KE CAM with discrete CAM (fixed sizes). Block sizes shown: 1,024 × 128; 8,192 × 80; 256 × 64; 512 × 64; 2,048 × 64; 4,096 × 64; 16,384 × 64; 4,096 × 32.]

  Small-Sized CAM Applications
  - Switch Address Mapping
  - Packet Header Identification
  - Pattern Recognition
  - Internet Protocol Filter
  - Cache Tag

  Medium-Sized CAM Applications
  - MAC Address Look-Up in Layer 2 Bridges & Switches
  - Address Translation or Protocol Conversion in Gateways
  - Layer 3 Longest Match Address Look-up for IPv4, DECNET & Appletalk
  - VPI/VCI Translation in ATM Switches

  Large-Sized CAM Applications
  - ARP Cache in Network Server
  - Layer 2/4 Flow Recognition for Fair Bandwidth Sharing
  - Layer 3 Address Caching for IPv6 Switch

CAM vs. RAM in Memory Search Applications

Memory applications, which often involve searching, have previously been implemented in programmable logic devices (PLDs) using RAM. Searching for an item in RAM can take many clock cycles. The latency of the search depends on the depth of the RAM block; a 64-word × 32-bit RAM block requires up to 64 clock cycles to find the data.

Identifying an item stored in memory by its data content rather than its address can be more efficient. CAM works this way, making it ideal for high-speed search applications. CAM simultaneously compares the data requested against a list of entries, providing an order of magnitude reduction in search time over RAM. Other memory algorithms, such as binary- or tree-based searches, or look-aside tag buffers, perform a multi-cycle search through the memory space and are much slower than CAM.

To better understand the performance advantages of CAM, compare the total time required to search for an item using both RAM and CAM implementations. Locating an item in a 32-word × 32-bit RAM block running at 125 MHz requires up to 256 ns (32 clock cycles × 8-ns clock cycle). In contrast, the total time required to find an item in a similar-sized CAM block is 4 ns (1 cycle × 4-ns clock cycle).
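As a rough illustration of the comparison above, the following Python sketch models a RAM search as a one-address-per-cycle scan and a CAM lookup as a single parallel compare, then reproduces the 256-ns versus 4-ns arithmetic. It is a software analogy, not a description of the ESB hardware. The function names (ram_search, cam_lookup, worst_case_search_ns) are hypothetical, the four-entry contents are borrowed from Figure 1, and returning the lowest matching address when several entries match is an assumption made for this sketch.

```python
# Illustrative software model only. All names here are hypothetical and are
# not part of any Altera tool, megafunction, or API.

ENTRIES = [0xA7, 0xB3, 0x2C, 0x4F]  # example contents taken from Figure 1


def ram_search(ram, value):
    """Scan a RAM one address per clock cycle.

    Returns (address, cycles_used); address is None if the value is absent.
    """
    for address, word in enumerate(ram):
        if word == value:
            return address, address + 1
    return None, len(ram)


def cam_lookup(entries, value):
    """Model a CAM lookup: every entry is compared at once, so one cycle.

    Returns (address, match_flag, cycles_used). Picking the lowest matching
    address on multiple matches is an assumption of this sketch.
    """
    matches = [address for address, word in enumerate(entries) if word == value]
    match_flag = bool(matches)
    return (matches[0] if match_flag else None), match_flag, 1


def worst_case_search_ns(cycles, clock_period_ns):
    """Worst-case search time = number of cycles x clock period."""
    return cycles * clock_period_ns


# Figures from the text: 32-word RAM at 125 MHz (8-ns cycle) vs. 4-ns CAM cycle.
ram_ns = worst_case_search_ns(32, 8.0)
cam_ns = worst_case_search_ns(1, 4.0)
reduction = (ram_ns - cam_ns) / ram_ns

if __name__ == "__main__":
    print(ram_search(ENTRIES, 0x2C))
    print(cam_lookup(ENTRIES, 0x2C))
    print(f"RAM {ram_ns} ns, CAM {cam_ns} ns, reduction {reduction:.1%}")
```

In this model the RAM scan needs three cycles to find 2C at address 2, while the CAM lookup always reports one cycle regardless of depth, which is the property the timing comparison relies on.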
The CAM search therefore takes up to 98% less time than the RAM search (4 ns versus 256 ns) and has a latency of one clock cycle, compared to a maximum of 32 cycles for RAM.

APEX CAM Integration Enhances Performance

Traditionally, search applications use discrete CAM, where CAM is implemented as a separate device. A designer must add an individual CAM device to the printed circuit board (PCB), which increases design time and reduces the amount of usable PCB space. Discrete CAM also reduces system performance because it introduces additional on- and off-chip delays.

APEX CAM provides higher performance than discrete CAM by eliminating on- and off-chip and PCB delays. Also, APEX CAM, which is manufactured on a 0.18-µm process, has a much faster access time (4 ns) than typical discrete CAM, which is generally manufactured on older processes, resulting in a slower access time (20 ns). Figure 3 illustrates the system performance advantages of APEX CAM over discrete CAM.

Figure 3. APEX Integrated CAM Provides Superior Performance

  Discrete CAM System Performance (PLD with discrete CAM)
    tCO  =  3.7 ns
    tD   =  1.0 ns
    tACC = 20.0 ns
    tD   =  1.0 ns
    tSU  =  2.5 ns
    Delay = 3.7 ns + 1.0 ns + 20.0 ns + 1.0 ns + 2.5 ns = 28.2 ns

  APEX CAM System Performance (APEX 20KE-1 device)
    tCO  = 0.2 ns
    tLAD = 4.0 ns
    tSU  = 0.7 ns
    Delay = 0.2 ns + 4.0 ns + 0.7 ns = 4.9 ns

In applications that require a CAM block larger than what is available in an APEX 20KE device, you can use the embedded APEX CAM to cache a larger, discrete CAM block. This implementation of APEX 20KE CAM accelerates large CAM applications, as shown in Figure 4. If the APEX CAM that interfaces with the system finds a match in one clock cycle, the system immediately proceeds. Otherwise, the system waits two or three clock cycles for the external CAM to find a match, and then proceeds.

Figure 4.
APEX CAM as Cache for Large, Discrete CAM

  [Block diagram showing a 100-MHz system, an APEX 20KE device containing an ESB CAM (tLAD = 4.0 ns), and a large external CAM (tACC = 20.0 ns), linked by Data and Address signals.]

APEX CAM Provides Flexibility

APEX CAM offers variable CAM sizes. One APEX ESB can be configured as a 32-word × 32-bit CAM block, and multiple ESBs can be cascaded together to implement wider and deeper CAM blocks. You can create any CAM depth or width as long as there are additional unused ESBs. For example, if all ESBs in an EP20K1500E device are used for CAM, you can create a 228-Kbit CAM block with varying widths or depths (e.g., 7,296 words × 32 bits or 3,648 words × 64 bits). Table 1 lists the APEX ESB resources available.

Table 1. APEX Device Densities & ESB Resources

  Device       Maximum System Gates   Logic Elements   ESBs   Maximum CAM Bits
  EP20K60E                  162,000            2,560     16             16,384
  EP20K100E                 263,000            4,160     26             26,624
  EP20K160E                 404,000            6,400     40             40,960
  EP20K200E                 526,000            8,320     52             53,248
  EP20K300E                 728,000           11,520     72             73,728
  EP20K400E               1,052,000           16,640    104            106,496
  EP20K600E               1,537,000           24,320    152            155,648
  EP20K1000E              1,772,000           38,400    160            163,840
  EP20K1500E              2,524,000           54,720    228            233,472

Conclusion

The APEX family is the first to offer integrated CAM in a PLD. This advanced feature provides system performance benefits to designers by simplifying functions that require searches through lists or data tables. With the additional benefits of enhanced system performance, effective resource utilization, and inherent configuration flexibility, APEX devices offer a spectrum of CAM block sizes, providing superior integration benefits over competing devices.

Copyright 1999 Altera Corporation.
Altera, APEX, APEX 20K, APEX 20KE, EP20K60E, EP20K100E, EP20K160E, EP20K200E, EP20K300E, EP20K400E, EP20K600E, EP20K1000E, and EP20K1500E are trademarks and/or service marks of Altera Corporation in the United States and other countries. Other brands or products are trademarks of their respective holders. The specifications contained herein are subject to change without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. All rights reserved.