TUTORIAL SERIES

More circuits will continue to be placed on smaller chips, resulting in modular building blocks with better performance, more functionality, higher yields, and lower costs.

LSI/VLSI Building Blocks

James R. Tobias
Honeywell, Inc.
California Institute of Technology

The digital electronics industry is engaged in a revolution, making it possible to construct sophisticated circuits that perform very complex functions in very small packages. Out of this revolution has evolved the current trend of placing more and more circuits on a chip. As a result, modular building blocks for electronic systems are becoming more functional, requiring the system designer to plan system functions as functional blocks rather than as discrete devices or other elementary blocks. This continuing trend, exemplified by the introduction of large-scale integrated, or LSI, circuits that implement complete computers, will be further accentuated by very large-scale integration, or VLSI, development.

While not rigidly formulated, the definition of a VLSI circuit is usually given in terms of the number of logic gates that can be included on one chip. Thus, a chip containing 10,000 devices is considered an LSI chip, while one containing 100,000 devices is considered a VLSI chip. LSI circuits are being sold and used today, and VLSI circuits will appear in the near future.

Today, two approaches are emerging regarding the use of functional building blocks in constructing LSI and VLSI systems. The most visible approach uses microprocessors and their support chips. These general-purpose circuits, made functional through programming, are today's most popular LSI building blocks. Many applications, however, can make more effective use, functionally and economically, of special-purpose chips. It is now possible and will become increasingly popular to design custom or special-purpose ICs by using design automation techniques. Consequently, this article includes a discussion of the building blocks that can be used in custom design.

Capsule history of the IC industry. A brief look at past developments can put the design of LSI circuits into perspective as well as help us understand what the future will bring.1,2 The first development was the invention of the transistor in late 1947 at Bell Telephone Laboratories. William Shockley, John Bardeen, and Walter Brattain built the first transistor and measured its power gain. Further work at Bell Labs and other research laboratories defined the basic operating principles of solid-state devices. The knowledge of how transistors worked and the development of enhanced methods for growing pure germanium and silicon crystals established the solid-state industry.

The next significant event leading up to the invention of the integrated circuit occurred at Fairchild Semiconductor, where the first planar transistor was invented and manufactured. With improvements in the planar process, it was merely a logical extension to put more than one component on a chip to form a circuit. Then came the technological battle of the 1960's in the field of integrated circuits. The early ICs were bipolar, like the transistors of the 1950's, and depended upon transistor action taking place in the body of the semiconductor. Nevertheless, the first LSI circuits were constructed from field effect transistors, or FETs, which are dependent upon the effects of surface phenomena. The major difficulty in building FETs was the prevention of surface contamination.
Two small companies, General Microelectronics and General Instruments, tackled this problem. By 1970, the MOS integrated circuit was a viable product. It exhibited lower power consumption than bipolar circuits and required fewer processing steps to manufacture. In 1969, Intel introduced the first LSI semiconductor memory, and they have since paced the field with high-density memories and microprocessors. Although science led to the invention of the transistor and improvements in process technology led to the invention of the integrated circuit, it was the product definitions of the semiconductor memory, the pocket calculator, and the microprocessor3 that actually caused the LSI circuit to become a reality. No doubt, the ability to put even more circuits on a chip will occur. The need for VLSI-based products, however, has yet to be fully described. Future product definitions will make VLSI circuits as commonplace as LSI circuits are today.

Various technologies and properties

Today's ICs are fabricated on silicon wafers. The process steps (applying photoresist, exposing the resist, developing, etching, ion implanting, diffusion, annealing, etc.) build up the circuits either as collections of junction transistors (bipolar) or field effect devices (MOS). The continuing trend in fabrication is toward reducing the size of device features, thereby making it possible to put a larger circuit on the same die area. This trend from LSI to VLSI is being paced by NMOS technology. The development of very small (two-micron minimum features) geometries for NMOS circuits has made it possible for several manufacturers to introduce 64K-bit memories and 16-bit microcomputers.

Bipolar and MOS integrated circuits exist today and will persist into the future. Bipolar circuits offer the advantages of high speed for digital circuits and high single-stage gain, high bandwidth, low offset, and good linearity for analog circuits. MOS circuits have the advantages of low power and good noise immunity for digital circuits, and high input impedance and low power for analog circuits.4

Bipolar. The primary contemporary use of LSI circuits is in digital systems. Bipolar technology is used principally in high-speed digital circuits. The major circuit types are TTL, ECL, and I²L. A brief description of each of these circuit types is given below. Values for gate speeds and power are based on similar fabrication technologies and device feature sizes of about five microns. All of these circuits use bipolar transistors, of which a typical cross section is shown in Figure 1. The two methods of electrically isolating the transistors are called diode isolation and oxide isolation, as shown in the figure. Because oxide isolation can significantly reduce the unused area between transistors, this method can contribute to increasing circuit density as required by VLSI. The penalty is in processing difficulty, that is, an increase in the number of processing steps.

TTL. Transistor-transistor logic, or TTL, is one of the most popular logic families, providing a balance between high speed and low power. Many TTL functional building blocks are available in both small-scale integration, or SSI, and medium-scale integration, or MSI, implementations. These parts continue to be significant because they provide the means for interfacing LSI chips with almost all other technologies. Gates, flip-flops, registers, decoders, counters, adders, multipliers, and many other TTL units are available.
TTL LSI circuits are primarily memories and bit-slice microprocessors. The basic gate from which more complex digital circuits are constructed is shown in Figure 2. As shown in the figure, multiple inputs are provided by a multiple-emitter transistor structure. Even though the transistors operate in saturation, TTL is a high-speed technology. Standard medium-speed TTL can operate with 12-ns gate delays and at 20-MHz clock rates. Another form, low-power TTL, uses higher resistance values and no diode in the output circuit. The power dissipation per gate is reduced to 1 mW, compared to 12 mW typical for standard TTL, and speed is reduced to a 33-ns gate delay. To achieve higher speed, a Schottky barrier diode can be connected between the base and collector terminals of the output transistors. This keeps the transistor out of saturation, thus increasing its speed. Typical gate delay is about 3 ns, and power dissipation is 20 mW per gate. If higher resistances are used in the gate, reduced power consumption is achieved. This form of logic is called low-power Schottky TTL. The elimination of transistor saturation allows switching speeds to approach those of emitter-coupled logic.

Figure 1. (a) Diode isolated bipolar transistor, and (b) oxide isolated bipolar transistor.
Figure 2. Standard TTL NAND gate.

ECL. Emitter-coupled logic, or ECL, is a nonsaturating logic family. Since the voltage swings are small, this is also the fastest type of logic. A basic ECL gate is illustrated in Figure 3. To avoid saturation, the emitter current is controlled by connecting the emitters of the input transistors to a current source and the emitter of a control transistor with a fixed base reference voltage. Current flows through the control transistor or through one or more input transistors, depending on input voltages. When the input logic levels are such that one or more of the input transistors is on, the control transistor is off. If all the inputs are lower than the reference voltage, the control transistor is on, conducting all the current. Logic states of the output change when current is switched. This is the basis for the alternate name for this technology: current-mode logic, or CML. This type of gate has switching speeds of about one nanosecond and power dissipations of 50 mW per gate. ECL gates are used where high speed is essential in such applications as mainframe computers, high-speed memories, and digital communications.

Figure 3. Basic ECL gate.

I²L. Integrated injection logic, or I²L, was invented to meet the need for circuits with a higher density than was possible using conventional bipolar technology. The bipolar IC consists of individual transistors, resistors, and (rarely) capacitors interconnected through surface metallization and electrically separated from one another by isolation islands. Such isolation islands, however, waste area. Integrated injection logic, or merged transistor logic, is a bipolar transistor IC technology that does not require isolation islands for every transistor. Many of the interconnections between circuit elements are accomplished laterally, parallel to the wafer surface, through the P and N regions of the chip. I²L circuit density can exceed that of MOSFET technology, making it eminently suitable for LSI and VLSI.

A typical I²L NOR gate is illustrated in Figure 4. Note the multiple collector terminals (points 2 and 3). The gate consists of an NPN transistor and a PNP current source (injector). A logical 0 level is the saturation value of the collector-emitter voltage drop (about 0.05 volts) of the NPN unit. A logical 1 level is the base-emitter saturation drop, about 0.75 volts. Thus, I²L has a typical logic voltage swing of 0.7 volts. The switching speeds for a typical gate are around 10 ns, but the power dissipation is many orders of magnitude less than that of TTL, nominally 60-70 nW per gate.

Figure 4. Basic I²L NOR gate: (a) the I²L NOR gate circuit and (b) a cross section of the gate.

Schottky I²L. Several techniques have been developed for improving the performance of I²L logic circuits.5-8 One technique uses ion implantation, a controllable method of doping the silicon wafer to increase transistor gain, making the gate switch at a lower base current, thus improving switching speed. Another way to increase speed is to use some form of isolation to reduce the lateral migration of electrons. A third technique is to use oxide side walls for isolation; penalties associated with this technique are increased processing costs and larger surface area per gate. Another method of improving switching speed is to use Schottky diodes in the collectors of the output transistor, as shown in Figure 5. The Schottky diodes do not keep the transistor out of saturation as in Schottky TTL circuits, but they do reduce the logic swing to 0.3 volts. Another effect of using Schottky diodes is the elimination of multiple collectors, thereby saving chip area. Collector isolation is provided by the rectifying diodes, although combining multiple collectors with Schottky diodes has the advantage of reduced leakage current between diodes. Because of the reduced voltage swing, switching delays of Schottky I²L logic can be made significantly shorter than 10 ns, with a power dissipation less than that of the standard I²L configuration.

Figure 5. Basic Schottky I²L NOR gate: (a) represents the gate circuit, and (b) is a cross section of the gate.

MOSFET. Field effect transistors have had the biggest impact on the drive to VLSI circuits. MOS IC technology improvements led the field in circuit density and started the movement toward high-density circuits. Cross sections of p-channel, n-channel, and complementary MOS transistors are shown in Figure 6.

Figure 6. Cross sections of three MOSFET transistors: (a) PMOS, (b) NMOS, and (c) CMOS.

PMOS. The advantage of the p-channel MOS, or PMOS, transistor is its simplicity of fabrication, which requires only four mask steps. Because of the higher mobility of electrons, however, n-channel MOS, or NMOS, transistors can produce faster digital circuits. With improvements in IC processing such as ion implantation, silicon self-aligned gates, and depletion-mode load devices, NMOS has become the most widely used technology in the LSI industry.

NMOS. N-channel technology, although more difficult to fabricate than PMOS, can outperform PMOS logic because electron mobility is 2.4 times higher than that of holes. NMOS outperforms PMOS in both speed and power. The basic circuit element is the inverter, shown in Figure 7. The input transistor is an enhancement-mode device with a one-volt threshold, and the pull-up transistor is a depletion-mode device with a -2.5-volt threshold. A depletion-mode transistor has a significant drain-source current with zero gate-source voltage. The use of the depletion-mode active pull-up eliminates the need for a second supply voltage. N-channel depletion-mode transistors are easier to make than the corresponding p-channel units because the positive ions normally present in the oxide layer attract electrons and repel holes.

Figure 7. Basic NMOS inverter.
With NMOS circuits, positive logic is used, with logic levels of 0.5 V and 5 V. Basic NAND and NOR gates, shown in Figure 8, are easily constructed from the simple inverter circuit. These circuits can typically switch with a 50-ns gate delay. The power dissipation is about 0.2 mW when the gate is on and virtually zero when the gate is off. This interesting property of no current flow when the device is off, together with the ability of the MOS transistor to store charge on its gate, makes several types of logic circuitry possible that cannot be easily done with bipolar technologies. The NMOS transistor can be used as a switch (pass transistor or transmission gate) wherein the signal is only transmitted through the transistor if there is a signal on the gate. This series circuit element can be used effectively in holding a signal until the system clock activates the gate, allowing the signal to propagate through the pass transistor in synchronism with the rest of the logic. Dynamic circuits use the gate of MOS transistors to store a charge which controls the transistor state. These types of circuits are called "dynamic" because of the need to refresh the charge if it is to remain on the gate for more than a few milliseconds. Since the transistor gate has some leakage current, the charge will disappear after a time interval. If the device is a computer memory element and the data is intended to remain on the gate for an extended period of time, provisions have to be made to refresh the data periodically. Additional MOS processes, such as HMOS and VMOS, have come into being as various manufacturers have refined the process to achieve better performance and higher densities.

Figure 8. Basic NMOS (a) NAND and (b) NOR gates.

CMOS. Complementary MOS, or CMOS, technology uses both PMOS and NMOS transistors.3 Typical logic gates are shown in Figure 9. An advantage of CMOS is that there is virtually no power dissipation when the gate is in either state because one or the other transistor is in a nonconducting state. Therefore, this technology is used when low power dissipation is required. The disadvantage of CMOS is that the transistors require isolation wells and channel stops to keep the transistors electrically separate, a requirement that makes CMOS circuits more wasteful of surface area than the equivalent NMOS logic.

Figure 9. Basic CMOS (a) NOR and (b) NAND gates.

Technological comparisons. The speed/power product is a method of comparing MOS and bipolar technologies. One way to reduce the logic gate propagation delay is to increase current levels, since high currents can sweep out stored charges faster. Unfortunately, this increases power consumption. This characteristic makes the power/delay product (dissipation x delay) a meaningful parameter. It has units of power x time, or energy, usually measured in picojoules. A low power/delay product means the gate can switch at a high speed with low power consumption.
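The arithmetic behind this figure of merit is simple enough to check directly. The short sketch below multiplies the nominal per-gate delay and power figures quoted earlier in this section; these are the article's rough representative values, not data-sheet specifications.

```python
# Rough power/delay (speed-power) products from the per-gate figures quoted
# in this article; the numbers are representative, not characterized values.
gates = {
    "standard TTL":    {"delay_ns": 12.0, "power_mw": 12.0},
    "low-power TTL":   {"delay_ns": 33.0, "power_mw": 1.0},
    "Schottky TTL":    {"delay_ns": 3.0,  "power_mw": 20.0},
    "ECL":             {"delay_ns": 1.0,  "power_mw": 50.0},
    "NMOS (gate on)":  {"delay_ns": 50.0, "power_mw": 0.2},
}

for name, g in gates.items():
    # 1 ns x 1 mW = 1 picojoule, so the product in these units is already in pJ
    pj = g["delay_ns"] * g["power_mw"]
    print(f"{name:14s}  {g['delay_ns']:5.1f} ns x {g['power_mw']:5.1f} mW = {pj:6.1f} pJ")
```

A nanosecond times a milliwatt is exactly one picojoule, which is why the product is conventionally quoted in picojoules.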
Figure 10 provides a summary of the performance of today's NMOS, I²L, and ECL designs and their projected performance when the geometries are 0.5 µm.9 The speed of MOS and bipolar gates is limited by the transit time of the carriers crossing the active region of the transistor (base or gate), the gate configuration, and the parasitic and interconnect capacitances. An interesting property of I²L gates is that the power/delay product is constant over a wide range of injector currents. Practical injection currents have a range of several orders of magnitude. By simply adjusting the external resistor or the supply voltage, one can control the operating speed at the expense of power consumption.

Figure 10. Power/delay product of several IC technologies.

Today's NMOS technology, sometimes called HMOS, can yield devices with 2-µm gate lengths, giving a gate delay near 1 ns and a power/delay product of 1 pJ. This is quite an improvement over the 10-ns delay of the older 5-µm geometries. As the technologies improve, the gate delays should be reduced to 0.1 ns. At that time, the differences among technologies will not be as clear-cut as they are today. Raw gate speeds portray only part of the performance picture, since circuit topology also has to be examined.

As transistors are scaled down in size, their performance improves; but because the interconnection wiring is similarly scaled, its delay does not change. For example, assuming resistivity does not change and all conductor dimensions are proportionally scaled, the current density increases as circuit size decreases, but the RC delay of a proportionally scaled run of wire does not change. Therefore it takes the same time for a signal to traverse between circuits. Furthermore, the increasing current density can lead to a problem with metal migration. If future technologies are to achieve improved performance, the interconnection wiring will have to be constructed of lower resistance alloys or thicker conductors. Future designs will become even more constrained by wire sizes. In some instances, reducing the transistor size will make no difference in circuit density because wiring density will be the utilization constraint.
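The claim that scaled-down wiring gains nothing in delay follows from how resistance and capacitance change when every conductor dimension shrinks by the same factor. A minimal sketch, using unit resistivity and permittivity purely for illustration:

```python
# Sketch: scale every conductor dimension (length L, width W, thickness T,
# dielectric spacing D) by the same factor k < 1 and compare R, C, and RC.
# Resistivity and permittivity are held fixed, as the text assumes.
def wire_rc(L, W, T, D, rho=1.0, eps=1.0):
    R = rho * L / (W * T)      # resistance of the run
    C = eps * W * L / D        # capacitance to the layer below
    return R, C, R * C

R1, C1, RC1 = wire_rc(L=1.0, W=1.0, T=1.0, D=1.0)
k = 0.5                        # shrink all dimensions by 2x
R2, C2, RC2 = wire_rc(L=k, W=k, T=k, D=k)

print(f"R grows by   {R2 / R1:.1f}x")   # 2.0x: cross section shrinks faster than length
print(f"C shrinks by {C1 / C2:.1f}x")   # 2.0x
print(f"RC ratio:    {RC2 / RC1:.1f}")  # 1.0: the delay of the scaled run is unchanged
```

The resistance increase and the capacitance decrease cancel, which is why the article points to lower-resistance alloys or thicker conductors as the only way to make interconnect keep pace.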
LSI/VLSI building blocks

Several different categories of LSI/VLSI building blocks are available to the system designer. Although contemporary VLSI units are large microprocessors and memories, this situation will change as the ability to make VLSI chips becomes commonplace. The building blocks can be grouped into the following four sets:

* Standard building blocks: microprocessors, memories (RAMs, ROMs, and field-programmable ROMs), and highly functional peripheral support chips,
* Dedicated or special-purpose circuits,
* Semi-custom or programmable building blocks: field-programmable and mask-programmable logic, and
* Custom building blocks: silicon foundry and design automation/computer-aided design for LSI/VLSI.

Presently available are highly functional circuits such as counters, decoders, and multiplexers that can be connected into a variety of special-purpose circuits. Since these circuits usually contain fewer devices than are contained in microprocessors and memories, many different chips are needed to build a typical system. The increasing demand for different products has generated an economic environment that encourages the creation of programmable LSI and VLSI parts. A standard unit can satisfy a great variety of needs because its function can be changed through programming.

The microprocessor and its support chips are presently the most important forms of LSI/VLSI building blocks. Considerations working against the use of microprocessors include the high cost of software for program development and their reduced speed in comparison to custom random logic. The terminology used to describe microcomputer systems is illustrated in Figure 11. This block diagram shows a microcomputer consisting of a microprocessor, memory, and interface devices communicating through a bus. The microprocessor performs the processing function, which includes fetching, executing, and moving instructions and data. The ROM contains the program or instructions for the microprocessor, and the data or other changeable information is stored in the RAM. The interface with the world is through the I/O block. All of these units exchange information with each other via the data bus. These transfers are coordinated by the microprocessor through the control bus. The connections from the chips to the bus are often driven by tristate circuits, which can effectively disconnect a chip from the bus by going into a high-impedance state when the chip is not communicating with the microprocessor. This prevents overloading the microprocessor's drive circuits. When the microprocessor wants to communicate with a chip, it sends out an address which is decoded into a chip select or enable signal to the desired chip, effecting a communication path.

Figure 11. Block diagram of a microcomputer.
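The address-decoding step just described amounts to mapping each bus address to at most one chip-select line. The sketch below is a hypothetical memory map for a small microcomputer; the address ranges and chip names are invented for illustration and do not correspond to any particular product.

```python
# Hypothetical memory map for a small 8-bit microcomputer; the ranges are
# invented for illustration, not taken from any real part.
CHIP_SELECTS = [
    ("ROM",  0x0000, 0x1FFF),   # program store
    ("RAM",  0x2000, 0x23FF),   # read-write memory
    ("PIA",  0x8000, 0x8003),   # parallel interface registers
    ("UART", 0x8010, 0x8011),   # serial interface registers
]

def decode(address):
    """Return the name of the chip whose select line goes active for this
    address, or None if the address is unmapped.  Only the selected chip
    drives the shared data bus; every other chip leaves its tristate
    drivers in the high-impedance state."""
    for name, low, high in CHIP_SELECTS:
        if low <= address <= high:
            return name
    return None

assert decode(0x0042) == "ROM"
assert decode(0x8001) == "PIA"
assert decode(0x7000) is None
```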
Microprocessors. Because of the versatility and low cost of microprocessors, the growth in the number and types of these devices in the past few years has been spectacular. The major types of microprocessors are four-bit and smaller devices, eight-bit microprocessors, eight-bit microcomputers, 16-bit microprocessors, and bit-slice machines. The first microprocessors were four-bit machines. As circuit density increased, it became possible to build microprocessors with larger word sizes and with on-chip memory and I/O interfacing circuits. Although most microprocessors produced today are based on NMOS technology, bipolar technology is used in some cases, notably in bit-slice units.

Four-bit devices. Four-bit microprocessors continue to be widely used. Their functionality has been expanded by including memory and interface circuits on the processor chip. Application areas include those of high volume and low cost, such as appliance controllers and toys. A typical chip might include 160 four-bit words of RAM data storage and up to 2048 instruction words in ROM. Once the ROM programming masks are paid for, the cost of a typical device is in the range of $1 or $2. Principal implementation technologies include PMOS, NMOS, and CMOS. Examples of these machines include the Texas Instruments TMS 1000 series, the Rockwell PPS/4, and the National Semiconductor NMOS COPS series.

Eight-bit microprocessors. Due to the widespread applications of eight-bit microprocessors, they have become the workhorses of the industry. They are ideal for solving problems that combine reasonably low cost with rather sophisticated computing requirements. Control and moderate-speed computation are typical applications. Personal computer systems use the eight-bit microprocessor as a basic building element. Typical eight-bit microprocessors include the Intel 8080, the newer Intel 8085, the Zilog Z80, the Motorola 6800, the MOS Technology 6502, and the RCA 1802. With the exception of the RCA unit, all are constructed with NMOS technology (the RCA 1802 uses CMOS for low power). As NMOS technology improves in logic gate switching speeds, newer models will operate at higher instruction execution rates. Microprocessor and other equipment vendors offer extensive hardware and software development support equipment. The designer must give strong consideration to the availability of such support. Although hardware is inexpensive, software development costs can be very high. The programmability of microprocessors makes it possible for vendors to produce a common (therefore high-volume) part at low cost. The specialization then falls on the user/designer, and his development costs can increase significantly if proven methods of software development are not used.

Eight-bit microcomputers. Compact and low-cost microprocessors make it possible to use computer technology instead of hard-wired random logic circuits. Construction of a fully operable microcomputer used to require several chips, including the microprocessor, clock, memory, and I/O interface chips. Processing improvements yielding large circuits with extremely small feature sizes have made it possible to combine the microprocessor and other support circuits on a single chip. This single-chip microcomputer can be purchased in various versions which can include analog-to-digital or digital-to-analog converters, timers, or other special-purpose I/O circuits. The program memory in such units is usually a mask-programmable (at the factory) ROM, but each manufacturer offers a model that has some method for changing the program in the field, an especially important feature during software development. The field-programmable models either have an electronically programmable ROM rather than the mask-programmable ROM or have no memory at all, providing external connections so a programmable memory can be conveniently interfaced. These devices can solve many problems requiring very low cost and complex digital logic. The program memory can hold 1K or 2K bytes of program, with the interface circuitry contained on the same chip. The cost per unit in high volume is usually under $10. As photolithography techniques are further improved, this type of device will be expanded to include more functions.

Sixteen-bit microprocessors. Because eight-bit microprocessors presently tend to dominate the microprocessor market and have provided the primary impetus for developing LSI and VLSI processing technology, one tends to forget that 16-bit microprocessors have been around for quite some time. Most of the 16-bit microprocessors offered only marginal performance improvement over eight-bit units, so they were used primarily in a few applications where the longer word length provided special advantages. Continued improvements in circuit speed and density have resulted in a new category of 16-bit processors, including the Intel 8086, the Zilog Z8000, and the Motorola 68000 using NMOS technology, and the Fairchild 9445 using bipolar technology. Processors in this category offer up to ten times the performance of their eight-bit counterparts. The word length of a microprocessor does not limit operation to only this fixed length. For example, the 68000 and the Z8000 can conveniently operate on individual bits, BCD nibbles (half-bytes), bytes, 16-bit words, or 32-bit double words. Many other hardware features are offered by various 16-bit microprocessors, including hardware multiply and divide, the ability to address 16M bytes of memory, and pause/run control inputs for convenient use in distributed multiprocessor networks.
Additional features such as traps, multiple stack pointers, and memory management units make it convenient to compile and execute programs written in higher level languages.

Bit-slice machines. In this type of microprocessor, several identical chips are connected as an array to form a complete processor. The basic bit-slice building block is a chip that has a slice (usually four to eight bits) of the arithmetic and logic unit, or ALU, and several general-purpose registers. These chips can then be hooked up in parallel to build a computer of any desired word length. Based on Schottky TTL or ECL technology, they are used for fabricating high-speed, high-performance processors. The Advanced Micro Devices 2900 family of chips consists of two register/ALU chips, three microprogram sequencer chips, and other miscellaneous support chips. Flexible and high-performance microcoded processors are built from these chips and a few ROMs. At least eight other IC manufacturers produce the 2900 series of parts. Since the control of the processor can be designed with an arbitrarily wide microcode word, many operations can be done during a given clock cycle, with very high performance potential. More elementary than the more common MOS microprocessor units, bit-slice chips are specially tailored for applications such as high-performance mainframe computers, high-speed communications controllers for computer I/O or network interfaces, and signal processors where high speed and throughput are primary considerations.
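How identical slices combine into a wider processor is easiest to see in the ALU carry chain. The sketch below cascades four 4-bit adder "slices" into a 16-bit adder; a real 2900-series slice also provides logic functions, shifting, and a register file, so this is only the skeleton of the idea.

```python
# Conceptual sketch of cascading identical ALU slices via the carry chain.
def alu_slice_add(a4, b4, carry_in):
    """Add two 4-bit operands plus carry-in; return (4-bit sum, carry-out)."""
    total = (a4 & 0xF) + (b4 & 0xF) + carry_in
    return total & 0xF, total >> 4

def add16(a, b):
    """Build a 16-bit adder from four 4-bit slices, least significant first."""
    result, carry = 0, 0
    for i in range(4):
        nibble, carry = alu_slice_add((a >> 4 * i) & 0xF, (b >> 4 * i) & 0xF, carry)
        result |= nibble << 4 * i
    return result & 0xFFFF, carry

assert add16(0x1234, 0x0FFF) == (0x2233, 0)
assert add16(0xFFFF, 0x0001) == (0x0000, 1)   # carry out of the top slice
```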
Memories. Since memories have a regular and repeated circuit structure, they are ideal vehicles for developing new LSI/VLSI processes. The large number of repeated cells reduces the circuit complexity to an intellectually manageable level, allowing circuit design to be done without a sophisticated design automation system. The insatiable demand for system designs with more and more memory provides impetus for developing IC memory chips with increased density and performance. As semiconductor manufacturers push to higher speed, higher density, and lower power, they use the demand for larger and cheaper memories to establish new process parameters and design rules, which can then be applied to building less regular structures such as microprocessors. Areas of improvement in memory technology include increasing the bit density, decreasing the access time, improving yields, reducing the power supply complexity, and achieving pin compatibility among RAMs, ROMs, and PROMs so they can be interchanged at will. Memories can be grouped into the two major categories of read-write memories and read-only memories. Read-write memories, traditionally called random access memories, or RAMs, are used as temporary data storage or "scratch pads." Read-only memories, or ROMs, are used to store programs or permanent data.

RAMs. Read-write semiconductor memories are available as either static or dynamic units. The static RAM memory cell is basically a transistor flip-flop. It contains active elements that, once loaded, will remain in a stable state until the unit is either rewritten or deenergized. The dynamic memory cell uses the gate of a single MOS transistor to store a charge whose presence or absence represents the logic state. Since the charged state, like any capacitor, will slowly discharge due to inevitable current leakage, the charge has to be restored or refreshed every few milliseconds. The word "dynamic" is used in describing this memory cell because it signifies this need for periodic refreshing. Static memories are very easy to use. One needs only to connect them to the address, data, and control buses and apply power. Dynamic memories are somewhat more difficult to use because they require additional circuitry for data refresh. However, dynamic memories have very high bit density since the dynamic cell requires only one transistor. The largest dynamic RAMs available today contain 64K bits per chip, while the largest static units contain 16K bits. Newer designs include refresh circuits on the memory chip. Some vendors have introduced special chips that include refresh and address multiplexing, thus enhancing the ease of use of conventional dynamic RAMs. The market division between dynamic and static memory units has now reached the point where static memories are cornering the high-speed market while dynamic memories control the low-cost market. For memory systems over 16K bytes, dynamic RAMs have a much lower cost due to the larger number of bits per chip and the lower circuit overhead for refresh.
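The refresh obligation that distinguishes the dynamic cell can be pictured as a stored charge with an expiration time. The toy model below assumes a 2-ms retention window (the text says only "a few milliseconds") and shows why a read is trustworthy only if some controller has rewritten the charge recently.

```python
RETENTION_MS = 2.0   # assumed retention time; the article says "a few milliseconds"

class DynamicCell:
    """Toy model of a one-transistor dynamic storage cell: the stored bit is a
    charge that leaks away, so it is only valid if it was written or refreshed
    within the retention window."""
    def __init__(self):
        self.bit, self.last_refresh_ms = 0, 0.0

    def write(self, bit, now_ms):
        self.bit, self.last_refresh_ms = bit, now_ms

    def refresh(self, now_ms):
        self.last_refresh_ms = now_ms        # read and rewrite the charge

    def read(self, now_ms):
        if now_ms - self.last_refresh_ms > RETENTION_MS:
            raise RuntimeError("charge leaked away -- data lost")
        return self.bit

cell = DynamicCell()
cell.write(1, now_ms=0.0)
assert cell.read(now_ms=1.0) == 1            # still within the retention window
cell.refresh(now_ms=1.5)                     # the refresh circuitry sweeps the row
assert cell.read(now_ms=3.0) == 1            # fine: only 1.5 ms since the refresh
# Without that refresh, a read at 3.0 ms would have raised: the charge had leaked.
```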
ROMs and PROMs. Read-only memories, unlike RAMs, are nonvolatile; that is, they retain data when power is removed. The term "ROM" usually refers to a memory whose contents are established during manufacture. ROMs that can be programmed by the user are called programmable ROMs, or PROMs. Circumstances and the application determine the kind of ROM selected. A mask-programmable ROM is the best choice for high-volume, fixed-purpose systems. The single-unit cost can be very low, less than $5, but the charge for making the mask is significant, varying from $1000 for a small 1K ROM to $50,000 for a 64K-bit ROM. If a large quantity is to be manufactured, the mask charge per device is insignificant. The reasons for using a PROM are the economics and time savings incurred by being able to quickly program it without delay for mask development and manufacturing cycle turnaround. The cheapest PROM is the fuse-programmable ROM, in which each bit is shunted by a "fuse" to ground potential. Programming is accomplished by passing a large current through each selected bit location, thereby blowing the fuse to create a logical value of one. Such units can be programmed only once. If a mistake is made or if a change is desired, the part has to be discarded and a new one programmed.

To circumvent this problem, the erasable PROM, or EPROM, was developed. This unit is programmed by applying a high voltage (20-60 V) to each bit to establish a charge on the gate of a transistor, somewhat like the one used to store information in a dynamic memory cell. The EPROM, however, can retain its charge pattern for years. There are two types of EPROMs. One is packaged with a quartz window through which ultraviolet light can be shined to erase the memory. The other type requires a voltage pulse to erase the bits. Both techniques activate carriers in insulating layers to drain the charge from the gate. PROMs are most advantageous in situations requiring software or firmware changes, such as during program development or in products individually characterized for customers. Products are often introduced using PROMs and then switched over to mask-programmable ROMs as sales volume increases. The cost savings can be as high as 10 to 1 for the memory units, but engineering changes are very expensive if new mask-programmable units are needed.

Peripheral support chips. A very important consideration during microcomputer design is the interface to the outside world. IC manufacturers have designed a number of LSI chips to make this interfacing easier. Examples are digital parallel or serial input chips, analog-to-digital converters, and digital-to-analog converters. Selection of the interface chip is influenced by the ease of connecting it to the microprocessor bus. Certain questions should be asked, such as

* Does the chip have registers to hold the data?
* Are the I/O pins tristateable?
* Will it generate an interrupt when the function is complete?

Many manufacturers offer a variety of A/D and D/A converter units. Some are very high-speed, with conversion times on the order of 30 ns. Others are slower but lower in cost. There are also many types of special-purpose digital I/O chips available. Universal asynchronous receivers/transmitters, or UARTs, and universal synchronous/asynchronous receivers/transmitters, or USARTs, are used for serial communications. Peripheral interface adapters, or PIAs, and versatile interface adapters, or VIAs, are used for communications in which parallel bytes are to be transferred. If an interface to a data communications network is desired, there are chips that provide a selection of message protocols and include automatic baud-rate generation.

Math chips are also very useful. Many kinds of multipliers, dividers, and floating-point math chips are available to turn a microprocessor into a "number cruncher." Applications such as signal processing, which requires much calculation, make good use of such units. The microprocessor transfers numbers to the math processor and waits for the results to return. The math processor can perform the calculations several orders of magnitude faster than a software program on the parent microprocessor. High-speed integer multiply chips come in several forms, ranging from 8 x 8 to 16 x 16 bit multipliers. These units offer a number of options such as two's complement (signed numbers) and unsigned multiply. For example, TRW and AMD offer 8 x 8 TTL multipliers that provide a 45-ns nominal multiply time.

Floating-point arithmetic chips are now available10,11 that convert a microprocessor into a respectable computational system. These units can enhance the handling of arithmetic without slowing down the microprocessor's nonnumeric operations. Offered by AMD, Intel, and other manufacturers, these dedicated chips include mechanisms for rather sophisticated communications with the microprocessor. They can execute a 32-bit multiply in less than 10 ms. When a floating-point operation is required, the microprocessor signals the arithmetic unit, transmits the operands, then returns to its other duties. When the math chip has completed its calculations, it sends a signal to interrupt the microprocessor and then communicates the result over the bidirectional data bus.

Some chips are designed to control computer peripherals. Such units include floppy disk controllers, cassette controllers, CRT controllers, sound generators, and speech synthesizers. These units are often designed to interface with a particular family of microprocessors.

Special-purpose circuits. Although many jobs can be handled by microprocessors and their support chips, some problems can be best solved by using an IC specifically designed for that application. One such area is telecommunications.12 Digital circuits are now beginning to invade this area, traditionally dominated by analog and electromechanical technologies, because digital signal processing offers higher performance at a lower cost with reduced maintenance. The development of the codec (coder-decoder) was instrumental in introducing LSI into telecommunications. At the transmission end of a channel, the codec encodes an analog voice signal into a pulse code modulation (or PCM) digital signal and then, at the receiving end, decodes the digital stream into an analog voice signal. When the signal has been transformed into a digital stream, a digital switch can time-multiplex many voice channels over a single wire, improving network resource utilization. The codec chip has an A/D converter that samples the voice signal at 8 kHz. The USA multiplexing system, called the T1 PCM carrier, transmits 24 channels which are sampled simultaneously and assembled into 24 eight-bit words transmitted at 1.544M bps. The CCITT-recommended system for Europe is a 32-channel system with 30 channels for voice and two for supervision and alarms.

Other areas experiencing an influx of special-purpose circuits are TV and radio tuning circuits. Digitally controlled, phase-locked loop circuits are also being designed and introduced into consumer products. Such circuits offer programmable tuning with drift-free operation.13
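The T1 figures quoted above are easy to verify. The short sketch below reproduces the 1.544M-bps line rate; the single framing bit added to each 24-channel frame is standard T1 practice, although the article does not mention it explicitly.

```python
SAMPLE_RATE_HZ = 8000          # each voice channel is sampled at 8 kHz
BITS_PER_SAMPLE = 8
CHANNELS = 24                  # T1 PCM carrier
FRAMING_BITS = 1               # one framing bit per frame (standard T1 practice,
                               # not spelled out in the article)

bits_per_frame = CHANNELS * BITS_PER_SAMPLE + FRAMING_BITS   # 193 bits
line_rate = bits_per_frame * SAMPLE_RATE_HZ                  # frames arrive at 8 kHz

print(line_rate)               # 1544000 -> the 1.544M bps quoted in the text
```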
Semi-custom building blocks. Another form of programmable circuit for hard-wired random logic applications is either mask-programmed at the factory or field-programmed by burning fused links in a manner similar to that of programming PROMs. Indeed, PROMs can be used to provide programmed logic. Logic inputs can be applied to the address lines, and the data stored in the PROM will be the logic outputs. Such circuits can offer large savings in both design time and board space because they contain several hundred or even several thousand gates. A system requiring many SSI and MSI chips can often be replaced by one semi-custom chip.

Field-programmable logic. Two principal types of field-programmable logic chips are programmable array logic, or PAL, and the programmable logic array, or PLA. A general layout of such logic arrays is shown in Figure 12. Product terms are formed in the AND plane and passed to the OR plane, where the generated function is output from the chip. The difference between the PLA and PAL is that the AND and the OR planes can both be programmed in the PLA, whereas in the PAL the OR plane is fixed and only the AND plane can be programmed. Field-programmable planes are constructed as a matrix of orthogonal crossing wires connected together at the intersections by fusible links. The method of programming is to burn out the undesired connections by pulsing a large current through the specific interconnection. Since it may be necessary to provide multiple pulses to reliably "burn out" the fuse, programming should be verified before the chip is removed from the programming unit. Such devices are available from Monolithic Memories, Texas Instruments, and Signetics. The most popular circuit technology is Schottky TTL.

Figure 12. Block diagram of programmable logic.
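The AND-plane/OR-plane arrangement of Figure 12 is, in software terms, a two-level sum-of-products evaluator. The sketch below models a tiny programmed array; the particular product terms and outputs are invented for illustration. In a PLA both tables would be field-programmable, while in a PAL the OR table would be fixed at manufacture.

```python
# Toy two-level (AND/OR) array in the spirit of Figure 12.  Each product term
# lists the input values it requires; each output ORs a subset of the terms.
AND_PLANE = [                       # product terms over inputs a, b, c
    {"a": 1, "b": 1},               # term 0:  a AND b
    {"b": 0, "c": 1},               # term 1:  (NOT b) AND c
    {"a": 0},                       # term 2:  NOT a
]
OR_PLANE = {                        # which product terms each output ORs together
    "f": [0, 1],                    # f = a*b + b'*c
    "g": [1, 2],                    # g = b'*c + a'
}

def evaluate(inputs):
    terms = [all(inputs[name] == want for name, want in term.items())
             for term in AND_PLANE]
    return {out: any(terms[i] for i in fed) for out, fed in OR_PLANE.items()}

assert evaluate({"a": 1, "b": 1, "c": 0}) == {"f": True, "g": False}
assert evaluate({"a": 0, "b": 0, "c": 1}) == {"f": True, "g": True}
```

A PROM used as programmed logic is the limiting case of the same idea: the address lines are the inputs, and every possible product term is present, with the stored word supplying the OR plane.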
The general form of a logic circuit will have a large impact on its adaptability to custom VLSI design. For random logic, a "regular" circuit structure like the PLA or PAL significantly reduces the chip complexity. PLA circuit configurations are easily understood, and it is not too difficult to devise automatic means for producing masks. The disadvantages of using this circuit form are reduced speed and a larger chip area required by the matrix structure. These limitations are not serious in some instances. The quick design turnaround has a definite advantage for low-volume applications. In these cases, timely market introduction has more impact than striving for the lowest cost, highest performance chips. Obviously, the need for computer-aided design tools is not diminished by VLSI.14,15

Mask-programmable circuits. Digital circuits in this category include master-slice and gate arrays, which are arrays of basic logic gates that the user can interconnect as desired by specifying the interconnect mask for the chip. A typical chip in this category is manufactured with 400-2000 gates on the chip, excluding the last level of interconnect. The design task is to specify the interconnection network and make masks for the metallization steps. Even though two levels of metallization are provided, allowing quite a bit of freedom and flexibility, generally only 60-70 percent of the available gates can be utilized for implementation of the specific logic function. This type of circuit design must be supported by a design automation system so that the most effective interconnection scheme can be determined in a reasonable time. The primary function of the design automation system is to help route the interconnecting wires. Another useful function is to assist in partitioning the logic into "islands" of few interconnections, contributing to maximization of the total utilization of the master-slice chips. Such a design automation system makes it possible to lay out the final metallization masks for use on wafers previously stockpiled with the gates already on them. The result is a fast turnaround time from logic circuit design to implementation on the IC chip.

Master-slice chips are available in many technologies, including CMOS, PMOS, NMOS, TTL, I²L, and ECL. Bipolar circuits are used when high speed is required. Microcircuits Technology, Interdesign, International Microcircuits, Dionics, Motorola, and EXAR offer a variety of master-slice circuits in digital and analog forms. The analog master-slice chip is typically an interconnection of components such as transistors, resistors, diodes, and capacitors. Since only certain standard components are available on the master-slice chip, restricted to those deemed useful to a large cross section of users, optimum analog circuits often cannot be obtained through this approach. Nevertheless, fast turnaround and compactness may still make this an attractive alternative to discrete analog circuits.

Recently, IBM16 developed for internal use a method for automatically placing MOS logic gate circuits in a fixed chip layout plan by using an automated placement system. Starting with the logic circuit diagram, the designer can construct a complete circuit by selecting a series of fixed gate configurations. The gates are placed into the fixed chip plan, using heuristic optimization to develop a workable circuit. The output of this system is a complete set of masks.
This system gives the designer much flexibility, resulting in increased gate utilization and better performance.

Custom building blocks. Custom building blocks are available to those who cannot use the standard approaches. Special custom circuits characteristically have a long and expensive design cycle. Product demands normally have to be very special for a commitment to be made to a custom design because lower risk methods of standard building blocks are available. The increased risk of custom building blocks, however, can be reduced by using a design automation or computer-aided design system. Most custom design is done by captive suppliers, but some custom fabrication houses offer design services to their customers. Design risks can be reduced and custom ICs can be more readily available in the future as computer-based design systems improve to the degree that most of the intricacies of IC layout are handled by the computer, provided that fabrication technology is offered on the open market. This development is currently exemplified by the silicon foundry.

Silicon foundry. "Silicon foundry" is a phrase coined to describe a method, originated at the California Institute of Technology,17 by which several universities and research institutions are obtaining custom LSI chips. Over 200 chips have been fabricated using this technique. The approach is based on the premise that as IC technology matures, more people will acquire the skill of circuit design, creating a demand for fabrication. This premise leads to the question, "If anyone could design an integrated circuit, where would he get it fabricated?" One answer is a silicon foundry.

A silicon foundry accepts only the responsibility for producing the parts. The designer is responsible for designing, laying out, and testing the chips. To successfully implement such a foundry, the designer must know the design rules for the process and have the ability to transmit the detailed information of his design to the fabrication facility or foundry. The masks and parts would be manufactured at the foundry.18 The four basic ingredients for successfully obtaining parts from a manufacturer are

* Use of a standard process,
* Use of a set of design rules that are as process-independent as possible,
* Use of a standard format to transmit pattern data, and
* Use of standard test circuits interspersed among the chips on the wafer.

The commercial success of such a venture requires that the fabrication facility be able to handle small batches of wafers cost-effectively. Furthermore, the designer must have good design aids. The economic considerations are quite different from those of the high-volume and general-purpose chips supporting today's IC fabrication lines. The "black art" of making circuits work has to be virtually eliminated by using design rules that are somewhat forgiving. The commitment of a large circuit to silicon requires some method to ensure that the circuit will work within one or two fabrication iterations. The required tools for the design methodology must include circuit simulation programs, layout aids, and methods of checking for circuit correctness and design rule violations during layout.

The success of Mead's work indicates that it is possible to train people to design LSI circuits that fit their special needs. Spreading the design capability to the people who understand the problems allows them to achieve high throughput and speed by using a circuit dedicated to the algorithms.19
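Two of the ingredients listed above, process-independent design rules and a standard format for transmitting pattern data, can be illustrated with a deliberately simplified sketch: geometry is expressed in multiples of a single scalable length unit and then written out as plain-text rectangle records. Both the 2-micron unit and the record format below are invented for illustration; a real foundry would publish its own design rules and interchange format.

```python
# Layout expressed in multiples of one scalable length unit, emitted as a
# simple, self-describing text record.  The format is hypothetical.
UNIT_UM = 2.0            # assumed value of the basic length unit, in microns

def box(layer, x, y, w, h):
    """One rectangle, with coordinates given in length units, not microns."""
    scale = UNIT_UM
    return f"BOX {layer} {x*scale:.1f} {y*scale:.1f} {w*scale:.1f} {h*scale:.1f}"

# A fictitious contact: a 2x2-unit cut centered in a 4x4-unit metal pad.
pattern = [
    box("METAL",   0, 0, 4, 4),
    box("CONTACT", 1, 1, 2, 2),
]
print("\n".join(pattern))
# Re-targeting the design to a smaller process is then a matter of changing
# UNIT_UM and re-emitting the records, not of re-entering the geometry.
```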
Design automation and computer-aided design for LSI/VLSI. The VLSI circuit DA/CAD system must be able to aid the design process from various points of view. The total development scenario consists of many activities, including process, circuit, system, and test development. The complete DA/CAD system should be able to support all of these activities.

One computer-based tool needed for process development is a simulation of the solid-state electrical properties produced by various processing steps, including ion implant and epitaxial growth. This analysis helps to speed process development by predicting the steps required to produce desired transistor and other device characteristics. Such a simulation package also helps establish design rules, producing the device characteristics used in circuit simulations.

A number of tools can be used for circuit development. Small circuits such as memory cells and sense amplifiers may be simulated with elaborate electrical models which include second-order effects such as parasitic capacitance and other leakage mechanisms. Using such simulation, the circuit configuration for each cell can be optimized. Larger circuits such as registers, arrays, or controllers often cannot be simulated with detailed circuit analysis programs because of computer size limitations. Logic simulation, in which gates are simplistically treated as switches having some delay time, is used to predict the behavior of larger logic circuits. Current research is leading to the invention and implementation of simulation programs that can combine logic-level and detailed circuit models. The designer will have the ability to analyze troublesome circuits in detail and let other circuits be described by the simplest model that adequately describes their performance.
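The gate-as-switch style of logic simulation described above can be captured in a few lines: each gate is a boolean function with a fixed delay, and an event queue propagates signal changes. The circuit, delays, and stimulus below are illustrative only.

```python
import heapq

# Minimal event-driven logic simulation: gates are boolean functions with a
# fixed delay; an event queue propagates value changes.  Illustrative only.
GATES = {                                   # output net: (function, input nets, delay ns)
    "n1": (lambda a, b: not (a and b), ("x", "y"), 3),   # 2-input NAND
    "z":  (lambda a: not a,            ("n1",),    3),   # inverter
}

def simulate(stimulus):
    nets = {"x": False, "y": False, "n1": True, "z": False}   # consistent start state
    queue = list(stimulus)                  # events: (time, net, new value)
    heapq.heapify(queue)
    while queue:
        t, net, val = heapq.heappop(queue)
        if nets[net] == val:
            continue                        # no change, nothing to propagate
        nets[net] = val
        for out, (fn, ins, delay) in GATES.items():
            if net in ins:                  # re-evaluate every gate driven by this net
                heapq.heappush(queue, (t + delay, out, fn(*(nets[i] for i in ins))))
    return nets

final = simulate([(0, "x", True), (5, "y", True)])
assert final["n1"] is False and final["z"] is True
```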
Behavioral descriptions, performance simulations of the system-level function, are also needed. At this level of abstraction, a variety of register transfer and hardware description languages have been developed. The system function is modeled in terms of the movement of data words in relation to clock timing. The overall architecture and instruction sets for the system can be described and their performance simulated. Some effort has been made in trying to implement hardware automatically from a system-level description, but these efforts have not been totally acceptable since they have usually contained 50 to 150 percent more hardware than the manual implementation.

Once the circuits are designed, the layout can proceed. The topological structure of a small piece of circuitry is usually sketched and then transferred to a digital graphics system. This cell can be repeated, mirrored, rotated, and connected to other cells through interactive graphics. When full layout is completed, a large plot is made to check design errors. The computer representation of the layout is called the data base, and the computer makes the plot of the layout by interpreting the data base. Some software packages analyze the data base for electrical connectivity and design rule errors (less than minimum spacing between elements). The data produced by the graphics system can then be used by a pattern generation machine to produce the masks or reticles for circuit fabrication. The data-base size, proportional to the size of the chip, can be overwhelming. Proving that the layout is error-free is a monumental task even for MSI circuits. Checking by hand or by a computer program that looks at the full layout may not be effective for VLSI circuits containing random logic. The only way the VLSI problem can be handled is to partition the problem, probably through the use of hierarchical representations.

LSI/VLSI technology necessitates design for testability. For SSI/MSI circuits, the number of gates per chip is small, and all of the various possible states or conditions of the circuits can be probed and tested rather easily. However, VLSI circuits force a rethinking of this problem. It is now possible to build a circuit with so many internal circuits and limited external connections that it would take centuries to run test cases to see if all the circuits are working. Therefore, a new body of science for testing has to be created. Present techniques for dealing with the problem can inexactly be classified as either signature analysis techniques or other mechanisms using serial shift registers. Signature analysis is a technique that uses a series of test bit patterns applied at the input terminal of a digital IC, and the resulting output patterns are exclusive-ORed with the input. The result is then monitored to see if the pattern is different from that of a known good circuit. This technique works well on those circuits that can be fully exercised with a few hundred input conditions. The other approach to testing places a shift register strategically in the signal path of a digital circuit, allowing access to the internal states of the module. Vectors can then be shifted in to stimulate the circuit, or data resulting from test vectors placed on the usual input terminals can be captured and shifted out for analysis. Whatever the selected test method is, it must be part of the VLSI circuit from its inception.
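One common way to realize the signature idea is to fold the circuit's response stream, bit by bit, into a feedback shift register built from exclusive-OR taps, and then compare only the final register contents against the value recorded from a known-good part. The register width and tap positions in the sketch below are illustrative choices, not a standard; with long response streams there is a small chance that a faulty stream aliases to the good signature, and the register width controls that risk.

```python
# Sketch of a signature register: the response stream is compressed through
# an exclusive-OR feedback shift register; only the final contents are kept.
def signature(response_bits, taps=(0, 2, 3, 5), width=16):
    reg = 0
    for bit in response_bits:
        feedback = bit
        for t in taps:                       # XOR the tapped stages into the input
            feedback ^= (reg >> t) & 1
        reg = ((reg << 1) | feedback) & ((1 << width) - 1)
    return reg

good = signature([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1])   # known-good response
bad  = signature([1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1])   # one flipped response bit
assert good != bad       # the single-bit fault leaves a different signature
```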
Opportunities and limitations of integrated design

The opportunities of VLSI are virtually boundless. This article mentions only a few of the existing applications and circuits in computer technology, signal processing, and communications. In emphasizing the vast range of VLSI applications, one may note that even the previously analog-dominated field of telecommunications is being transformed into a digital field due to the improved speed and density of such circuits and the resulting lowering of costs. Other computation-intensive areas feeling this same impact are speech processing and computer graphics. An increasing rate of product offerings is occurring in these fields. Obviously, the need and use of VLSI circuits are only limited by our imaginations.

One may ask, "What could limit this growth? What are the problems?" First of all, the skill to manufacture ICs with 1-µm or smaller feature sizes must be commercially developed. Several laboratories have already demonstrated that working transistors can be built with such small geometries. High-precision techniques must be refined and made repeatable. It is now only a matter of time before the ability to reliably construct incredibly large circuits becomes commonplace.

The basic circuit topology problems of VLSI are related to the limits imposed by solid-state electrical properties and circuit interconnection wiring. As devices are made smaller, electric field strengths increase and dielectric thicknesses decrease, producing the phenomena of hot electrons and dielectric breakdown. Wiring miniaturization hinders progress by introducing the problems of electromigration and increasing intercircuit ohmic resistance. Increasing the number of circuits on a chip also increases the demand for a larger number of external connections to the chip. The packaging designer must face the twin challenges of more interconnections and increasing power dissipation per unit area.

The real limitation of VLSI lies in the ability to manipulate complexity. Circuits can be so large that the human mind cannot possibly remember the design parameters of the network on the chip. The proliferation of VLSI chips will be limited unless a method of reducing this complexity to a manageable level is found and implemented. The solution will be approached through computer-aided design, or CAD, and design automation, or DA. The human mind can handle only five to nine things simultaneously. If the problem can be partitioned so that the designer only has to handle a few items at a time and the machine is given the numerous repetitive tasks, great strides will be made in designer productivity.

The issue of the viability of VLSI is really a CAD/DA problem. Custom VLSI circuit technology cannot be useful if methods of dealing with the large number of components are not implemented. If systems are not developed to overcome the design complexity problem, VLSI can only be used for memories and certain fixed-architecture computers where the circuits are regular in topology. The design engineer cannot deal with the problems of design, fabrication, test, and debug without some design tools that enable him to build chips that are nearly correct the first time. Another issue arises because the combined knowledge of the system designer, the circuit designer, and the component designer is necessary to build VLSI. Either the use of large development teams or a method of splitting the task into intellectually manageable tasks is needed.

Problems in unusually large and complex systems were first encountered by software programmers. The hardware of computers gave them a device capable of performing so many different tasks that there was no end to the functions they could construct. The general-purpose nature of the machine and the ease of changing the program created the illusion that anything could be done in a short period of time. A system could be specified at a high level, describing a lot of functions that were individually easy to implement. This illusion led to gross underestimation of the complexity level for the system task. There are many horror stories of the consequences of such endeavors. The lack of discipline or restrictions on system function definition can lead to a problem of such magnitude that it is not even practical to attempt to solve it. The software community has developed the ideas of hierarchical design, levels of abstraction, structured programming, top-down design, and go-to-less programming to aid in the management of complexity. These same ideas are the basis of the future CAD/DA systems that will make the use of VLSI an everyday reality.
The example also illustrates two very important points. First of all, readily available design aids and the less expensive IC fabrication that results will allow a majority of designs to be done directly on silicon. Secondly, if only a few systems are to be made and a standard unit will do, the standard unit is the least expensive alternative for small-volume requirements because of the absence of design costs.

The only realistic way to reduce fabrication costs is to share part of the expense with other developers. There are two ways this can happen. One is to use mask-programmable chips (gate arrays). The other is to allow more than one design on a wafer mask.

Gate arrays have common underlayers, and only the metal masks are custom-designed. Sharing common underlayers across many designs makes most of the manufacturing process a high-volume throughput, thus reducing costs. The second method of cost reduction, sharing masks by putting more than one design on a wafer, reduces costs through cost-splitting. Each designer obtains parts at a fraction of the total cost; the fraction is proportional to the number of designs on the wafer. In multiproject chip fabrication cycles, there can be over 100 designs on two wafers.21
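A back-of-the-envelope sketch of that cost splitting follows; the dollar figures are hypothetical, since the article says only that each designer's share is proportional to the number of designs sharing the masks.

```python
# Hypothetical multiproject cost split: one-time tooling and fabrication costs
# are divided evenly across the designs sharing the wafer masks.

def per_design_share(mask_set_cost, wafer_run_cost, n_designs):
    """Split the shared one-time cost evenly across the participating designs."""
    return (mask_set_cost + wafer_run_cost) / n_designs

# With 100 designs sharing the run, each designer pays one percent of the total.
print(per_design_share(mask_set_cost=40_000, wafer_run_cost=10_000, n_designs=100))  # 500.0
```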
Example of the mask-programmable design approach. To illustrate how a design is accomplished, let us examine two design styles: gate-array design and full custom design.

Gate arrays are mask-programmable chips, commercially available from several vendors. Each vendor offers some type of aid to assist the designer. Gate-array technology offers several advantages over custom logic designs. First, the underlayers are processed well in advance of the actual requirement, which gives excellent reliability for these layers. Secondly, since only the interconnect must be designed, the design is conceptually simple and development time can be estimated relatively accurately. A third advantage is the certainty of cost. Gate arrays can be considered high-volume parts because the underlayers are standard. Only the last few steps are customized, which allows manufacturing costs to be calculated fairly accurately. These costs follow the normal decrease associated with the learning curve.

Figure 14 illustrates how a typical bipolar-technology-based gate-array chip is arranged geometrically. The cells containing a uniform pattern of components (resistors and transistors) are placed in a uniform array between the reference voltage circuits and the I/O buffers that are connected to the pads around the periphery. The components in each cell, shown in Figure 15, can be interconnected to form basic functions, as shown in Figure 16. Each component is constructed on wafers through the various ion implant, epitaxial growth, etching, chemical vapor deposition, and diffusion process steps. The top layers, consisting of metal interconnections, are the final processing steps and can be customized for each application.

Figure 14. Gate array chip floor plan.

Figure 15. Layout cell, one element of the cell array.

Figure 16. Example of the interconnection of layout cell components to create logic blocks.

Gate-array chip design is accomplished easily. Starting with a logic diagram of the function to be implemented, the designer partitions the diagram into islands containing input and output connections equal to or less than the number of pins available on the particular gate-array chip. These subsections should not contain more than 80 to 90 percent of the maximum number of gates available on the gate array. A logic simulation is also run to determine critical paths and to find potential race problems. If everything looks satisfactory, the layout of the interconnect can proceed.

The designer is provided with a library of function cells, which are basically the interconnect structures of the transistors and resistors that comprise logic gates or devices such as D flip-flops. Using the cells and the logic diagram defining the wire list for interconnection, the layout is completed. This layout is constrained by the channel spacing between cells and by barriers imposed by power and ground buses, as shown in Figure 17. All gate-array vendors offer design aids for doing this work. Some offer manual methods that allow the designer to sketch the interconnections on a large drawing of the basic chip underlayers, and others offer computer-aided methods. Using a computerized system, the designer inputs either the logic diagrams or a wire list to the computer system. The computer picks the appropriate cells (gates) from a library, then places them and routes the interconnections. After this operation, it reports the status of the design, including any "can't connects." If "can't connects" occur, the designer can either make the connection by hand or move the placement of some cells and let the computer try again.

Figure 17. Typical block-to-block interconnection of a gate-array chip.
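As a rough illustration of the feasibility checks just described, the following sketch tests a hypothetical partition against pin-count and gate-utilization limits. The block names, sizes, and the 85 percent limit are invented for the example; the article gives only the 80 to 90 percent guideline.

```python
# Illustrative feasibility check for a gate-array partition: each block must
# need no more I/O than the chip provides, and should use only 80 to 90 percent
# of the available gates so the router has room to work. Numbers are hypothetical.

def check_partition(blocks, pins_available, gates_available, utilization_limit=0.85):
    """blocks: list of (name, gates_used, io_signals). Returns a list of problems."""
    problems = []
    for name, gates, io in blocks:
        if io > pins_available:
            problems.append(f"{name}: needs {io} I/O signals, chip has {pins_available} pins")
        if gates > utilization_limit * gates_available:
            problems.append(f"{name}: uses {gates} gates, above "
                            f"{utilization_limit:.0%} of the {gates_available} available")
    return problems

issues = check_partition(
    blocks=[("alu", 520, 36), ("control", 610, 70)],
    pins_available=64,
    gates_available=700,
)
print("\n".join(issues) or "partition looks feasible; proceed to placement and routing")
```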
One of the more clever approaches to manual aids is based on the use of a large Mylar print of the underlayers, showing the resistor and transistor locations. The design kit contains sticky-back labels that carry the interconnect of logic-gate or function-cell circuits. Guided by his logic diagram, the designer puts down the predefined gates on the Mylar. Then he interconnects the gates, using techniques similar to the layout of a printed wiring board. The large Mylar master is subsequently used to make the metallization masks for customizing the gate-array chip.

Example of the custom design approach. Custom IC design is the backbone of the industry. All standard products, such as memories, microprocessors, and controllers, are custom designed and directed toward high-volume parts. Another interesting approach to custom design work is being carried on in the university research community for small-volume applications. Approximately 20 institutions are teaching design courses and fabricating student projects through the silicon foundry. The approach teaches the basic concepts of NMOS circuits by having the students learn through actual design experience. The procedure entails selecting some interesting function and then trying to design the circuit and generate its layout before the submittal deadline. Examples of circuits designed and implemented through this procedure are

* A geometry engine (special-purpose chips to implement a high-performance graphics terminal) by Jim Clark of Stanford University in Palo Alto, California;
* Speech processing chips for speech analysis and synthesis by Dick Lyons of Xerox PARC in Palo Alto, California;
* A high-speed multiplier by Steven Danielson of Boeing Aerospace in Seattle, Washington; and
* The OINC2, an 8-bit processor, by Dave Johannsen of the California Institute of Technology in Pasadena, California.

Over 200 projects have been designed and fabricated during the past year.

The custom designer uses the computer-based design tools available at his location. Since this activity is usually tied in with research at the various universities, the designer can also be the developer of the tools. The usual method of design is to describe the function by using a logic diagram or an electrical circuit. From this representation, the designer can use either a timing simulator or a circuit simulator to see if the design functions properly. When he feels confident that his design will work, he can then proceed with the layout. There are several techniques used for this procedure, all of which have to produce the layout in a notation called Caltech Intermediate Form, or CIF, as shown in Figure 18. The layout geometry must be described in CIF to standardize the interface to the manufacturer. There is a silicon broker who will accept this form of input to make masks and to fabricate and package the chips.

Figure 18. CIF design approach for custom ICs.
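To suggest what such a CIF description looks like, here is a hedged sketch that emits a tiny symbol definition. The layer abbreviations (NM for metal, NP for polysilicon) and the command forms follow the nMOS conventions popularized by Mead and Conway as best I recall them; treat the exact syntax and units as assumptions to be checked against the CIF definition.

```python
# Minimal sketch of generating CIF text for a toy cell. Assumed command forms:
# "L layer;" selects a layer, "B length width cx cy;" emits a box, and a
# DS ... DF pair wraps a symbol definition that can later be instantiated.

def cif_box(layer, length, width, cx, cy):
    return f"L {layer};\nB {length} {width} {cx} {cy};"

def cif_symbol(number, boxes):
    """Wrap a list of (layer, length, width, cx, cy) boxes in a symbol definition."""
    body = "\n".join(cif_box(*b) for b in boxes)
    return f"DS {number} 1 1;\n{body}\nDF;"

# A toy cell: a strip of metal crossing a strip of polysilicon, then one instance.
cell = cif_symbol(1, [("NM", 40, 8, 20, 4), ("NP", 8, 40, 20, 4)])
print(cell)
print("C 1;")   # call (instantiate) symbol 1
print("E")      # end of the CIF file
```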
Methods used to produce layouts can be described generically as standard-cell place and route, interactive graphics layout, and artwork embedded in a language. Standard-cell place and route is based on having a library of cells at the designer's disposal. He selects the cells needed to perform the function, places them in an array, and routes the interconnections. An example of standard-cell place and route is the final placement of I/O buffers and pads, a common occurrence at Caltech. The designer usually spends most of his time working on the circuit layout, leaving very little time for the design of the I/O buffers and pads. Fortunately, Dave Johannsen has a tool that helps in this situation. This tool is part of Bristle Blocks, a silicon compiler.22 Many different types of I/O circuits and pads are stored in a software library. The designer types in the kinds of I/O pads he needs and the coordinates of the connection points in his circuit. The tool then generates the layouts for the pads and interconnects them all automatically.

The second design method uses interactive graphics for laying out the circuit. Several turnkey interactive graphics systems are available today, the most notable being those offered by Calma, Applicon, and Computervision. These systems allow a designer to draw the layout patterns directly on a CRT display, the most common technique used in industry today. Universities, on the other hand, are primarily research communities and are trying to find the optimum method of using graphics. For this reason, most universities use their own home-built graphics systems. Several of these systems include a Sticks editor. In the Sticks method of IC design, the designer can sketch the structure of his circuit and have the computer fill in the details of the layout. The machine sizes and shapes the elements and maintains minimum spacing. A Sticks drawing is like a circuit schematic, using lines for interconnect wiring and symbols for transistors.

The third method of chip layout generation is to use an embedded language. Graphic primitives such as boxes, polygons, and wires are programmed into the native language. Using this technique, the designer describes his layout in terms of these geometric primitives, inputting cells into the system by writing text that describes the geometry. The power of this approach lies in the fact that the cell layouts are actually computer algorithms. Consequently, the designer can use the full capability of the computer to do things like automatic synthesis of layouts for special functions from a set of input parameters. For example, there are programs that generate PLAs directly from the boolean expressions that the designer wishes to implement. The synthesis program can read in these logic equations; determine the number of inputs, outputs, and minterms and the locations of the PLA personalization transistors; and automatically draw the layout. Several versions of these programs exist at many of the universities. This technique can also lead to the development of other synthesis tools, such as silicon compilers like Bristle Blocks.22
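The sketch below illustrates the front end of such a PLA generator: from sum-of-products equations it derives the personalization of the AND plane (which true or complemented input each product term uses) and of the OR plane (which outputs each term feeds). It is an illustrative reconstruction of the idea, not any particular university program, and drawing the actual layout is omitted.

```python
# Hedged sketch: derive PLA personalization matrices from logic equations.
# equations: {output_name: [ {input_name: 0 or 1, ...}, ... ]}, one dict per product term.

def pla_personalization(inputs, outputs, equations):
    terms = []                                   # unique product terms, in order of first use
    for prods in equations.values():
        for p in prods:
            if p not in terms:
                terms.append(p)
    # AND plane: '1' = true input, '0' = complemented input, '-' = input not used.
    and_plane = [["1" if t.get(i) == 1 else "0" if t.get(i) == 0 else "-" for i in inputs]
                 for t in terms]
    # OR plane: 'x' marks a personalization transistor tying a product term to an output.
    or_plane = [["x" if t in equations[o] else "." for o in outputs] for t in terms]
    return terms, and_plane, or_plane

# Example: carry = a·b + a·c and f = a·b, over inputs a, b, c.
ins, outs = ["a", "b", "c"], ["carry", "f"]
eqs = {"carry": [{"a": 1, "b": 1}, {"a": 1, "c": 1}], "f": [{"a": 1, "b": 1}]}
terms, and_p, or_p = pla_personalization(ins, outs, eqs)
for a_row, o_row in zip(and_p, or_p):
    print("".join(a_row), "|", "".join(o_row))   # prints "11- | xx" and "1-1 | x."
```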
When the designer is satisfied with his layout, he can submit the CIF file to a broker for fabrication. Currently, DARPA and NSF are supporting the Information Sciences Institute in Marina del Rey, California, to provide this service for organizations that have certain DARPA or NSF contracts. This service, called MOSIS, provides request, acknowledgment, and progress-reporting services via the Arpanet communications network. The procedure for using this service is as follows. The designer wishing to get his circuit fabricated requests the service of MOSIS by sending a request message in a specified format. After MOSIS responds with an acceptance, the designer forwards his CIF file. The MOSIS service then checks his file to determine whether the syntax is proper. If it is, the designer receives a message saying that his CIF has been accepted. The message will include the current schedule for fabrication.

MOSIS collects CIF files from many designers before it produces a layout for a series of dies. Each die usually holds several different project designs. A wafer mask will have approximately half a dozen different dies containing up to 100 different projects. After the CIF files are composed into a chip layout, a conversion program changes the CIF into a data format for an electron-beam mask generator. A mask house is contracted to build this tooling. Then a fabrication shop, or silicon foundry, manufactures the wafers.

The MOSIS service includes several standard test circuit structures in the die layout. The fabrication house can use these structures to check its process yield. The use of standard test circuits obviates the need for the fabricator to test the circuits designed by an unknown designer. Thus, the responsibility for testing an individual design can be placed where it belongs: in the hands of the designer.

After fabrication, the circuits are visually inspected, and the dies are scribed and broken apart. Each die is mounted in a standard chip carrier, and the pads are bonded. Then the chip is returned to the designer with documentation on the pinouts. During the fabrication process, the designer can check with MOSIS via the Arpanet and receive status information on the progress of the lot as well as information about process yield and electrical characteristics. He can use this information to schedule and plan his testing.

Future projections

The future will bring higher density and greater functionality to electronic circuits. Moore3 and Keyes23 have predicted how many circuits per square centimeter can be constructed in the future. Their projections indicate that future building blocks will definitely be more complex and highly functional. Two major problems standing in the path of this progress are the issue of product definition and the increasingly manpower-intensive design process.

The product definition crisis3 is related to the need for high volumes of parts. To remain efficient, processing lines must continuously fabricate large numbers of circuits. In addition, the financial return of IC manufacturing is based on the economics of large numbers. Demand increases with low cost, and large volumes of devices make it possible to realize low manufacturing costs. For these reasons, the ideal product is one that can be used by a large market. Two types of products fit this description: a single product whose function is required by many similar products, such as controllers for appliances or automobile engines; and a general-purpose product that can be tailored by the user, such as microprocessors and memories. For these products, material costs are greater than the one-time costs of design and capital expenditures.

What are the important issues already enumerated?

* The need for a high volume of parts to effectively utilize the processing line, and
* The exponentially increasing complexity, which is reflected in increased design costs.
What are the future scenarios that result?

* Continue to produce today's building blocks, but with a size reduction that results in better performance, higher yields, and lower costs;
* Produce general-purpose building blocks with more functionality, for example, the equivalent of a mainframe computer on a chip; and
* Produce many special-purpose blocks as design talent and the general understanding of VLSI expand.

The first scenario will happen almost immediately. The other two will be paced by the speed of CAD/DA improvements.

As the trend toward making circuits smaller continues, existing products can be improved by reducing the die size. A good example is Motorola's 6800 microprocessor. In 1972, its die size was 46,000 square mils. Today, its size is 15,000 square mils.

The need to define a general-purpose product is a requirement for manufacturers dedicated to producing "jelly-bean" commodity products. They will need to adopt design strategies that enable them to continue to pace the field. One problem that may lessen the demand for newer processors is the rising cost of software development. As the power of the system increases, the selected applications may become very complex, and the software problem assumes great proportions. The issue then becomes one of the cost of software to specialize the general-purpose system versus the cost of designing special-purpose chips to perform the function.

Will something happen in CAD/DA to break the VLSI design barrier? Many approaches will be taken, and some will definitely change designer productivity. One possibility is the use of standard cell libraries. The standard cell approach has usually fallen short of expectations because of the lag between creating the cell libraries and updating them; the technology of the original cell has usually changed before it is used again. Some progress has been made by the microprocessor people as they move their designs from MOS to the scaled HMOS. The size shrink is done by photo reduction of the masks, and the smaller circuit size allows more function to be added in the extra space. Examples are the single-chip microcomputers. However, the real breakthrough will occur when technology-independent methods of representing cells are discovered. This approach would be analogous to the use of intermediate languages in software compilers.

Another currently important area of work is in silicon compilers.24 They may be having the greatest effect so far on designer productivity. The fundamental idea of the compiler is to remove the tedious and error-prone tasks of creating a chip design by letting the machine assemble the cells into a chip. A silicon compiler is similar to a software compiler in the sense that both take a high-level description and convert it to a working entity. Johannsen's compiler22 can build the masks for a bit-slice architecture chip based on input microcode. The underlying mechanism of the compiler involves describing the cells in a computer language (as programs), allowing a computer to operate on them, and placing them in a chip layout.
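A toy sketch of that cells-as-programs idea follows: each library cell is a small procedure that returns its geometry, and the "compiler" composes a datapath by calling the procedures and abutting the results. The cell names, widths, and interface are hypothetical placeholders, not taken from Bristle Blocks or any other real compiler.

```python
# Toy illustration of a silicon compiler's composition step: cells are programs
# that return geometry, and the compiler places them by abutment. All names and
# dimensions are invented for this example.

def register_cell(bits):
    return {"name": f"reg{bits}", "width": 12 * bits, "height": 60}

def alu_cell(bits):
    return {"name": f"alu{bits}", "width": 20 * bits, "height": 60}

def compile_datapath(spec):
    """spec: ordered list of (cell_generator, argument) pairs for one bit-slice stack."""
    placed, x = [], 0
    for make_cell, arg in spec:
        cell = make_cell(arg)
        placed.append({**cell, "x": x, "y": 0})   # abut cells left to right
        x += cell["width"]
    return placed, x

layout, total_width = compile_datapath([(register_cell, 8), (alu_cell, 8), (register_cell, 8)])
for c in layout:
    print(f'{c["name"]:>5} at x={c["x"]:>3}, width={c["width"]}')
print("datapath width:", total_width)
```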
In summary, for the last two scenarios to be accurate predictions, two types of CAD/DA systems must appear. First will come an integrated hardware-software system dedicated to and developed by the designers of the VLSI era. Such a system will support structured top-down design, providing design management, simulation, test generation, automatic layout, and design consistency verification. This system will be used principally for VLSI, standard general-purpose parts, and master-slice designs. The second CAD/DA system will support the end user, who will obtain his parts from the silicon foundry. This system will have access to computerized catalogs of standard and programmable IC components. It will also have the capability of transmitting the system design to the IC manufacturer. The system designer could then rapidly develop custom ICs.

Continuing VLSI efforts have focused on some of the anticipated problems, but the number and scope of these problems make the past development of LSI seem almost trivial in comparison.
The design of VLSI building blocks may prove to be the great challenge of the 1980's.

Acknowledgment

I wish to thank Professors Carver Mead and John Gray of the California Institute of Technology for their encouragement, helpful insight, and inspiration. Special thanks go to Dave Johannsen of Caltech, Jeff Sondeen of Hewlett-Packard, and Fred Malver of Honeywell for their constructive criticism and help with the preparation of this article.

References

1. E. Braun and S. MacDonald, Revolution in Miniature, Cambridge University Press, London, 1978.
2. "History of the Semiconductor Industry," Circuit News, Newsprint, Inc., Jericho, N.Y., 1979.
3. G. E. Moore, "Are We Really Ready for VLSI?" Proc. Caltech Conf. VLSI, Jan. 1979, pp. 3-14.
4. G. Zimmer, B. Hoefflinger, and J. Schneider, "A Fully Implanted NMOS, CMOS, Bipolar Technology for VLSI of Analog-Digital Signals," IEEE Trans. Electron Devices, Vol. ED-26, No. 4, Apr. 1979, pp. 390-395.
5. H. H. Berger and K. Helwig, "An Investigation of the Intrinsic Delay (Speed Limit) in MTL/I2L," IEEE Trans. Electron Devices, Vol. ED-26, No. 4, Apr. 1979, pp. 405-407.
6. S. A. Evans, "Scaling I2L for VLSI," IEEE Trans. Electron Devices, Vol. ED-26, No. 4, Apr. 1979, pp. 396-404.
7. J. Lohstroh, "ISL: A Fast and Dense Low-Power Logic Made in a Standard Schottky Process," IEEE J. Solid-State Circuits, Vol. SC-14, No. 3, June 1979, pp. 585-590.
8. A. Bahraman, S. Y. S. Chang, D. E. Romeo, and K. K. Schuegraf, "Schottky-Base I2L: A High-Performance LSI Technology," IEEE J. Solid-State Circuits, Vol. SC-14, No. 3, June 1979, pp. 578-584.
9. A. Mohsen, "Device and Circuit Design for VLSI," Proc. Caltech Conf. VLSI, Jan. 1979, pp. 31-54.
10. G. Mhatre, "Micro Peripheral Chips Control Floating-Point Math Operations," Electronic Eng. Times, Mar. 31, 1980, p. 22.
11. S. Bal, E. Burdick, R. Barth, and D. Bodine, "System Capabilities Get a Boost From a High-Powered Dedicated Slave," Electronic Design, Vol. 28, No. 5, Mar. 1, 1980, pp. 77-82.
12. G. Mhatre, "Telecommunications ICs: Codecs Still in the Lead," Electronic Eng. Times, Mar. 17, 1980, pp. 57-66.
13. M. Grossman, "Dedicated LSI Chips Come Out as Costs Drop," Electronic Design, Vol. 28, No. 4, Feb. 15, 1980, pp. 63-65.
14. P. W. Cook, C. W. Chung, and S. E. Schuster, "A Study in the Use of PLA-Based Macros," IEEE J. Solid-State Circuits, Vol. SC-14, No. 5, Oct. 1979, pp. 833-840.
15. P. W. Cook, S. E. Schuster, J. T. Parrish, V. DiLonardo, and D. R. Freedman, "1-μm MOSFET VLSI Technology: Part III, Logic Circuit Design Methodology and Applications," IEEE Trans. Electron Devices, Vol. ED-26, No. 4, Apr. 1979, pp. 333-346.
16. K. W. Lallier and R. K. Jackson, "A New Circuit Placement Program for FET Chips," Proc. 16th Design Automation Conf., June 1979, pp. 109-113.*
17. C. Mead and L. Conway, Introduction to VLSI Systems, Addison-Wesley, Menlo Park, Calif., 1980.
18. C. A. Mead, "VLSI and Technological Innovation," Proc. Caltech Conf. VLSI, Jan. 1979, pp. 15-27.
19. M. J. Foster and H. T. Kung, "The Design of Special-Purpose VLSI Chips," Computer, Vol. 13, No. 1, Jan. 1980, pp. 26-40.
20. T. R. Blakeslee, Digital Design with Standard MSI and LSI, John Wiley & Sons, New York, 1975.
21. L. Conway, A. Bell, and M. Newell, "MPC79," Lambda, Second Quarter 1980, pp. 10-19.
22. D. Johannsen, "Bristle Blocks: A Silicon Compiler," Proc. 16th Design Automation Conf., June 1979, pp. 310-313.*
23. R. W. Keyes, "The Evolution of Digital Electronics Towards VLSI," IEEE Trans. Electron Devices, Vol. ED-26, No. 4, Apr. 1979, pp. 271-279.
24. J. P. Gray, "Introduction to Silicon Compilation," Proc. 16th Design Automation Conf., June 1979, pp. 305-306.*

*These proceedings are available from the Order Desk, IEEE Computer Society, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720.

James R. Tobias is a staff design automation engineer and manager of advanced design automation and systems planning at the Solid-State Electronics Division of Honeywell, Inc. During 1979 and 1980, he participated in the Silicon Structures Program at the California Institute of Technology as Honeywell's representative, conducting research in improving computer aids for designing VLSI circuits. Earlier he was the leader of a corporate-level microsystems group whose responsibility was ensuring that Honeywell's divisions used microcomputers effectively. Tobias has written over 25 papers on various aspects of computer-aided design and software development techniques. He holds three patents for energy-saving controls. He is a member of the IEEE, the IEEE Computer Society, ASME, and ASHRAE. He received his BS, MS, and PhD from the University of Minnesota in 1962, 1963, and 1970.