PREDICTION OF PARAMETRIC BASED FINAL TEST FALLOUT USING NEIGHBORHOOD ANALYSIS

by

SUZY M. BROWN, B.S.

A THESIS IN ELECTRICAL ENGINEERING

Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE IN ELECTRICAL ENGINEERING

Approved
Chairperson of the Committee

Accepted
Dean of the Graduate School

May, 2004

ACKNOWLEDGMENTS

I would like to take the opportunity to thank the many people involved with the successful preparation and completion of this thesis, and for their support during my academic career.

To my academic advisors and professors at TWU and TTU, I am very appreciative of all your help and guidance throughout my undergraduate and graduate careers. Specifically, I would like to thank Dr. Parten, Dr. Cox, Dr. Edwards, and Dr. Thompson.

I am gratefully indebted to my industry sponsor, who helped me acquire the skills and material necessary to complete this thesis. Specifically, I would like to thank C. Hu, M. Chang, and B. Campbell for supplying the assistance and tools necessary to get my work done. I would like to thank the entire group for helping me to have a successful internship. Also, I would like to thank my industry sponsor for the financial support necessary to complete my degree.

I want to thank my family and friends for their support, both emotionally and financially. Thank you for your understanding, patience, and guidance.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
    The Need for Reliable Semiconductor Products
    Outline of Chapters
II. TESTING MICROPROCESSORS
    Device Fabrication
    Defect Mechanisms and Types
    The Need for Test
    Testing
    Components of Testing
    Testing Global and Local Defects
    Test Considerations
III. DEVELOPMENT OF METHODS FOR NEIGHBORHOOD ANALYSIS
    Burn-In Reduction
    Industry Evaluations and Methods for Burn-In Reduction
    Neighborhood Analysis Methods
    Deriving Yield Numbers
IV. NEIGHBORHOOD ANALYSIS RESULTS
    Introduction
    Neighborhood Analysis for Local Yield
    Neighborhood Analysis Using Averaging Methods
    Neighborhood Analysis Using the Geometric Mean
    Large Die Size Consideration
V. NEIGHBORHOOD ANALYSIS CONCLUSIONS
REFERENCES

ABSTRACT

It is desired to decrease the time and money devoted to burn-in and to eliminate unnecessary processing of defective integrated circuits. Burn-in is a reliability screen used to isolate poorly performing integrated circuits before they complete testing and are shipped. A method is used to attempt to predict, at an earlier stage in fabrication, which semiconductor devices will pass and which will fail at final test. Devices with a passing prediction may qualify for reduced burn-in. This method, neighborhood analysis, uses the neighboring die on the semiconductor wafer on which a device was made to predict whether the device will pass or fail at final test, before it reaches burn-in. The correlation of final test failures against good units is investigated to identify whether a strong enough trend exists to support early scrapping of material or a possible modification of the burn-in specification.
LIST OF FIGURES

2.1 Test Flow for IC
2.2 Bathtub Curve
2.3 Three-Ring Oscillator
3.1 Neighbor Definitions
3.2 Methodology for Neighborhood Analysis
3.3 Height and Width Measurements for a Die
3.4 Orientation of Neighboring Die to Reference Die
3.5 Orientation and Values for Normalized Example
3.6 Reference Die with Twenty-Four Nearest Neighbors
3.7 Radial Distances (mm) and Groups
3.8 Edge Die Consideration
4.1 Charts with Resulting Data Shown in Chapter IV
4.2 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Laser
4.3 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe
4.4 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Laser
4.5 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe
4.6 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Laser
4.7 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe
4.8 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Laser
4.9 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe
4.10 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, XY Yield at Laser
4.11 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, XY Yield at Multiprobe
4.12 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, Radial at Laser
4.13 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, Radial at Multiprobe
4.14 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Laser
4.15 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Multiprobe
4.16 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Laser
4.17 Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Multiprobe

CHAPTER I
INTRODUCTION

The Need for Reliable Semiconductor Products

The semiconductor industry is a fast-moving business that is impacting the world in diverse and ever-changing ways. From highly innovative communication tools to faster and extremely efficient computers, the growing industry faces new challenges every day. The information age is possible due to the invention of the integrated circuit by Jack Kilby in 1959 [7]. Building on this invention, circuits today contain millions of transistors to power the high-speed computers and large-capacity memories of modern technology.

Highly competitive companies push to get new products and solutions to market first and work to build recognition as producers of efficient and reliable products. In order to lead the market and maximize profit, companies must build a positive reputation with customers. If a company sells many failing devices, or devices that fail too early in life, it will not remain in business for very long. The process of building integrated circuits must be robust and the time to market must be short, all while maintaining quality. In order to maintain quality and reliability, the need for test emerges.
Before any product can be sold, it must be tested with a precise and rigorous method to ensure it will meet the customer's needs. Each device begins testing at an early stage, when only a few layers of the product have been built. This testing continues while the devices, or die, are in wafer form, that is, while each die remains uncut on a wafer. A wafer is a thin slice of a semiconductor substrate on which fabrication of an integrated circuit is performed. Wafer level tests consist of parametric laser probe and multiprobe. The testing process continues after the device is built and packaged, which is called final test. Testing provides valuable information about each device and shows whether the device is good or whether inconsistencies or problems within the process are causing the product to fail. Testing can also find process marginalities that might make devices unreliable over time or might indicate where improvement is needed in fabrication. The need for testing is therefore imperative, and it requires constant innovation and test program maintenance in order to continue selling reliable products.

One way to ensure reliability is to burn in the device while at final test. Burn-in is where a device is subjected to higher than normal voltages and temperatures for a predetermined amount of time in order to speed up the early failure rate period, before the part is shipped to the customer. After completion of burn-in, failures should be significantly reduced if not eliminated completely. If failures still take place after burn-in, this represents a reliability issue within the process or design. Since burn-in represents the early failure rate period of a particular device, the devices would not be suitable for shipment without it, unless it is proven that the devices do not require it. Burn-in is an expensive and time-consuming measure. There is a constant desire within the semiconductor industry to decrease time at burn-in or delete burn-in altogether in order to maximize profit by cutting costs.
Test reduction methods are of high importance within any semiconductor company. Cutting costs anywhere in the fabrication process, packaging, or testing is desirable in order to maximize profit. As previously mentioned, one way to do this is by reducing or deleting burn-in. Several approaches can be used to accomplish burn-in reduction. One such approach, as outlined in this thesis, is neighborhood analysis of devices at wafer level.

Current methods of sorting at the wafer level include scrapping faulty wafers based on overall wafer yield and continuing to process good wafers. Using this method, a wafer is divided into smaller sections called zones. The yield of the zones on the wafer determines whether the wafer is passed or scrapped. If a high percentage of the zones yield poorly, the whole wafer is scrapped due to reliability concerns. This percentage is predetermined by the engineers working on the device. The idea is that if too many zones on a wafer are low yielding, the entire wafer is suspect and it is best to discard, or scrap, the entire wafer. If a high percentage of the zones yield well, the wafer is processed minus the failing die.

This thesis considers die level sorting, where only individual die are scrapped, based on their performance and the performance of their neighbors, rather than entire wafers. Neighbors are those die on the wafer that, due to their proximity or relationship to a reference die, may affect its yield. A reference die is a device on a wafer that will be assigned a predicted yield. Die level sorting may avoid scrapping low yielding wafers that have high yielding regions with good die that might be unaffected by the low yielding portion. By using die level sorting and isolating die with a high predicted yield based on local yield and other methods, burn-in can possibly be reduced for those die. Local yield is the yield of a die based on the performance of immediately neighboring die at wafer level tests.
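As a rough illustration of the local yield idea described above, the following Python sketch computes the fraction of passing die among the 8 nearest neighbors of a reference die. It is a minimal sketch under stated assumptions, not the method developed in Chapter III: the wafer map layout (a dictionary keyed by grid position), the function name `local_yield`, and the `missing_as_fail` switch are hypothetical. The switch mirrors the two edge treatments named in the List of Figures (missing edge die counted as 0, or missing edge die omitted).

```python
from typing import Dict, Optional, Tuple

Die = Tuple[int, int]  # hypothetical (x, y) grid position of a die on the wafer

# Offsets of the 8 die that immediately surround a reference die.
NEIGHBOR_OFFSETS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def local_yield(
    wafer_map: Dict[Die, bool], ref: Die, missing_as_fail: bool = True
) -> Optional[float]:
    """Return the fraction of passing die among the 8 nearest neighbors of ref.

    wafer_map maps each die position to True (pass) or False (fail) at a
    wafer level test. Positions absent from the map are off the wafer.
    Assumption: with missing_as_fail=True an absent neighbor counts as a
    failing die; with missing_as_fail=False it is left out of the average.
    """
    x, y = ref
    passed = 0
    counted = 0
    for dx, dy in NEIGHBOR_OFFSETS:
        neighbor = (x + dx, y + dy)
        if neighbor in wafer_map:
            counted += 1
            passed += int(wafer_map[neighbor])
        elif missing_as_fail:
            counted += 1  # missing edge die contributes a 0 to the average
    if counted == 0:
        return None  # no neighbors available to form a prediction
    return passed / counted


# Toy 3x3 wafer map in which every die passes except the one at (1, 0).
wafer = {(x, y): True for x in range(3) for y in range(3)}
wafer[(1, 0)] = False

center = local_yield(wafer, (1, 1))                           # 7 of 8 pass
corner_zero = local_yield(wafer, (0, 0))                      # 2 of 8 pass
corner_omit = local_yield(wafer, (0, 0), missing_as_fail=False)  # 2 of 3 pass
```

The corner die shows why the edge treatment matters: counting missing neighbors as failures pulls its local yield down to 0.25, while omitting them gives about 0.67 from the same test data.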
Other methods, described later, are also used in this project to determine predicted yields. Those die with high local yield may be considered for reduced burn-in due to increased confidence that they will pass at final test based on the yield of their neighbors. Burn-in may be completed at full duration for those die with poor local yield. Reference die with low predicted yields may be considered for scrapping before they are processed further and packaged.

Outline of Chapters

Chapter II presents more information about testing semiconductors and the importance of burn-in reduction. Chapter III describes the method used in this thesis to determine whether burn-in reduction is possible utilizing neighborhood analysis. Chapter IV shows the results of carrying out the method described in Chapter III. Chapter V concludes and discusses whether predicting final test yield based on neighborhood analysis of a reference die at wafer level test helps in understanding whether burn-in can be reduced.

CHAPTER II
TESTING MICROPROCESSORS

Device Fabrication

Semiconductor devices are built as integrated circuits, or ICs. ICs are microelectronic circuits that are incorporated into a chip of a semiconductor, which is usually silicon. These circuits are composed of many interconnected transistors and other components that operate as a whole system [12]. The fabrication of an IC involves several photolithographic printing steps, etching, and doping. A comprehensive introduction to device fabrication can be found in Microchip Manufacturing, by S. Wolf [2].

Defect Mechanisms and Types

IC structures are prone to a variety of nonideal physical characteristics that are not always under the manufacturer's control. These defect mechanisms can be grouped by the origin of the defect: wafer defects, human errors, equipment failure, environmental impact, and process instabilities [5].
Wafer defects occur within the bulk material that forms the basis for semiconductor devices, which is usually silicon. Silicon can become contaminated easily and can form micro-cracks. These problems can cause a shift in parametric properties, which can affect the performance of the IC or of the elements that are arranged in the affected area [5].

Human errors include pollution in the air (e.g., skin cells), scratches on the wafer due to careless handling, and process steps that have been forgotten, or repeated because a step was not noted the first time it was taken. Every possible effort can be made to automate any steps that can be computer controlled or otherwise handled without human intervention [5]. This can eliminate or cut down on human mistakes, which cannot be prevented otherwise.

Equipment failure includes air contamination of the equipment and equipment that has not been tuned properly. Un-tuned equipment can lead to mishandling of wafers, which can lead to mass failures. The solution to this problem is strict maintenance planning, where the equipment is routinely checked and calibrated [5].

Environmental impact includes the air inside the fabrication facility. If the air contains contaminants that are the same size as or bigger than the minimum feature size of the technology being processed, and any of these particles land on the wafer surface, this can lead to functional problems [5]. This is why clean rooms in fabrication facilities are kept very clean and monitored closely. The degree of cleanliness is specified as how many particles are allowed to exist per unit volume of clean room air. A Class-X clean room contains no more than X total particles greater than 0.5 µm in size per cubic foot of air. So, a Class-1 clean room is allowed to have one particle greater than 0.5 µm in size per cubic foot of air [2]. Clean room protocols are designed to decrease the number of particle defects on a wafer.
These include gowning of human workers with a body suit, gloves, hair nets, inner caps, hood, and booties to ensure as much cleanliness as possible. Other protocols include refraining from sudden movement, running, or jumping, the absence of chairs, which can collect particles, and the use of fiber-less paper. These protocols have been developed to keep particles from moving through the air and to prevent them from contaminating wafers [2].

Process instabilities are those defects that are caused by certain process steps. Some of the steps applied are critical and susceptible to variations in process conditions, which can lead to a variety of process instabilities. Process instabilities can include problems with doping profiles, photolithographic printing, etching, wafer surface inconsistencies, metal lines, and chemical vapor deposition (CVD).

Doping profiles using N and P regions are not readily distinguishable using characterization tools. Doping can serve several purposes, the main one being the formation of the source and drain regions for the transistors. These regions serve as the paths of current flow in the silicon through the metal interconnect lines and the channels of the transistors [2]. There are two types of dopants, N and P types. Current flow is possible through doping due to the nature of the materials. A full account of how N and P type dopants are used in an IC can be found in Microchip Manufacturing, by S. Wolf [2]. After the process steps are taken to introduce dopants into the silicon, there is no way to determine exactly where the dopant profiles exist. Without knowing where the profile boundaries exist, the process engineer cannot guarantee an exact process. However, doping profiles do have an effect on device characteristics. Doping errors can cause large DC offsets, deformations, and other effects that can cause performance problems [1].

The photolithographic printing process is another fragile process and is always subject to imperfections.
It can cause failures or inconsistencies in performance from one IC to the next of the same process. ICs with extremely small features are very sensitive to any changes or inconsistencies in the process [5]. A misaligned photolithography mask can cause partial defects and high contact resistance. This can lead to minor DC offset problems and catastrophic distortion problems. A misaligned mask can also lead to completely defective vias, which can create a completely open circuit [1].

Fabrication problems can arise during the etching processes of a device. Etching usually occurs after photolithography in order to create a permanent pattern on the wafer, creating an exact transfer of the image present in the resist layer. If a wafer is under-etched, defective metal contacts and vias can result. Under-etching can also lead to catastrophic shorts between circuit nodes [1].

Surface defects can arise due to mishandling of wafers through human or equipment error, or due to particles in the air. Particles can land on the surface of the wafer or on a photolithography mask. This can lead to short circuits between nodes or other catastrophic consequences. Surface defects also include scratches, broken bond wires, and surface explosions caused by electrostatic discharge (ESD) in a mishandled device [1]. Surface defects can be avoided by taking special precautions during the fabrication of devices.

Metal lines can cause process issues. The process necessary to form metal lines can create imperfect lines that are not rounded. These imperfect lines can lead to parasitic capacitance between traces and surrounding elements, catastrophic shorts, and catastrophic opens. As technologies become increasingly scaled, this problem can become more apparent. Imperfections in metal and performance sensitivities only become more exaggerated as device geometries shrink in size [1].

Another important process to consider is chemical vapor deposition, or CVD.
During a CVD process, a material is locally deposited on a wafer to complete a particular step. Locally deposited materials and variations in layer thicknesses depend mostly on the constant flow of the gas used. If obstacles are present in the gas flow or around the injection point, a turbulent flow can occur. This can result in major or minor thickness variations. Variations in thickness, no matter the degree, can result in parametric differences across the wafer [5]. It is important to keep properties constant across wafers within a single lot and across several lots.

There are two classes of defect mechanisms worth noting: global and local defects. Global defects are those defect types that affect a large area, or possibly an entire wafer. The wafer is influenced in the same way throughout the affected area. Local defects affect a smaller portion of the wafer. It is important to note, however, that there are other regional classifications when referring to wafers. Zonal areas are subsections, or smaller portions, of a global area. A neighborhood is an area smaller than a zone. Finally, a local area is smaller than a neighborhood.

An example of a global defect is mask misalignment. A misaligned mask will be misaligned everywhere on the wafer and can cause different parametric behaviors. Another example is line registration errors, where, due to changes in etching times, lines are too wide or too narrow. This also affects the entire wafer, as etching profiles are not achieved locally but for the entire wafer. Different implantation levels are also considered global defects. Different levels of implantation can cause a shift in transistor parameters between wafers in a lot, and between lots within a revision of silicon [5].

Local defects include those defects that impact a smaller area on a wafer.
In a photolithographic process, dust particles or chemical pollution on a wafer or mask can affect a smaller portion of a wafer and cause local fallout. Scratches and cracks due to human error, or due to mishandling by un-tuned robotic machinery, can also be classified as local defects, as they do not affect the entire wafer's performance [5].

There are two types of faults to consider when looking at either global or local defects caused by any of the above-mentioned mechanisms. These faults are parametric and functional [5].

Parametric faults are determined by in-line parametric tests performed after the first level of metallization, and again at final parametric test. These tests are in place to eliminate further processing of wafers that fail in-line parametric tests after metal layers are deposited. Scrapping of failing wafers at this point is justified because the chance of finding passing parts from wafers that fail these tests is small or nonexistent. Parametric data can also provide warnings about fabrication problems. The data obtained can indicate how the wafers were processed and where problems may exist within the process flow. Correlations can be developed between the different parameters and the final product specifications. This can aid in the optimization of fabrication conditions [2].

A parametric fault could be a reduced threshold voltage or increased resistance of connections. Parametric faults can be caused by either global or local defects. Most global defects result in parametric faults. Specific parametric tests are necessary because the kinds of faults that cause parametric failures may cause a device to fail only performance-related specifications that are not always measured [5].

Functional faults are catastrophic failures in the behavior of the device.
These faults can vary from a logic failure of an output under a specified limited input condition to completely incorrect operation of the device regardless of the signals applied. For example, an output connected through a short to a supply line does not change regardless of the input applied. This would indicate complete failure of the device [5].

The Need for Test

The problems that arise during the processing of semiconductor devices can lead to several types of defects. Any of the above-mentioned problems can occur, as well as several other process complications, at any step in the fabrication of devices. The best way to monitor and alleviate process variations is to carefully test each product and make sure it meets product specifications. These defects must be detected through test before the devices are considered to be of good quality and can be shipped to the customer. Test points are in place while the devices remain in wafer form and remain until they are shippable packaged die. The final test points are in place to ensure nothing has changed after the devices have been packaged.

There exists a long feedback loop for semiconductor devices. The entire processing required from silicon wafers to packaged integrated circuits takes about four to six weeks. Problems that occur early in the process could be detected an entire month later [5]. Each IC in a problematic lot could be affected by the same circumstance, and all could be lost. Early fault detection is necessary in order to alleviate this sort of situation. Problems detected early can be solved and used to inform process control decisions. If process problems are detected early enough, failing devices can be scrapped, thus saving the time and money of processing these die any further. Also, identifying those units that are good and that may be eligible for reduced burn-in at final test may help in planning for burn-in oven capacity at a later time.
A steady process is important, and any variations can be observed through testing the product and reporting any yield effects. Through careful monitoring and yield observation, the process and test flows can be maintained properly and fine-tuned.

Testing

The testing flow used on the material utilized in this thesis is described from the first test point until the very last. There are two main categories of test: tests performed while the ICs are in wafer form, and tests performed on packaged devices. Two tests, known as laser and multiprobe, occur while the devices remain in wafer form. At final test, the packaged units undergo pre burn-in, burn-in, and post burn-in. Post burn-in includes two tests at different temperatures and a quality assurance test [7]. The test flow for an IC is shown in Figure 2.1.

At laser test, only a few layers of metal are present on the wafer. Die are tested with SRAM tests, continuity tests, and shorts tests and are binned accordingly. SRAM is static random access memory, and includes a large number of bits that can either function properly or fail. Good die are labeled for continued processing, as are repairable die. Repairable die are those die that contain failing bits of memory and can be repaired by a procedure known as laser repair [7]. The test of interest for this project is done before repair, so only those die that are good or repairable are considered.

[Figure 2.1. Test Flow for IC. Packaged die-level tests: pre burn-in, burn-in, post burn-in.]

After the repairable die are mended, the wafers are completed with all layers of metallization and only have to be sawed and packaged before they undergo final test. At this point, all good die on the wafer, including those die that were able to be repaired, are prepared for multiprobe. A wafer is placed in a holder under a microscope and aligned for testing by a multiple-point probe, or multiprobe. The prober makes contact with the die by way of various pads on the surface.
The electrical properties of the device are now observed by a series of tests. The tests are done automatically and take anywhere from milliseconds to several seconds, depending on the size and complexity of the device. The results taken from these tests are then compared with information stored in the computer that is based on the specification of the device. The computer can then "remember" whether the chip passes or fails each particular test. A failing device is a unit that does not meet the specifications stored in the computer. The probe steps to the next device and completes testing in the same manner for the entire wafer. The wafer is then removed from the multiprobe tester and the individual devices are sawn apart along the scribe lines of the wafer, separating the units. Each passing die is then picked up and prepared for packaging, while the failing die are scrapped, or discarded. Information gained from testing regarding each die is stored for further analysis of failures and passing devices. This stored information can be used for failure analysis and possible process changes as needed [13].

After multiprobe, the devices are packaged and prepared for final test according to the specification for the type of application. Final test log points are in place to guarantee that the performance of the device did not shift during the packaging process [1]. Final test includes pre burn-in, burn-in, and post burn-in [7].

During pre burn-in, all units are tested for continuity and functionality. This is the first time the devices see a major temperature increase, as well as their first exposure to certain test patterns. The devices are voltage-stressed as well to ensure stability. Semiconductor devices are extremely sensitive to temperature and voltage changes, and the process must be robust enough to withstand varied temperature and voltage operation in the end application [7]. Good devices from pre burn-in are sent to burn-in.
Burn-in is where a group of devices is subjected to high temperatures and voltage stress for a predetermined amount of time. The time is determined by an early failure rate (EFR) experiment done every time a new design or process revision takes place. During an EFR experiment, devices are loaded into burn-in ovens and taken out intermittently and tested. The amount of time necessary for failures to cease to occur is considered to be the burn-in time.

Semiconductor devices can be in any one of three stages at different phases of their lives. These phases are the initial failure period, the random failure period, and the wear-out failure period, as shown in Figure 2.2. Burn-in is intended to speed up the initial failure period, or infant mortality, so that early failures occur before shipment and failing devices are not sent to the customer [7]. This is based on the fact that failures that occur after shipping are often the result of processing defects that degrade and eventually fail due to temperature or voltage stress, or a combination of the two. Burn-in is in place in an attempt to catch these early failures before they are shipped. The stress put on a device accelerates any defect so that it can be found during a post burn-in test. This measure will reduce the time to market, increase profit, and reduce customer returns [8].

After burn-in, there should be no more failing devices. This is based on the idea that after the initial failure period, there should only be random failing devices, where the failure rate curve flattens out. Any failures at this point are considered reliability issues and a concern for the process or design of the device, since the devices' infant mortality period has already been simulated through burn-in.

[Figure 2.2. Bathtub Curve (failure rate curve). Axes: failure rate versus time (t). Labeled regions include the random failure period and the wear-out failure period.]

After the random failure period, the wear-out failure period occurs.
During this time, semiconductor devices see the end of their lifetime and eventually wear out. This is due to continued use over time with varying temperature changes [7]. According to recent reliability engineering research, however, failures due to wear-out can possibly be eliminated over a product's anticipated lifetime. So the majority of failures occur during early life, or within the infant mortality phase [3,4].

The lifetime of a semiconductor device can be related to the lifetime of humans. There is an infant mortality period, where babies are more delicate and more susceptible to sudden death or disease. During the middle of one's lifetime there is a lower chance of mortality due to more consistent health and less fragility. At the end of one's lifetime, a human will grow old and once again be weaker and more susceptible to disease and deterioration.

After burn-in, the ICs are tested to make sure all the early failures have occurred and the devices are truly in the random failure rate period. This is done at post burn-in, at both high and low temperatures, in order to investigate temperature stability. Any failures here indicate a reliability issue, since the devices have already gone through a predetermined amount of burn-in, which should signal that they are out of the infant mortality phase. At post burn-in, devices are also speed sorted, where units are grouped according to their performance speed. This is done to ensure devices are sorted within the correct speed range as specified by the customer. Then, a sample of material goes through a quality assurance test under relaxed conditions. This is to make sure the product has maintained quality throughout the test process and to assure that the customer is receiving quality parts in line with the specification [7].

Components of Testing

Each flow of testing has certain components necessary to get products tested efficiently while saving time and maximizing throughput.
The components of an automatic test equipment (ATE) tester are the workstation, the mainframe, and the test head. These components will be described briefly [1].

The test head contains the most sensitive electronics. It contains the circuits that require the closest proximity to the device under test, or DUT. The test head contains the device interface board, or DIB, which forms the electrical interface between the ATE and the DUT. The DIB is custom to the particular device, and provides the temporary socketed electrical connection necessary to test the part [1].

The workstation is the user interface to the tester. From here, the test engineer can debug test programs using the software provided by the ATE vendor. The workstation is very user friendly, and production personnel can control day-to-day operation of the tester as it tests devices under mass production [1].

The last component of testing is the mainframe. The mainframe contains the power supplies, measurement instruments, and one or more computers that control the instruments as the test program is executed. The mainframe may contain a manipulator to position the test head properly in order to calibrate its settings. It may also contain a refrigeration unit which provides cooled liquid to regulate the temperature of the test head electronics [1].

Testing Global and Local Defects

Current test methods for ICs can be divided into two main categories: those that detect global defects and those that detect local defects. Several different structures are used in order to detect these types of defect. These can include process control monitors (PCMs), parameter monitoring, and ring oscillators to test global defects, and in-line parametric monitoring and gate oxide monitors to detect local defects.

Two methods to detect global defects utilize process control monitors, or PCMs, and parameter monitoring. PCMs are specifically designed test modules that are used to measure low-level parameters.
They consist of very basic electrical structures such as single transistors, single lines of conducting material, and chains of via contacts. These structures can be placed on the chip or within the scribe lines, which are located between the chips where they are sawed apart. The process quality can be checked at some stages throughout the process by carrying out in-line measurements on these devices. These devices can contain a set of patterns that are representative of the structures in the actual product and can be used to emulate the performance of the circuit [5].

Parameter monitoring is used to translate low-level parameters into higher-level behavior that can be monitored throughout the process. A performance related test of a parameter monitor design can show the presence of some parametric values. Most global defects cause a parametric failure that can be detected in this way. The ring oscillator is a type of circuit that can be used for this type of monitoring [5]. A ring oscillator consists of an odd number of inverters placed in series, with the output of the last inverter connected back to the input of the first. A simple three-stage ring oscillator is shown in Figure 2.3. The resulting oscillation frequency from the ring oscillator depends on the parametric characteristics of all the components in the circuit, which makes it a good candidate for emulating circuit performance [5].

Figure 2.3. Three-Ring Oscillator.

Local defects can be measured with a variety of methods. Local defects can include particles on the top of wafers that lead to problems with processing and electrical behavior of a product. Local defects can also be scratches or other inconsistencies which affect a portion of a wafer. Defects in the gate oxide and interconnect layers form the vast majority of all defects. These defects can be assessed by use of in-line and gate oxide monitoring [5].

In-line monitoring serves as one of the most important techniques in obtaining high production yield and good product quality.
During various stages of the process, after known sensitive and critical process steps are taken, the outcome is inspected. Two techniques used for this inspection are surfscan and image evaluation. Surfscan is a technique in which a beam of light is applied and the reflections are evaluated in order to determine the number of particles on the surface of a wafer. This technique is mainly used for inspection after layer deposition equipment is used. While surfscan is ideal for unpatterned wafers, image evaluation uses manual or automated image inspection systems in order to check the occurrence of local defects on patterned wafers. By applying this method at critical points in the process, the current status of part of the processing line can be monitored effectively. For example, this technique would be useful after polysilicon crystalline layers are deposited and after deposition of each metal layer [5].

Gate oxide monitors are used in order to investigate the gate oxide integrity and performance. The formation of the gate oxide layer is a very critical and sensitive step, and is susceptible to contamination and process disturbances. The thickness of this layer is the smallest dimension in the complete process for CMOS technologies. If shorts exist between the channel and the gate of the transistor through the gate oxide, parametric and functional faults may result. Simple test structures which form the gate oxide monitors can be used to detect contamination problems of this sort [5].

Test Considerations

In order to fabricate, package, test, and ship parts, and collect revenue from the whole process, costs must be observed in each area. One way to cut down on costs and increase profitability is to observe test economics. Profitability can be defined as the difference between the revenues generated by a company's products and the costs associated with developing, manufacturing, and selling them. One way to increase profitability is to decrease the time to market.
A delay in the time to market will usually result in a substantially lower profit margin over the product's shortened life span. A delay may also result in lost business if a competitor's solution is designed into the customer's system [1].

Another way to monitor costs is to develop an effective yield strategy. Yield is defined as the ratio of good chips per wafer to the total number of chips per wafer. A good chip is one that has passed all the parametric and functional tests that are specified for a product [2].

Test reduction is a very important area to consider when looking to reduce cost. Testing of devices cannot be eliminated completely, as quality must be assessed before shipment to the customer. However, once a device stabilizes into production, test reduction methods can be experimented with and possibly used to reduce test. It would be desirable to implement reliable and efficient testing, while utilizing a method that reduces or eliminates certain aspects of testing. The next chapter will explore the possibility of reducing burn-in, a very expensive part of final test.

CHAPTER III
DEVELOPMENT OF METHODS FOR NEIGHBORHOOD ANALYSIS

Burn-In Reduction

Test is a necessary measure in order to reduce customer returns, increase revenue, and ensure that products shipped to the customers meet specification. Reduction of test is desired, however, in order to maximize profit by decreasing cost and time to market. Burn-in is an effective tool in screening out devices that have low reliability. It is a prime candidate for reduction or elimination, however, because it is expensive due to the high cost of the equipment and labor required to achieve results. Time is another factor, because time to market is delayed when all devices require burn-in. Burn-in is also a destructive test, where failing devices represent revenues that have been lost [8].
One approach currently used in the semiconductor industry to reduce burn-in is to use an early failure rate, or EFR, study. When a new device is ramping to be released into production, has a design revision, or has a fabrication process revision, a sample goes through an EFR. This involves defining a sample of devices, loading them into burn-in, and burning them in for a maximum amount of time at a specified voltage. The maximum time and voltage settings are determined by the specification of the device. During this time, the devices are unloaded and tested at high and low temperatures intermittently throughout the burn-in process, possibly three or four times. During the high and low temperature testing, there will probably be some failing devices. Semiconductor ICs are sensitive to the high temperatures and increased voltage settings of the burn-in ovens, and should reach infant mortality more quickly using this method. The devices will reach a point where they are in the random failure phase and failures should decrease and level out. The time it takes to reach this point is recorded and used for burn-in once the devices reach production. Products sold to the customer should be in the random failure phase at this point, so there should be no additional infant mortalities.

EFR studies are used to determine burn-in for ramping devices and for devices that are in production. This can be a good choice for burn-in reduction as products mature, due to the resolution of design and process marginalities and problems. Also, as test program changes are made and devices are debugged, products tend to improve in quality and have higher yield, thus decreasing infant mortality. As EFR studies continue, burn-in time may see a decrease [7].

Industry Evaluations and Methods for Burn-In Reduction

Other methods have been implemented in order to reduce or eliminate burn-in.
In 1999, Intel investigated the results of multiple correlations between reliability and yield on a die level basis. This work utilized a microprocessor with 0.25 µm technology and a five metal layer CMOS logic process. A one million unit sample size was used. It was found that reliability defect density was proportional to yield defect density [10]. In 2001, Intel explored the optimal methods of measuring sort defect density at the die level. This study used 80 million units with 0.18 µm technology and a 6-layer CMOS logic process. Using unit level predicted yield (ULPY), the company found a strong correlation to burn-in failures [9].

A master's thesis written by K. Black looked at die sorting in conjunction with parametrics in order to achieve lower burn-in. The parametrics used were I-Drive, gate oxide integrity (GOI), and IDDQ characteristics. If the gate width-to-length ratio decreases, the I-drive measurement decreases, as does the speed of the device. This indicates an unstable fabrication process. Low GOI measurements indicate high oxide breakdown time when a high voltage is applied through the gate oxide. This indicates wafer defects such as oxide impurities. Finally, high IDDQ generally means a short exists in the interior of the device. These measurements are taken by applying a voltage and measuring the subsequent current. Since parametrics can be used to discover unstable devices, they can be used in conjunction with die level sorting based on neighboring die to reduce burn-in. A die level sorting algorithm was developed for such devices [6].

Neighborhood Analysis Methods

The application of die level sorting utilizing neighboring die from the wafer level is desirable on wafers with large die. A microprocessor with a larger area than studied previously was used in this study to investigate the possibility of die level sorting based on the yield of neighboring die.
In die level sorting, only certain die are scrapped, and good die are processed based on their functionality at wafer level tests. This is in contrast to wafer level sorting, where poor yielding wafers are scrapped due to a majority of zonal failures. Wafers are divided into a set number of zones and scrapping is based on the yield of these zones. If a large number of zones have poor yield, the entire wafer is scrapped. High yielding wafers are tested and burned in. Scrapping entire wafers based on zones can be costly, since the good die on those wafers are thrown away as well. Die level sorting can save money when good die are processed with confidence that they will retain reliability, even if there are zones on the wafer they came from which had poor yield.

It may be possible to determine final test yield on a larger device based on neighboring die yield at an earlier test point, while the die are still in wafer form. It may also be feasible to downgrade devices based on neighboring die yield and require more burn-in for them than for higher yielding die. By tracking data throughout the fabrication and testing process, possible trends can be found in order to support die level sorting. Local yield can be investigated on a wafer and has been shown to predict passing and failing regions on a wafer [8]. By observing local yield on a wafer, individual die can be considered for scrapping, continued processing, or reduced burn-in. It would be desirable to explore a method that will support a scrapping specification and/or burn-in reduction based on confidence that such die would fail or pass final test.

The purpose of this project was to predict final test fallout of transistor related failures by understanding whether any correlation exists between neighboring die and reference die at laser and multiprobe. It was also necessary to observe all failures, rather than only transistor related failures.
Empirical data was used to determine whether continued processing of potentially good die and scrapping of poor die at the wafer level was in order. Also, die with a high predicted yield were considered for possible decreased burn-in.

Several steps were taken in order to determine if these goals were possible. To drive this experiment, a large, seven metal layer microprocessor device was used. Twenty lots of twenty-four wafers each were used. All failures from final test were isolated, as well as the transistor related failures of interest. These transistor related failures were L2Cache (level two cache) failures, high Vdd SRAM (static random access memory) failures, and high Vdd functional failures. L2Cache failures are those devices that have failing bits of memory caused by transistor fabrication issues. High Vdd SRAM failures are memory failures that fall out due to high Vdd. A die that falls out functionally due to high Vdd is binned out as a high Vdd functional failure. These failures represent a design or process marginality of interest [7].

All types of failures from final test were included as another mode of data collection. Including all types of failures from final test, compared with all good units, helps to prevent bias from being introduced. This bias is due to the nature of the test programs. In a test program, the shortest and least complicated tests are at the beginning, and the longer ones are located at the end of the program. This is done because once a part fails, the test program is exited for that part. The tester then moves on to the next part and starts over. This expedites the testing process, so that if a part fails early on, it has only gone through short tests rather than longer ones. So, if a neighbor of a transistor related failure falls out of the test program at the beginning, at multiprobe for instance, there is no way to tell whether it would have passed or failed the transistor related failure bins of interest.
It would be counted as a failure, but it may have actually passed the appropriate test if it had been tested with it. Therefore, the data would be biased at that point. For this project, both transistor related failures and all failures were observed in order to compare them.

After the failing reference die were located from final test, data was collected from their neighboring die at laser and multiprobe. Data from neighboring die at laser and multiprobe were also collected for all good units. Good units are those that pass all the way through final test. Good unit data was collected in order to compare to the failing unit data.

Neighboring die are those that in theory affect a reference die's yield based on proximity to the reference die and/or their relationship with the reference die. Because of their proximity or relationship to the reference die, neighbors should have the same type of process step performed on them, and should be very similar in parametric and electrical characteristics. The neighboring die of interest are the nearest eight surrounding neighbors, the nearest twenty-four surrounding neighbors, the radial neighbors, and the same X, Y coordinate on different wafers.

The nearest eight neighbors are those die that immediately surround the reference die, forming a square on the wafer. The same immediate neighbors are used for the twenty-four nearest neighbors; however, the next ring of die further out, forming another box, is used as well. Radial neighbors are those that lie at the same circular distance from the center of the wafer. A reference die's neighbors in this instance are about the same distance from the center as the reference die, where the reference die is included in its particular circle. For the X, Y neighbors, all die that are at the same X, Y coordinate on all the remaining twenty-three wafers in the same lot are considered neighbors. The same yield number is used for each X, Y coordinate for each reference die in this group.
These neighbor definitions can be seen in Figure 3.1. Unit level predicted yield (ULPY), as derived by Intel, was also used to determine if a better correlation can be found. This method combines XY yield and local yield in order to form a single die level predictor that is possibly more powerful than either one used individually [9]. Different combinations of yield methods were explored in order to find the most efficient predictor.

Figure 3.1. Neighbor Definitions. [Panels: Radial (die about the same distance from the center as the reference die are its neighbors); Nearest Neighbors (immediately surrounding die, either 8 or 24 surrounding die); XY Yield (24 wafers in the same lot, where the neighbors are the die on the 23 other wafers with the same X, Y coordinate).]

In order to determine whether the good units had a better chance of passing at final test compared with the failing units, a yield number needed to be derived based on neighboring die for each reference die. Once yield numbers were derived for each reference die, bad and good, based on all the neighboring methods, these numbers could be compared with each other. The goal was to find a promising trend that can support a new pass/scrap specification, and/or burn-in reduction for those die that have high predicted yields. A promising trend would be one where good die have higher yield numbers than failing die from final test. The methodology is outlined in Figure 3.2.

Figure 3.2. Methodology for Neighborhood Analysis. [Flow: 20 lots (about 60,000 units) are split into good units from final test and failures from final test; each unit is assigned a predicted yield number from 0 to 100 based on neighboring die yield at laser and multiprobe (MP); failing units are then compared with good units.]

Deriving Yield Numbers

In order to make the tracking of data possible, there is a unique die identification number associated with each device. This die ID is located on-chip within each die.
This number contains the die lot number, the wafer number, and its X, Y coordinates. The die ID number is located within on-chip fuses that are blown at multiprobe with the unique number. This makes wafer mapping possible, and is a good way to track failures and good units with their bins throughout the fabrication and testing processes. This also enables tracking of all die information, giving the ability to predict yield based on neighbors [7].

Using each device's unique die ID, data was collected from laser and multiprobe on all neighboring die for failing devices and good units from final test. Yield numbers were derived for each die in the project. Different methods were used to derive yield numbers for each neighboring definition, including nearest eight and twenty-four neighbors, radial neighbors, XY yield, and ULPY.

To find yield numbers for the nearest eight neighbors, the distance from the reference die needed to be included in the derivation. The closer the neighbor is, the more effect it should have on the reference die. The further away it is, the less likely it will have as strong an effect as a die that is closer. Distance from the reference die was calculated using measured distance for the X and Y directions. Figure 3.3 indicates X and Y direction measurements. The Pythagorean Theorem was used to find the distance from the reference to diagonal die using measurements from the X and Y distances. Figure 3.4 shows the orientation of neighboring die to the reference die, including references to neighbors in the X, Y, and Z directions.

Figure 3.3. Height and Width Measurements for a Die. [Die outline with height Y and width X.]

Figure 3.4. Orientation of Neighboring Die to Reference Die.

Equation 3.1 shows how the Pythagorean Theorem was used to calculate distance Z, where X and Y were the respective measured distances. The distance from the reference die to neighboring die in the X and Y directions is simply the width and height of the reference die itself, respectively.
This is due to the fact that the measurement from the center of the reference die to the center of the neighboring die is the same as the width and height of the reference die.

$$\text{Distance } Z = \sqrt{X^2 + Y^2} \qquad (3.1)$$

Now that distances had been identified in all three directions from the reference die to its nearest eight neighbors, an equation needed to be found and used in order to calculate predicted yield numbers. This equation was used to find the weighted yield of each neighbor. Using the yield of all eight neighbors and including the distance from the reference die, an appropriate weighted yield number could be calculated for each neighbor. Equation 3.2 shows how weighted yield numbers were created.

$$\text{weight}_n = \frac{\text{yield}_n / \text{distance}_n}{\sum_{i=1}^{8} 1 / \text{distance}_i} \times 100 \qquad (3.2)$$

In Equation 3.2, n represents the individual neighbors, numbered 1 through 8. The yield in the numerator is the yield of the neighboring die from laser or multiprobe. If the neighbor was good at laser or multiprobe, it would receive a value of 1 for yield. If it was bad, it would receive a value of 0. This yield number is divided by the distance that neighbor is from the reference die, thus accounting for its impact on the reference die due to its proximity. The distance is calculated using the Pythagorean Theorem, which gives the distance of each neighbor from the reference die, no matter which neighbor it is. This number is then divided by the sum over all eight neighbors, taken as good (yields = 1), of their yields divided by their respective distances from the reference die. This denominator represents the total yield possible. In other words, the denominator represents perfect yield for all neighbors, so yields of 1 are divided by all eight distances and summed. When each neighbor's term is divided by it, one at a time, the result is the portion of the whole that a particular neighbor contributes toward the total predicted yield for the reference die.
Finally, this weighted quantity is multiplied by 100, in order to have a predicted yield percentage between 0 and 100. The weighted yields from each neighbor are added together in order to get the predicted yield for the reference die. This means each reference die has a predicted yield value that is the summation of two die in the X direction, two die in the Y direction, and four die in the Z direction. When all eight neighbors are good, the total weighted yield will be 100. This is seen in Equation 3.3. In theory, a die with a predicted yield of 100 has a 100% probability of passing.

$$\text{Predicted Yield Number} = 2(X) + 2(Y) + 4(Z) = 100 \qquad (3.3)$$

In order to better understand the predicted yield derivation for nearest neighbors, an example using normalized quantities is used. For this purpose, the distance from the reference die to the X neighbors is 1, and the distance to the Y neighbors is 1. Using the Pythagorean Theorem, the distance to neighboring die in the Z direction is $\sqrt{2}$, as shown in Equation 3.4. Orientation and values for neighboring die for this normalized example are shown in Figure 3.5.

$$\text{Distance to neighboring die in Z direction} = \sqrt{1^2 + 1^2} = \sqrt{2} \qquad (3.4)$$

Figure 3.5. Orientation and Values for Normalized Example. [Distance in Z direction: $\sqrt{2}$.]

The normalized values for distances for the eight neighbors are used in Equation 3.2 in order to find the weighted yield contribution each neighbor gives to the reference die. The equations used to find the weighted yield values for die in the X, Y, and Z directions are shown in Equations 3.5, 3.6, and 3.7, respectively.

$$X\ \text{Direction} = \frac{1/1}{\frac{4}{\sqrt{2}} + \frac{2}{1} + \frac{2}{1}} \times 100 = \frac{100}{6.828} = 14.645 \qquad (3.5)$$

$$Y\ \text{Direction} = \frac{1/1}{\frac{4}{\sqrt{2}} + \frac{2}{1} + \frac{2}{1}} \times 100 = \frac{100}{6.828} = 14.645 \qquad (3.6)$$

$$Z\ \text{Direction} = \frac{1/\sqrt{2}}{\frac{4}{\sqrt{2}} + \frac{2}{1} + \frac{2}{1}} \times 100 \approx \frac{70.711}{6.828} \approx 10.355 \qquad (3.7)$$

The weighted numbers are added to the total predicted yield value for a reference die when a neighbor is good. If a neighbor is bad, it will contribute a value of 0.
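The normalized example can be checked numerically. The following is a minimal sketch, not the tooling used in this project; the function names `neighbor_weights` and `predicted_yield` are hypothetical, and the die width and height default to the normalized value of 1.

```python
import math

def neighbor_weights(width=1.0, height=1.0):
    """Weight of one neighbor in each direction, following Equation 3.2."""
    diag = math.hypot(width, height)  # distance Z from the Pythagorean Theorem
    distances = {"X": width, "Y": height, "Z": diag}
    counts = {"X": 2, "Y": 2, "Z": 4}  # two X, two Y, four diagonal neighbors
    # Denominator: all eight neighbors good (yield = 1), each divided by its distance.
    total = sum(counts[k] / distances[k] for k in distances)
    return {k: (1.0 / distances[k]) / total * 100 for k in distances}

def predicted_yield(good_x, good_y, good_z, weights):
    """Sum the weights of the good neighbors; bad neighbors contribute 0."""
    return good_x * weights["X"] + good_y * weights["Y"] + good_z * weights["Z"]

weights = neighbor_weights()
print({k: round(v, 3) for k, v in weights.items()})
# {'X': 14.645, 'Y': 14.645, 'Z': 10.355}
print(round(predicted_yield(2, 2, 4, weights), 3))
# 100.0
```

With one X neighbor and one diagonal neighbor bad, for instance, `predicted_yield(1, 2, 3, weights)` drops by 14.645 + 10.355 = 25 points, to 75.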
The weighted yield values for each neighbor are added up so that there are two neighbor values in the X direction, two neighbor values in the Y direction, and four neighbor values in the Z direction, as seen in Equation 3.8. If all eight neighbors are good in this case, the predicted yield will be 100, as seen in Equation 3.9.

$$X + X + Y + Y + Z + Z + Z + Z = \text{Weighted Yield Value} \qquad (3.8)$$

$$14.645 + 14.645 + 14.645 + 14.645 + 10.355 + 10.355 + 10.355 + 10.355 = 100 \qquad (3.9)$$

The weighted numbers derived for each neighbor are used throughout the rest of the 8 nearest neighbor analysis, rather than deriving them individually every time. For example, if neighbor X is good, it will contribute a value of 14.645 to the summation for the predicted yield for the reference die. If neighbor X is bad, it will be summed as a 0 to the predicted yield. This method is used for analysis involving laser and multiprobe neighbors.

Determining the yield numbers for the twenty-four surrounding neighbors was done in a similar way, except that the distances for the additional sixteen neighbors in the outer ring were calculated as well. The Pythagorean Theorem was used for this purpose as well. Figure 3.6 depicts a grid in which a reference die has twenty-four neighbors.

Figure 3.6. Reference Die with Twenty-Four Nearest Neighbors. [Five-by-five grid centered on the reference die, with neighbors labeled by distance: X, Y, Z, A, B, C, D, and E.]

The distances derived for the nearest eight neighbors could be used once again. For neighbors of distance A, B, C, D, and E, Equations 3.10 through 3.14 could be used. The Pythagorean Theorem was used for distances C, D, and E.

$$A = 2X \qquad (3.10)$$

$$B = 2Y \qquad (3.11)$$

$$C = \sqrt{A^2 + B^2} \qquad (3.12)$$

$$D = \sqrt{Y^2 + A^2} \qquad (3.13)$$

$$E = \sqrt{B^2 + X^2} \qquad (3.14)$$

Equation 3.2, used for the calculation of the nearest eight neighbors, is used again in order to find the weights of the nearest twenty-four neighbors. Once again, if all neighbors are good, the predicted yield would add up to 100, as shown in Equation 3.15.
$$2X + 2Y + 4Z + 2A + 2B + 4C + 4D + 4E = 100 \qquad (3.15)$$

For the next neighbor definition, XY yield, a simple arithmetic mean is used. Each XY coordinate has twenty-three neighbors on other wafers in the same lot at that same coordinate. The yield number calculated is used for each XY position within each lot. If a neighboring die is good, it receives a yield number of 100, whereas if it is bad it receives a 0 when calculating the mean. The sum of these yields is divided by twenty-three, the total number of neighboring die at that XY coordinate in a lot, as seen in Equation 3.16.

$$\text{XY Yield} = \frac{\sum_{i=1}^{23} \text{Yield}_i}{23} \qquad (3.16)$$

In order to find predicted yield numbers for radial analysis, first the distance from the center of a wafer to each die on the wafer needed to be found. After identifying the center of the wafer by direct measurement, the Pythagorean Theorem was used to find the distances to each die on the wafer. Then eight groups were identified that had similar distances from the center of the wafer. These eight groups form circles around the center. Any die that lies on a circle is considered a neighbor to any other die on that circle. Figure 3.7 shows the distances calculated from the center and the group in which each radial circle lies.

Figure 3.7. Radial Distances (mm) and Groups.

The yield number for radial analysis was calculated depending on which group a reference die belongs to. The yields of all neighbors were determined, summed, and divided by the number of members in each group. This is a simple averaging method where X is the number of neighbors in a group, as shown in Equation 3.17.

$$\text{Radial Yield} = \frac{\sum_{i=1}^{X} \text{Yield}_i}{X} \qquad (3.17)$$

For Intel's unit level predicted yield, or ULPY, a geometric mean was used. Two quantities were used, XY yield and local yield, or nearest neighbor yield. These quantities were combined in order to create a more powerful predictor than either one can offer individually.
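The two averaging methods and the geometric-mean combination can be sketched in a few lines. This is an illustrative sketch only: the function names `mean_yield` and `ulpy` are hypothetical, the pass/fail data are made up, and ULPY is taken here as the square root of the product of local yield and XY yield, both on a 0 to 100 scale.

```python
import math

def mean_yield(neighbor_good):
    """Arithmetic mean yield (0-100) over pass/fail flags, as in Equations 3.16 and 3.17."""
    return 100.0 * sum(neighbor_good) / len(neighbor_good)

def ulpy(local_yield, xy_yield):
    """Geometric mean of local yield and XY yield (assumed form of the ULPY predictor)."""
    return math.sqrt(local_yield * xy_yield)

# Hypothetical XY neighbors: 23 same-coordinate die on the other wafers of a lot, 20 good.
xy = mean_yield([True] * 20 + [False] * 3)
# Hypothetical radial group: 40 die at a similar distance from the wafer center, 30 good.
radial = mean_yield([True] * 30 + [False] * 10)
print(round(xy, 2), round(radial, 2))
# 86.96 75.0

print(ulpy(25.0, 100.0))
# 50.0
```

In the last call, a die with a local yield of 25 and an XY yield of 100 receives a combined value of 50, whereas the arithmetic mean of the same two numbers would be 62.5.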
Rather than using an arithmetic mean, which would not sufficiently penalize die with very low predicted yields, the geometric mean was used. A number between 0 and 100 is calculated which gives a more accurate prediction of whether or not a die will pass. This is shown in Equation 3.18.

$$\text{ULPY} = \sqrt{\text{LocalYield} \times \text{XYYield}} \qquad (3.18)$$

Intel was able to show through empirical studies that weaker die were predicted to fail more successfully using ULPY than using the arithmetic mean [9]. For this project, both eight and twenty-four nearest neighbors were used as local yield along with XY yield.

An important aspect of neighborhood analysis is the treatment of edge die. Edge die are die that are located on the edge of a wafer and do not have the full eight or twenty-four surrounding neighbors. Figure 3.8 shows edge die missing some of their nearest neighbors. One way to approach this problem is to consider that edge die are more likely to fall out and that the absence of die surrounding them should not increase their predicted yield. In this regard, the missing die should be treated as failing, effectively having a yield of 0 when making predicted yield calculations. Another way to approach the issue is to understand that missing die cannot increase the predicted yield, but are also not there to impact the die in one way or another. Therefore, if five of the eight surrounding die are missing from an edge reference die, for example, only the three die that are present should be included in the yield number derivation. This choice affected the nearest eight and nearest twenty-four neighbor definitions. For this project, both approaches to edge die were considered and compared.

Figure 3.8. Edge Die Consideration.

CHAPTER IV
NEIGHBORHOOD ANALYSIS RESULTS

Introduction

This chapter will concentrate on the results of the neighborhood analysis technique as described in the previous chapter.
The neighbor definitions used were developed by Intel, including local yield (eight and twenty-four nearest neighbors), radial yield, XY yield, and ULPY (XY yield along with eight and twenty-four nearest neighbors). The charts with resulting data to be shown are indicated in Figure 4.1.

Figure 4.1. Charts with Resulting Data Shown in Chapter IV. [Matrix for all failures from final test. Columns: neighbors from laser (missing = 0; missing not used) and neighbors from multiprobe (missing = 0; missing not used). Rows: 8 NNs, 24 NNs, Radial, XY, ULPY - 8, and ULPY - 24. The "missing = 0" columns are marked N/A for Radial, XY, ULPY - 8, and ULPY - 24.]

In Figure 4.1, all failures refers to all final test failures from every log point in the test flow. This includes pre-burn-in, post-burn-in, and quality assurance. "Missing = 0" and "missing not used" refer to missing edge die. Charts that show data from nearest neighbor calculations show both approaches to treating edge die. The first method is calculating predicted yields when missing edge die are counted as 0 (missing = 0). This calculation did not apply to radial analysis, XY yield, ULPY 8, and ULPY 24 because these calculations involved die that remain on the wafer and not missing edge die. The second way to treat missing edge die is to omit them from calculations. When a chart says "missing not used," this means that missing edge die were not counted toward the predicted yield number. This approach was used for every method of calculating predicted yield, since every method involved die on the wafer. Nearest neighbor calculations were the only methods that made the distinction between ways to treat missing die.

The predicted yield numbers for each reference die from final test for each method were calculated. The reference die used are good units and failing units from final test, from all twenty lots, which was about 14,000 units. There were about 3,000 bad die and about 11,000 good die.
All die from the 20 lots, about 63,000 units, were used as wafer level neighbors in calculating predicted yields from laser and multiprobe. The predicted yields for each reference die were grouped into tens. This means predicted yields from 0 to 10, 10 to 20, 20 to 30, and so on were grouped together, giving ten different groups of ten. For each prediction method, for laser and multiprobe, the percentages of good die and failing die in each group of ten were calculated. Good die and failing die were compared at each group of ten in this way. The total number of good die was added to the total number of failing die at each group. Then, the number of good die was divided by the total to determine the percentage of good die in that particular predicted yield group. The number of failing die was then divided by the total to determine the percentage of failing die. This determined the percentage of good die and failing die at each predicted yield group, showing whether good or failing die dominated the group. In other words, the two percentages (good and failing die) at each predicted yield group of ten added up to 100%.

The trend worth noting in the data is whether there are higher percentages of good or failing die at each predicted yield group. A good trend that would support the goals of this project would be a substantial number of failing die at low predicted yields, with no good die at those points. This would support throwing away die with low predicted yields before they are processed further and packaged. Another good trend would be a high number of good die at higher predicted yields, with no failing die present. This would support reduced burn-in of die with high predicted yields, based on the confidence that no failing die would be present.

Neighborhood Analysis for Local Yield

Figure 4.2 shows the results for eight nearest neighbors at laser, where missing edge die were counted as zero.
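The grouping procedure used to build each chart can be sketched as follows. This is a minimal illustration under stated assumptions: the record format (predicted yield paired with a final test pass flag), the function names, and the sample records are hypothetical, and the handling of boundary values (a yield of exactly 10 placed in the 10-20 group, and 100 placed in the 90-100 group) is an assumption, since the text does not specify it.

```python
def group_index(predicted_yield):
    """Map a predicted yield in [0, 100] to a group 0..9 (0-10, ..., 90-100)."""
    return min(int(predicted_yield // 10), 9)


def percent_by_group(records):
    """Percent of good and failing die in each predicted yield group.

    records: iterable of (predicted_yield, passed_final_test) pairs.
    Returns {label: (percent_good, percent_fail)} for non-empty groups.
    """
    counts = [[0, 0] for _ in range(10)]  # per group: [good, fail]
    for predicted_yield, passed in records:
        counts[group_index(predicted_yield)][0 if passed else 1] += 1

    result = {}
    for g, (good, fail) in enumerate(counts):
        total = good + fail
        if total == 0:
            continue  # no die landed in this group
        label = f"{10 * g}-{10 * g + 10}"
        result[label] = (100.0 * good / total, 100.0 * fail / total)
    return result


# Hypothetical reference die: (predicted yield, passed final test)
records = [
    (0.0, False), (5.0, False), (9.9, False), (7.5, True),
    (50.0, True),
    (90.0, True), (95.0, True), (100.0, True), (92.0, False),
]
groups = percent_by_group(records)
print(groups["0-10"])    # (25.0, 75.0)
print(groups["90-100"])  # (75.0, 25.0)
```

Because each group is normalized by its own total, the two percentages in a group always sum to 100%, and the raw number of die in a group is not visible in the result.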
In Figure 4.2, there were more failing die at predicted yields of 0-10; however, there were also good die present at the same predicted yields. Even though there were more failing die, this does not mean all die with predicted yields between 0 and 10 could be thrown away, or scrapped, because too many good die would also be thrown away. The same is true for higher predicted yields. Even though there were more good die at high predicted yields, there remained failing die with the same yield prediction. Burn-in cannot be reduced for these predicted yields, because failing die would receive the same reduced burn-in based on this data. The failures might not be identified after burn-in, and failing units could possibly be shipped.

Figure 4.2. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Laser.

Figure 4.3 shows results for eight nearest neighbors at multiprobe, where missing edge die were counted as zero. So, if the reference die from final test was on the edge of the wafer, the missing die at the edge were counted as zero. The equation divided the neighbor yields by the total possible yield, where the missing edge die were counted as zero, and these were summed. This would downgrade the edge die accordingly, as edge die are usually less reliable in semiconductor devices.

Figure 4.3. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe.

The following charts depict the data taken from the same nearest eight neighbors, except that the missing edge die were omitted rather than counted as zero.
Omitting missing die means that if there were only five intact neighbors on a wafer when using the eight nearest neighbor method, only those five neighbors were used in the calculation. This was to show whether there was any difference between a natural downgrade of edge die, where the missing die are neglected, and a more direct downgrade, where the missing die are counted as zero. The data taken using eight nearest neighbors from laser is shown in Figure 4.4.

Figure 4.4. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Laser.

The data observed using neighbors from multiprobe at wafer level can be seen in Figure 4.5. Again, the eight nearest neighbors were considered, and the missing edge die were omitted from the calculation. It should be noted that there were more die with predicted values of 0 in multiprobe calculations. This is not apparent from these charts, as only the percentage of die at each predicted yield group was considered, and not actual raw numbers. More predicted values of 0 at multiprobe are due to the fact that failing die from laser were counted as 0 at multiprobe as well. This is because only the good and repairable die from laser are tested at multiprobe, and all other failing die are counted as 0. There would therefore be more failing die on the wafer, and a higher chance of having no passing neighbors.

Figure 4.5. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe.

The twenty-four nearest neighbors were analyzed in the same way as the eight nearest neighbors, with both treatments of missing edge die calculated.
First, the data for the twenty-four nearest neighbors from laser is shown in Figure 4.6, where missing edge die were counted as zero. Failing die had a higher percentage of units at predicted yields between 0 and 20. Good die dominated the percentages at all other predicted yields.

Figure 4.6. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Laser.

There was no data at predicted yields between 90 and 100 for nearest twenty-four neighbors at multiprobe, as shown in Figure 4.7. This is similar to the previous discussion explaining why there were more predicted yields of 0 for all die when multiprobe neighbors were considered. Predicted yields were in general lower for data using multiprobe neighbors, due to fallout from laser. That is why there were not many higher predicted yield values. Also, good die had a higher percentage of units than failing die at all predicted yield groups.

Figure 4.7. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe.

Figure 4.8 shows the nearest twenty-four neighbors at laser where missing edge die were omitted. The actual number of units with yield numbers in the 0-10 and 10-20 groups was small, so their difference is not as substantial as it seems. In other words, the fact that there were more good units at predicted yields of 0-10 and fewer good units at 10-20 does not signify a problem when considering actual numbers of units. More units were considered as predicted yield values increased. Between predicted yields of 20 and 100, good die had higher percentages of units than failing die.

Figure 4.8. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Laser.

Figure 4.9 shows the nearest twenty-four neighbors from multiprobe when missing edge die were omitted. Once again, there were no predicted yields from 90 to 100 for any die. Also, good die had higher percentages than failing die at every predicted yield group of ten.

Figure 4.9. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe.

Neighborhood Analysis Using Averaging Methods

The neighborhood analysis techniques that use averaging methods included XY yield and radial analysis. Since these methods utilized all die on the wafer, there was no concern regarding missing edge die. All die involved remained on the wafer in question, so there was no need to treat missing edge die.

The first data shown is XY yield at laser in Figure 4.10. This method seemed to chart the best so far, with failing die dominating the predicted yield groups up to a predicted yield of 30. Good die were represented with more die than failing die at predicted yields from 30 to 100. This data was also good because, for good die, the percentage of die at each group increased with increasing predicted yield numbers, while failing die percentages decreased with increasing predicted yield numbers. This is what would be expected for XY yield, since each die at a given XY coordinate should be treated the same in the several different processes used to fabricate the chips.

Figure 4.10. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, XY Yield at Laser.

XY yield using neighbors from multiprobe is shown in Figure 4.11.
There were no die at predicted yields from 80 to 100, since neighbors at multiprobe have seen fallout previously at laser tests. There was no dominance of failing die at low predicted yields. Good die had higher percentages of die at each predicted yield group. However, the good die percentages did increase with increasing predicted yields, and failing die percentages did decrease with increasing predicted yields.

Figure 4.11. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, XY Yield at Multiprobe.

The data for radial analysis follows, starting with neighbors from laser in Figure 4.12. There were the same number of good die and failing die at predicted yields from 0-10. Good die percentages were higher than failing die percentages from 10 to 100. Once again, this method seemed to follow the same trend, where good die percentages increased with increasing predicted yields and failing die percentages decreased with increasing predicted yields. This would make sense, since the different process steps used to fabricate ICs may have affected die the same distance from the center of the wafer.

Figure 4.12. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, Radial at Laser.

Radial neighbors from multiprobe are shown in Figure 4.13. There was no dominance of failing die at any of the predicted yield groups.

Figure 4.13. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, Radial at Multiprobe.

Neighborhood Analysis Using the Geometric Mean

The geometric mean was used to calculate Intel's proposed method of unit level predicted yield, ULPY. ULPY is calculated using both eight and twenty-four neighboring methods along with XY yield.
This method neglects missing edge die: because XY yield uses only die present on the wafer, only die on the wafer were used in the local yield calculations that feed into ULPY.

The data calculated for ULPY using XY yield and 8 nearest neighbors from laser is shown in Figure 4.14. Failing die percentages dominated good die at predicted yields of 0-10 and 20-30. Good die percentages were higher at every other predicted yield group.

Figure 4.14. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Laser.

Figure 4.15 shows data using ULPY at multiprobe, using XY yield and 8 nearest neighbors. Since this data used neighbors from multiprobe, there were no die with predicted yields between 80 and 100.

Figure 4.15. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Multiprobe.

The data for ULPY using twenty-four nearest neighbors and XY yield from laser is shown in Figure 4.16. This data fell differently than the others, since there was a huge number of die with predicted yields between 0 and 10. The number of die slowly started increasing at predicted yields of 10-60, topped out somewhere around 70, and started to decrease again toward 100.

Figure 4.16. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Laser.

The data for ULPY using twenty-four surrounding neighbors and XY yield from multiprobe is shown in Figure 4.17. This data looks like other charts using multiprobe neighbors, where no die existed at predicted yields between 80 and 100.
This is the only data where there were 100% good die at the highest predicted yields present, which were between 70 and 80. This seemed to support reduced burn-in for those good die with high predicted yields, since no failing die were present. However, the actual number of die at this predicted yield group was so small that reduced burn-in could not be supported.

Figure 4.17. Percent of Failures versus Good out of Total Die at Each Predicted Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Multiprobe.

Large Die Size Consideration

A possible explanation for finding a good trend, rather than a concrete data-based case for burn-in reduction, might be the large die size of the device used in this investigation. A large die has fewer die per wafer than a small die on the same size wafer. This results in a yield decrease for the large die. In order to compare this experiment with others done on smaller die, the wafer would have to be X times bigger, where X is the scale from the small die size to the larger die size. This would make the die per wafer equal, and would also increase yield for the large die wafers.

In order to better understand how yield is affected by die size, defect density should first be defined. Defect density is the total number of defects divided by the total area of the wafer, for each wafer. For a die with area A and yield Y, a defect density value D is calculated using Equation 4.1 [11].

$$D = \frac{1 - Y}{A} \tag{4.1}$$

This equation is useful when the defect density value and the area are known, in order to understand yield differences with varying die areas. The following example demonstrates this effect. Suppose there are two 8-inch wafers, containing 100 and 1,000 die/wafer, respectively. The total area of each wafer is then 50.3 in^2.
The individual die areas are

$$50.3 / 100 = 0.503\ \mathrm{in}^2 \tag{4.2}$$

and

$$50.3 / 1{,}000 = 0.0503\ \mathrm{in}^2. \tag{4.3}$$

Suppose the defect density is 1 defect/in^2. When calculating yield for each wafer, the form of Equation 4.4 is used, as modified from Equation 4.1.

$$Y = 1 - DA \tag{4.4}$$

After plugging in the values for D and A in Equation 4.4, the yield for the wafer with 100 larger die is 49.7%, while the yield for the wafer with 1,000 smaller die is 95.0%. This shows clearly that larger die naturally have lower yield than smaller die when using the same process on the same size wafer. With lower yield, it will be more difficult to find a definite correlation between laser and multiprobe neighbor yields and final test yield. What has been found is a trend, where good die have higher predicted yields than bad die, and failing die have a higher percentage of low predicted yields than good die. Chapter V expands on this conclusion.

CHAPTER V
NEIGHBORHOOD ANALYSIS CONCLUSIONS

The ultimate goals of this thesis were to find whether burn-in could be reduced for die that had high predicted yields and whether further unnecessary processing and packaging could be eliminated for die that had low predicted yields. The charts presented in Chapter IV, featuring several different methods of calculating predicted yields based on wafer level neighbors, showed a general trend. This trend indicated that good die were more likely than bad die to have higher predicted yields, and that bad die accounted for a higher percentage of the die at lower predicted yields. This supported the idea that good die should have more good neighboring die and bad die should have more failing neighbors. Unfortunately, there were not enough good die with substantially high predicted yields where failing die did not exist. For example, there was no method that resulted in 15% of total die having a predicted yield of 100 with 0% bad die at a predicted yield of 100.
This situation would support a burn-in specification change, where die with predicted yield numbers of 100 would undergo reduced burn-in based on the confidence that they would pass final test, more specifically post burn-in. In addition, there was no case where failing die had very low predicted yields and good die were not present with the same low yield prediction. This means there was no chance of scrapping low yielding die, and thus of saving money by not processing bad die through final test and scrapping them once they fail. In most cases, there were good and bad die associated with every predicted yield number.

One possible explanation of these results is that a large die was used. Larger die have a better chance of producing low yielding wafers than smaller die. Two wafers of the same size with the same process, and thus the same defect density, but with different numbers of die per wafer will have different yields, and the wafer with larger die will have lower yield. This means finding dramatic results from neighborhood analysis will not be possible. A smaller die would have more promising results, with neighbors in closer proximity, catching more defects on the wafer. A large die, for example, could have three defects within its entirety and be considered a single failure, whereas four small die covering the same area could each contain at most one of those defects, giving three failures out of four die. This would catch more defects, and cause the results of a neighborhood analysis to be more efficient and accurate.

Another possible factor to consider is sample size. While other industry applications have looked at millions of die, this study considered about 60,000. In addition to large die size, a smaller sample may have hurt the results to the point where no definite burn-in reduction method could be found.
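The die-size argument can be checked numerically with the linear yield relation of Equation 4.4. The sketch below is a minimal illustration that reuses the values from the example in Chapter IV (8-inch wafers, 100 and 1,000 die per wafer, 1 defect/in^2); the function names are hypothetical, and the wafer area is computed from the diameter rather than rounded to 50.3 in^2.

```python
import math


def wafer_area(diameter_in):
    """Area of a circular wafer in square inches."""
    return math.pi * (diameter_in / 2.0) ** 2


def die_yield(defect_density, die_area):
    """Yield from Equation 4.4, Y = 1 - D*A (meaningful only while D*A <= 1)."""
    return 1.0 - defect_density * die_area


area = wafer_area(8.0)   # roughly 50.3 in^2
defect_density = 1.0     # defects per in^2, as in the Chapter IV example

for die_per_wafer in (100, 1000):
    die_area = area / die_per_wafer
    percent_yield = 100.0 * die_yield(defect_density, die_area)
    print(die_per_wafer, round(percent_yield, 1))
# 100 die per wafer  -> 49.7
# 1000 die per wafer -> 95.0
```

With the defect density held fixed, a tenfold increase in die area lowers the yield from about 95% to about 50% in this approximation, consistent with the lower yields expected for the large die studied here.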
In conclusion, while the method within this thesis has been proven effective on other devices, it could not be used to reduce burn-in for die with high predicted yields, or to scrap die with low predicted yields. Good and bad die both followed the same trend within predicted yield numbers, where good die did have higher predicted yields than bad die. However, too many bad die followed the same trend, with too many at high predicted yields as well. This does not provide the confidence needed to suggest a specification change to burn-in time. This method may be effective when a larger sample is used on large die, or when a larger wafer size is attained for this product.

Other methods should be considered along with neighborhood analysis for large die in the future. Parametric data could be considered along with this method, as this data has been used to find defective process conditions and design marginalities for large die. Neighborhood analysis, along with parametric data and a larger wafer, should find its place in burn-in reduction methods in the future for large die, based on the promising trend found through this thesis.

REFERENCES

1. Burns, Mark and Roberts, Gordon W. An Introduction to Mixed-Signal IC Test and Measurement. New York, New York: Oxford University Press, 2001.
2. Wolf, S. Microchip Manufacturing. South Beach, California: Lattice Press, 2004.
3. Barnett, Thomas S., Singh, Adit D., Nelson, Victor P. "Extending Integrated-Circuit Yield-Models to Estimate Early-Life Reliability." IEEE Transactions on Reliability Vol. 52, No. 3, September 2003.
4. Barnett, Thomas S., Singh, Adit D., Nelson, Victor P. "Burn-In Failures and Local Region Yield: An Integrated Yield-Reliability Model." Proceedings of the 19th IEEE VLSI Test Symposium p. 0326, March 29 - April 03, 2001.
5. Pradhan, Dhiraj K. Integrated Circuit Manufacturability: The Art of Process and Design Integration. Ed. Jose Pineda de Gyvez. New York, New York: Institute of Electrical and Electronics Engineers, Inc., 1999.
6. Black, Kelley A. "Die Level Sorting of an Integrated Circuit." Master's Thesis, Texas Tech University, Lubbock, TX, 2000.
7. Information obtained through industry sponsor.
8. Sabade, Sagar S., Walker, Duncan M. "Evaluation of Effectiveness of Median of Absolute Deviations Outlier Rejection-based IDDQ Testing for Burn-in Reduction." Proceedings of the 20th IEEE VLSI Test Symposium p. 0081, April 28 - May 02, 2002.
9. Miller, Russell B., Riordan, Walter C. "Unit Level Predicted Yield: a Method of Identifying High Defect Density Die at Wafer Sort." ITC International Test Conference Paper 40.3, p. 1118, 2001.
10. Riordan, Walter C., Miller, Russell, Sherman, John M., Hicks, Jeffrey. "Microprocessor Reliability Performance as a Function of Die Location for a 0.25µ, Five Layer Metal CMOS Logic Process." 37th Annual International Reliability Physics Symposium p. 1, 1999.
11. Hess, Christopher, Weiland, Larg H. "Wafer Level Defect Density Distribution Using Checkerboard Test Structures." IEEE 1998 Int. Conference on Microelectronic Test Structures Vol. 11, March 1998.
12. <http://dictionary.reference.com/>
13. Streetman, Ben G., Banerjee, Sanjay. Solid State Electronic Devices. Upper Saddle River, New Jersey: Prentice Hall, Inc., 2000.

PERMISSION TO COPY

In presenting this thesis in partial fulfillment of the requirements for a master's degree at Texas Tech University or Texas Tech University Health Sciences Center, I agree that the Library and my major department shall make it freely available for research purposes. Permission to copy this thesis for scholarly purposes may be granted by the Director of the Library or my major professor. It is understood that any copying or publication of this thesis for financial gain shall not be allowed without my further written permission and that any user may be liable for copyright infringement.

Agree (Permission is granted.)
Student Signature    Date

Disagree (Permission is not granted.)

Student Signature    Date