PREDICTION OF PARAMETRIC BASED FINAL TEST FALLOUT

PREDICTION OF PARAMETRIC BASED FINAL TEST
FALLOUT USING NEIGHBORHOOD ANALYSIS
by
SUZY M. BROWN, B.S.
A THESIS
IN
ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty
of Texas Tech University in
Partial Fulfillment of
the Requirements for
the Degree of
MASTER OF SCIENCE
IN
ELECTRICAL ENGINEERING
Approved
Chairperson of the Committee
Accepted
Dean of the Graduate School
May, 2004
ACKNOWLEDGMENTS
I would like to take the opportunity to thank the many people involved with the
successful preparation and completion of this thesis, and for their support during my
academic career.
To my academic advisors and professors at TWU and TTU, I am very
appreciative for all your help and guidance throughout my undergraduate and graduate
careers. Specifically I would like to thank Dr. Parten, Dr. Cox, Dr. Edwards, and Dr.
Thompson.
I am gratefully indebted to my industry sponsor, who helped me acquire the skills
and material necessary to complete this thesis. Specifically, I would like to thank C. Hu,
M. Chang, and B. Campbell for supplying the assistance and tools necessary to get my
work done. I would like to thank the group entirely for helping me to have a successful
internship. Also, I would like to thank my industry sponsor for the financial support
necessary to complete my degree.
I want to thank my family and friends for their support, both emotionally and
financially. Thank you for your understanding, patience, and guidance.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
CHAPTER
I. INTRODUCTION
    The Need for Reliable Semiconductor Products
    Outline of Chapters
II. TESTING MICROPROCESSORS
    Device Fabrication
    Defect Mechanisms and Types
    The Need for Test
    Testing
    Components of Testing
    Testing Global and Local Defects
    Test Considerations
III. DEVELOPMENT OF METHODS FOR NEIGHBORHOOD ANALYSIS
    Burn-In Reduction
    Industry Evaluations and Methods for Burn-In Reduction
    Neighborhood Analysis Methods
    Deriving Yield Numbers
IV. NEIGHBORHOOD ANALYSIS RESULTS
    Introduction
    Neighborhood Analysis for Local Yield
    Neighborhood Analysis Using Averaging Methods
    Neighborhood Analysis Using the Geometric Mean
    Large Die Size Consideration
V. NEIGHBORHOOD ANALYSIS CONCLUSIONS
REFERENCES
ABSTRACT
It is desired to decrease the time and money devoted to burn-in and to eliminate
unnecessary processing of defective integrated circuits. Burn-in is a reliability screen
used to isolate poorly performing integrated circuits before they complete testing and are
shipped. This thesis applies a method that attempts to predict, at an earlier stage of
fabrication, which semiconductor devices will pass or fail at final test. Devices with a
passing prediction may qualify for reduced burn-in. This method, neighborhood analysis,
uses the neighboring die on the same semiconductor wafer to predict whether a die will
pass or fail at final test, before it reaches burn-in. The correlation of final test
failures against good units is investigated to identify whether a strong enough trend
supports early scrapping of material or a possible burn-in specification modification.
LIST OF FIGURES
2.1  Test Flow for IC
2.2  Bathtub Curve
2.3  Three-Ring Oscillator
3.1  Neighbor Definitions
3.2  Methodology for Neighborhood Analysis
3.3  Height and Width Measurements for a Die
3.4  Orientation of Neighboring Die to Reference Die
3.5  Orientation and Values for Normalized Example
3.6  Reference Die with Twenty-Four Nearest Neighbors
3.7  Radial Distances (mm) and Groups
3.8  Edge Die Consideration
4.1  Charts with Resulting Data Shown in Chapter IV
4.2  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Laser
4.3  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe
4.4  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Laser
4.5  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe
4.6  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Laser
4.7  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe
4.8  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Laser
4.9  Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe
4.10 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, XY Yield at Laser
4.11 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, XY Yield at Multiprobe
4.12 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, Radial at Laser
4.13 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, Radial at Multiprobe
4.14 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, ULPY (XY Yield and 8 Nearest Neighbors) at Laser
4.15 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, ULPY (XY Yield and 8 Nearest Neighbors) at Multiprobe
4.16 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, ULPY (XY Yield and 24 Nearest Neighbors) at Laser
4.17 Percent of Failures versus Good out of Total Die at Each Predicted Yield
     Group, ULPY (XY Yield and 24 Nearest Neighbors) at Multiprobe
CHAPTER I
INTRODUCTION
The Need for Reliable Semiconductor Products
The semiconductor industry is a fast moving business that is impacting the world
in diverse and ever-changing ways. From highly innovative communication tools to
faster and extremely efficient computers, the growing industry faces new challenges
every day. The information age is possible due to the invention by Jack Kilby in 1959 of
the integrated circuit [7]. With this technical conception, circuits today contain millions
of transistors to power the high-speed computers and large-capacity memories of modern
technology.
Highly competitive companies push to get new products and solutions to customers
first and work to build recognition as producers of efficient and reliable products. In
order to lead the market and maximize profit, companies must build a positive reputation
with customers. A company that sells many failing devices, or devices that fail too early
in their lifetime, will not remain in business for very long. The process of building
integrated circuits must be robust and the time to market must be short, all while
maintaining quality. The need to maintain quality and reliability creates the need for
test. Before any product can be sold, it must be tested with a precise and rigorous method
to ensure it will meet the customer's needs.
Each device begins testing at an early stage, when only a few layers of the product
have been built. This testing continues while the devices, or die, are in wafer form,
meaning each die remains uncut on a wafer. A wafer is a thin slice of semiconductor
substrate on which integrated circuits are fabricated. Wafer level tests consist
of parametric laser probe and multiprobe. The testing process continues after the device
is built and packaged, which is called final test. Testing provides valuable information
about each device and shows whether the device is good or whether inconsistencies or
problems within the process are causing the product to fail. Testing can also find
process marginalities that might make devices unreliable over time or might indicate
where improvement is needed in fabrication. Testing is therefore imperative,
and it requires constant innovation and test program maintenance in order to continue
selling reliable products.
One way to ensure reliability is to burn in the device at final test. Burn-in
subjects a device to higher than normal voltages and temperatures for a
predetermined amount of time in order to accelerate the early failure rate period before
the part is shipped to the customer. After completion of burn-in, failures should be
significantly reduced if not eliminated completely. Failures that still take place after
burn-in represent a reliability issue within the process or design. Since burn-in covers
the early failure rate period of a particular device, the devices would not be suitable for
shipment without it, unless it is proven that the devices do not require it. Burn-in is an
expensive and time-consuming measure. There is a constant desire within the
semiconductor industry to decrease time at burn-in or eliminate burn-in altogether in
order to maximize profit by cutting costs.
Test reduction methods are of high importance within any semiconductor
company. Cutting costs anywhere in fabrication, packaging, or testing is
desirable in order to maximize profit. As previously mentioned, one way to do this
is by reducing or eliminating burn-in. Several approaches can be used to
accomplish burn-in reduction. One such approach, as outlined in this thesis, is
neighborhood analysis of devices at wafer level.
Current methods of sorting at the wafer level include scrapping faulty wafers
based on overall wafer yield and continuing to process good wafers. Using this method,
a wafer is divided into smaller sections called zones. The yield of the zones on the wafer
determines whether the wafer is passed or scrapped. If a high percentage of the zones
yield poorly, the whole wafer is scrapped due to reliability concerns. This percentage is
predetermined by the engineers working on the device. The idea is that if too many zones
on a wafer are low yielding, the entire wafer is suspect and it is best to
discard, or scrap, the entire wafer. If a high percentage of the zones yield well, the wafer
is processed minus the failing die.
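The zone-based disposition described above can be sketched as follows. This is a minimal illustration, not the sponsor's actual procedure: the zone names, the zone yield limit, and the allowed fraction of poorly yielding zones are hypothetical values chosen only for the example, since the thesis states that the real percentage is set by the engineers working on the device.

```python
from typing import Dict, List


def zone_yield(passes: List[bool]) -> float:
    """Fraction of passing die in one zone (0.0 for an empty zone)."""
    return sum(passes) / len(passes) if passes else 0.0


def disposition_wafer(zones: Dict[str, List[bool]],
                      zone_yield_limit: float = 0.5,
                      max_bad_zone_fraction: float = 0.5) -> str:
    """Return 'scrap' if too many zones yield poorly, otherwise 'process'.

    Both thresholds are assumed values for illustration only.
    """
    bad_zones = sum(1 for die in zones.values()
                    if zone_yield(die) < zone_yield_limit)
    if bad_zones / len(zones) > max_bad_zone_fraction:
        return "scrap"
    return "process"


# Hypothetical wafer with four zones; True marks a passing die.
wafer = {
    "center": [True, True, True, False],
    "north": [True, False, False, False],
    "south": [True, True, False, True],
    "east": [False, False, False, True],
}
print(disposition_wafer(wafer))
```

Note that the decision is made for the wafer as a whole, so good die in a high yielding zone are lost whenever the wafer is scrapped. That limitation motivates the die level sorting considered next.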
This thesis considers die level sorting, where only individual die are scrapped
based on their performance and the performance of their neighbors, rather than entire
wafers. Neighbors are those die on the wafer that, due to their proximity or relationship
to a reference die, may affect its yield. A reference die is a device on a wafer that will be
assigned a predicted yield. Die level sorting may avoid scrapping low yielding
wafers that have high yielding regions with good die that might be unaffected by the
low yielding portion. By using die level sorting and isolating die with a high predicted
yield based on local yield and other methods, burn-in can possibly be reduced for those
die. Local yield is the yield of a die based on the performance of immediately
neighboring die at wafer level tests. Other methods used in this project to determine
predicted yields are described later. Those die with high local yield
may be considered for reduced burn-in due to increased confidence that they will pass at
final test based on the yield of their neighbors. Burn-in may be completed at full duration
for those die with poor local yield. Reference die with low predicted yields may be
considered for scrapping before they are processed further and packaged.
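As a rough illustration of local yield, the sketch below computes the fraction of passing die among the eight positions surrounding a reference die. It is only a sketch under stated assumptions: the wafer map layout (a dictionary keyed by column and row) and the function name are hypothetical, and the exact neighbor definitions used in this thesis are given in Chapter III. The two treatments of positions beyond the wafer edge mirror the "Missing Edge Die = 0" and "Missing Edge Die Omitted" cases named in the List of Figures.

```python
from typing import Dict, Optional, Tuple

Die = Tuple[int, int]  # (column, row) position on a hypothetical wafer map


def local_yield(wafer_map: Dict[Die, bool], ref: Die,
                missing_as_fail: bool = True) -> Optional[float]:
    """Fraction of passing die among the 8 nearest neighbors of ref.

    Positions absent from wafer_map (for example, beyond the wafer edge)
    are either counted as failing (missing_as_fail=True) or omitted from
    the average (missing_as_fail=False).
    """
    x, y = ref
    passed = 0
    counted = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # the reference die is not its own neighbor
            neighbor = (x + dx, y + dy)
            if neighbor in wafer_map:
                passed += int(wafer_map[neighbor])
                counted += 1
            elif missing_as_fail:
                counted += 1  # missing edge die counted as a failure
    return passed / counted if counted else None


# Hypothetical 3x3 map: every die passes except the corner die at (0, 0).
example_map = {(x, y): True for x in range(3) for y in range(3)}
example_map[(0, 0)] = False

print(local_yield(example_map, (1, 1)))
print(local_yield(example_map, (0, 0)))
print(local_yield(example_map, (0, 0), missing_as_fail=False))
```

The choice of edge treatment matters most for die near the wafer edge, where counting missing positions as failures lowers the predicted yield even when every real neighbor passes.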
Outline of Chapters
Chapter II presents more information about testing semiconductors and the
importance of burn-in reduction. Chapter III describes the method used in this thesis to
determine whether burn-in reduction is possible utilizing neighborhood analysis.
Chapter IV shows the results of carrying out the method described in Chapter III. Chapter
V concludes and discusses whether predicting final test yield based on neighborhood
analysis of a reference die at wafer level test helps in understanding whether burn-in can
be reduced.
CHAPTER II
TESTING MICROPROCESSORS
Device Fabrication
Semiconductor devices are built as integrated circuits, or ICs. ICs are
microelectronic circuits that are incorporated into a chip of semiconductor material,
which is usually silicon. These circuits are composed of many interconnected transistors
and other components that operate as a whole system [12]. The fabrication of an IC
involves several photolithographic printing steps, etching, and doping. A comprehensive
introduction to device fabrication can be found in Microchip Manufacturing, by S. Wolf
[2].
Defect Mechanisms and Types
IC structures are prone to a variety of nonideal physical characteristics that are not
always under the manufacturer's control. These defect mechanisms can be organized
by the origin of the defect: wafer defects, human errors, equipment
failure, environmental impact, and process instabilities [5].
Wafer defects occur within the bulk material that forms the basis for
semiconductor devices, which is usually silicon. Silicon can become contaminated
easily and can form micro-cracks. These problems can cause a shift in parametric
properties, which can affect the performance of the IC or of the elements located in
the affected area [5].
Human errors include pollution in the air (e.g., skin cells), scratches on the wafer
due to careless handling, and process steps that have been forgotten or repeated because
a step was not noted the first time it was taken. Every possible effort can be made to
automate any steps that can be computer controlled or otherwise handled without human
intervention [5]. This can eliminate or cut down on human mistakes, which cannot be
prevented otherwise.
Equipment failure includes air contamination of the equipment and equipment
that has not been tuned properly. Un-tuned equipment can lead to mishandling of
wafers, which can lead to mass failures. The solution to this problem is strict
maintenance planning, where the equipment is routinely checked and calibrated [5].
Environmental impact includes the air inside the fabrication facility. If the air
contains contaminants that are the same size as or bigger than the minimum feature size of
the current technology being processed, and any of these particles land on the wafer
surface, this can lead to functional problems [5]. This is why clean rooms in fabrication
facilities are kept very clean and monitored closely. The degree of cleanliness is
specified as how many particles are allowed to exist per unit volume of clean room air. A
Class-X clean room contains no more than X total particles greater than 0.5 µm in size per
cubic foot of air. So, a Class-1 clean room is allowed to have one particle greater than
0.5 µm in size per cubic foot of air [2].
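The class definition reduces to a simple density check, sketched below. The function name and the sample counts are hypothetical, and the limit is read as "at most X particles per cubic foot", which is the reading consistent with the Class-1 example in the text.

```python
def meets_clean_room_class(particle_count: int,
                           volume_cubic_feet: float,
                           class_x: float) -> bool:
    """True if the measured density of particles larger than 0.5 um does
    not exceed class_x particles per cubic foot of sampled air."""
    density = particle_count / volume_cubic_feet
    return density <= class_x


# Hypothetical air sample: 10 particles counted in 10 cubic feet of air.
print(meets_clean_room_class(10, 10.0, class_x=1))
```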
Clean room protocols are designed to decrease the number of particle defects on a
wafer. These include gowning of human workers with a body suit, gloves, hair nets,
inner caps, hood, and booties to ensure as much cleanliness as possible. Other protocols
include refraining from sudden movement, running, or jumping, the absence of chairs,
which can collect particles, and the use of fiber-less paper. These protocols have been
developed to keep particles from moving through the air and to prevent them from
contaminating wafers [2].
Process instabilities are those defects that are caused by certain process steps.
Some of the steps applied are critical and susceptible to variations in process conditions,
which can lead to a variety of process instabilities. Process instabilities can include
problems with doping profiles, photolithographic printing, etching, wafer surface
inconsistencies, metal lines, and chemical vapor deposition (CVD).
Doping profiles using N and P regions are not readily distinguishable using
characterization tools. Doping serves several purposes, chiefly the formation of
the source and drain regions for the transistors. These regions serve as the paths of
current flow in the silicon through the metal interconnect lines and the channels of the
transistors [2]. There are two types of dopants, N and P types. Current flow is possible
through doping due to the nature of the materials. A full account of how N and P type
dopants are used in an IC can be found in Microchip Manufacturing, by S. Wolf [2].
After the process steps that introduce dopants into the silicon, there is no
way to determine exactly where the dopant profiles exist. Without knowing where the
profile boundaries exist, the process engineer cannot guarantee an exact process.
However, doping profiles do have an effect on device characteristics. Doping errors can
cause large DC offsets, deformations, and other effects that lead to performance
problems [1].
The photolithographic printing process is another fragile process and is always
subject to imperfections. It can cause failures or inconsistencies in performance from one
IC to the next of the same process. ICs with extremely small features are very sensitive
to any changes or inconsistencies in the process [5]. A misaligned photolithography
mask can cause partial defects and high contact resistance. This can lead to minor
DC offset problems and catastrophic distortion problems. A misaligned mask can also
lead to completely defective vias, which can create a completely open circuit [1].
Fabrication problems can arise during the etching processes of a device. Etching
usually occurs after photolithography in order to create a permanent pattern on the wafer,
creating an exact transfer of the image present in the resist layer. If a wafer is
under-etched, it can create defective metal contacts and vias. It can also lead to
catastrophic shorts between circuit nodes [1].
Surface defects can arise from mishandling of wafers due to human or
equipment error, or from particles in the air. Particles can land on the surface of the
wafer or on a photolithography mask. This can lead to short circuits between nodes or other
catastrophic consequences. Surface defects also include scratches, broken bond wires,
and surface explosions caused by electrostatic discharge (ESD) in a mishandled device
[1]. Surface defects can be avoided by taking special precautions during the fabrication
of devices.
Metal lines can cause process issues. The process necessary to form metal lines
can create imperfect, not rounded lines. These imperfect lines can lead to parasitic
capacitance between traces and surrounding elements, catastrophic shorts, and
catastrophic opens. As the technologies of today become increasingly scaled, this
problem can become more apparent. Imperfections in metal and performance
sensitivities only become more exaggerated as device geometries shrink in size [1].
Another important process to consider is chemical vapor deposition, or CVD.
During a CVD process, a material is locally deposited on a wafer to complete a particular
step. Locally deposited materials and variations in layer thicknesses depend mostly on
the constant flow of the gas used. If obstacles are present in the gas flow or around the
injection point, a turbulent flow can occur. This can result in major or minor thickness
variations. Variations in thickness, no matter the degree, can result in parametric
differences across the wafer [5]. It is important to keep properties constant across wafers
within a single lot and across several lots.
There are two classes of defect mechanisms worth noting: global and local
defects. Global defects are those defect types that affect a large area, or
possibly an entire wafer. The wafer is influenced in the same way in the affected area.
Local defects affect a smaller portion of the wafer. It is important to note, however, that
there are other regional classifications when referring to wafers. Zonal areas are
subsections of a global area, or smaller portions of the global area. A neighborhood is an
area smaller than a zone. Finally, a local area is smaller than a neighborhood.
An example of a global defect is mask misalignment. A misaligned mask will be
misaligned everywhere on the wafer, and can cause different parametric behaviors.
Another example is line registration errors, where due to changes in etching times lines
are too wide or too narrow. This will also affect the entire wafer, as etching profiles are
not achieved locally, but for the entire wafer. Also, different implantation levels are
considered global defects. Different levels of implantation can cause a shift in transistor
parameters between wafers in a lot, and between lots within a revision of silicon [5].
Local defects include those defects that impact a smaller area on a wafer. In a
photolithographic process, dust particles or pollution by chemicals on a wafer or mask
can affect a smaller portion of a wafer and cause local fallout. Scratches and cracks due
to human error or mishandling by robotic machinery due to un-tuned equipment can also
be classified as local defects as they do not affect the entire wafer's performance [5].
There are two types of faults to consider when looking at either global or local
defects caused by any of the above-mentioned possible causes of defects. These faults
are parametric and functional [5]. Parametric faults are determined by in-line parametric
tests performed after the first level of metallization, and again at final parametric test.
These tests are in place to eliminate further processing of wafers that fail in-line
parametric tests after metal layers are deposited. Scrapping of failing wafers at this point
is justified because the chance of finding passing parts from wafers that fail these tests
is small or nonexistent. Parametric data can also provide warnings about fabrication
problems. The data obtained can indicate how the wafers were processed and where
problems may exist within the process flow. Correlations can be developed between the
different parameters and the final product specifications. This can aid in optimization of
fabrication conditions [2].
A parametric fault could be reduced threshold voltage or increased resistances of
connections. Parametric faults can be caused by either global or local defects. Most
global defects result in parametric faults. Specific parametric tests are necessary because
the kind of faults that cause parametric failures may cause a device to fail only
performance related specifications that are not always measured [5].
Functional faults are catastrophic failures in the behavior of the device. These
faults can vary from a logic failure of an output under a specified limited input condition
to completely incorrect operation of the device regardless of the signals applied. For
example, an output connected through a short to a supply line does not change regardless
of the input applied. This would indicate complete failure of the device [5].
The Need for Test
The problems that arise within the processing of semiconductor devices can lead to
several types of defects. Any of the above-mentioned problems can occur, as well as
several other process complications, at any step in the fabrication of devices. The best
way to monitor and alleviate process problems is to carefully test each product and make
sure it meets product specifications. These defects must be detected through test before
the devices are considered of good quality and can be shipped to the customer. Test points
are in place while the devices are in wafer form and continue until they are shippable
packaged die. The final test points are in place to ensure nothing has changed after the
devices have been packaged.
There exists a long feedback loop for semiconductor devices. The entire
processing required from silicon wafers to packaged integrated circuits takes about
four to six weeks. Problems that occur early in the process could be detected an entire
month later [5]. Each IC involved in a problematic lot could be affected by the same
circumstance, and all could be lost. Early fault detection is necessary in order to alleviate
this sort of situation. Problems detected early can be solved and used to guide process
control decisions. If process problems are detected early enough, failing devices can be
scrapped, thus saving the time and money of processing these die any further. Also,
identifying those units that are good and that may be eligible for reduced burn-in at final
test may help in planning for burn-in oven capacity at a later time. A steady process is
important, and any variations can be observed through testing the product and reporting
any yield effects. Through careful monitoring and yield observation, the process and test
flows can be maintained properly and fine-tuned.
Testing
This section describes the testing flow used on the material in this thesis, from the
first test point to the last. There are two main categories of test: tests performed while
the ICs are in wafer form and tests performed on packaged devices. Two tests known as
laser and multiprobe occur while the devices remain in wafer form. At final test, the
packaged units undergo pre burn-in, burn-in, and post burn-in. Post burn-in includes two
tests at different temperatures and a quality assurance test [7]. The test flow for an IC is
shown in Figure 2.1.
At laser test, only a few layers of metal are present on the wafer. Die are tested
with SRAM tests, continuity tests, and shorts tests and are binned accordingly. SRAM is
static random access memory, and includes a large number of bits that can either function
properly or fail. Good die are labeled for continued processing, as are repairable die.
Repairable die include those die that contain failing bits of memory and can be repaired
by a procedure known as laser repair [7]. The test data of interest for this project is
taken before repair, so only those die that are good or repairable are considered.
[Figure: test flow diagram; the packaged die-level tests are pre burn-in, burn-in, and
post burn-in.]
Figure 2.1. Test Flow for IC.
After the repairable die are mended, the wafers are completed with all layers of
metallization and only have to be sawed and packaged before they undergo final test. At
this point, all good die on the wafer, including those die that were able to be repaired, are
prepared for multiprobe. A wafer is placed in a holder under a microscope and aligned
for testing by a multiple-point probe, or multiprobe. The prober makes contact with the
die by way of various pads on the surface. The electrical properties of the device are then
observed by a series of tests. The tests are done automatically and take anywhere
from milliseconds to several seconds depending on the size and complexity of the
device. The results taken from these tests are then compared with information stored in
the computer that is based on the specification of the device. The computer can then
"remember" whether the chip passes or fails each particular test. A failing device is
a unit that falls outside the specifications stored in the computer. The probe steps
to the next device and completes testing in the same manner for the entire wafer. The
wafer is then removed from the multiprobe tester and the individual devices are sawn
apart along the scribe lines of the wafer, separating the units. Each passing die is then
picked up and prepared for packaging, while the failing die are scrapped, or discarded.
Information gained from testing each die is stored for further analysis of
failures and passing devices. This stored information can be used for failure analysis and
possible process changes as needed [13].
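The comparison of measured results against stored specification limits can be sketched as below. This is only an illustration: the test names, the limit values, and the two-bin outcome are hypothetical and do not come from the actual test program used on this material.

```python
from typing import Dict, List, Tuple

# Hypothetical specification: test name -> (lower limit, upper limit).
Limits = Dict[str, Tuple[float, float]]


def failing_tests(measurements: Dict[str, float], limits: Limits) -> List[str]:
    """Names of the tests whose measured value falls outside its limits."""
    return [name for name, (low, high) in limits.items()
            if not low <= measurements[name] <= high]


def bin_die(measurements: Dict[str, float], limits: Limits) -> str:
    """Record 'pass' only if every measurement is within specification."""
    return "fail" if failing_tests(measurements, limits) else "pass"


spec = {
    "supply_current_mA": (0.0, 50.0),
    "ring_osc_freq_MHz": (180.0, 220.0),
}
die_a = {"supply_current_mA": 32.5, "ring_osc_freq_MHz": 201.0}
die_b = {"supply_current_mA": 61.0, "ring_osc_freq_MHz": 201.0}

print(bin_die(die_a, spec), failing_tests(die_a, spec))
print(bin_die(die_b, spec), failing_tests(die_b, spec))
```

Keeping the list of failing tests, rather than only the pass or fail outcome, corresponds to the stored per-die information that the text says is later used for failure analysis.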
After multiprobe, the devices are packaged and prepared for final test according
to the specification for the type of application. Final test log points are in place to
guarantee that the performance of the device did not shift during the packaging process
[1]. Final test includes pre burn-in, burn-in, and post burn-in [7].
During pre burn-in, all units are tested for continuity and functionality. This is
the first time the devices see a major temperature increase, as well as their first
exposure to certain test patterns. The devices are voltage-stressed as well to ensure
stability. Semiconductor devices are extremely sensitive to temperature and voltage
changes, and the process must be robust enough to withstand varied temperature
and voltage operation in the end application [7].
Good devices from pre burn-in are sent to burn-in. Burn-in is where a group of
devices is subjected to high temperatures and voltage stress for a predetermined amount
of time. The time is determined by an early failure rate (EFR) experiment done every
time a new design or process revision takes place. During an EFR, devices are loaded
into burn-in ovens and taken out intermittently and tested. The amount of time necessary
for failures to cease to occur is considered to be the burn-in time.
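One way to read the EFR rule above is sketched here: given the number of new failures found at each intermittent readout, the burn-in time is taken as the last readout at which any failure was found. This is an assumed interpretation for illustration, not the documented EFR procedure, and the readout hours and failure counts are hypothetical.

```python
from typing import List, Optional, Tuple

# Each readout: (cumulative burn-in hours, new failures found at that readout).
Readout = Tuple[float, int]


def burn_in_time(readouts: List[Readout]) -> Optional[float]:
    """Hours after which no further failures were observed.

    Returns None if there is no data, or if failures were still being
    found at the final readout (failures have not yet ceased).
    """
    ordered = sorted(readouts)
    if not ordered or ordered[-1][1] > 0:
        return None
    last_failure_hours = 0.0
    for hours, failures in ordered:
        if failures > 0:
            last_failure_hours = float(hours)
    return last_failure_hours


# Hypothetical EFR data: failures taper off and stop after the 8 hour readout.
efr = [(2, 5), (4, 3), (8, 1), (16, 0), (24, 0)]
print(burn_in_time(efr))
```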
A semiconductor device passes through three phases during its life: the initial
failure period, the random failure period, and the wear-out failure period, as shown in
Figure 2.2. Burn-in is intended to accelerate the initial failure period, or infant
mortality, so that early failures occur before shipment and failing devices are not sent
to the customer [7]. This is based on the fact that failures that occur after shipping are
often the result of processing defects that degrade and eventually fail due to temperature
or voltage stress, or a combination of the two. Burn-in is in place in an attempt to catch
these early failures before they are shipped. The stress put on a device accelerates any
defect so that it can be found during a post-burn-in test. This measure will reduce the
time to market, increase profit, and reduce customer returns [8].
After burn-in, there should be no more failing devices. This is based on the idea
that after the initial failure period, there should only be random failing devices where the
failure rate curve flattens out. Any failures at this point are considered reliability issues
and a concern for the process or design of the device, since the devices' infant mortality
period has already been simulated through burn-in.
[Figure: Bathtub Curve (Failure Rate Curve), plotting failure rate against time (t);
labeled regions include the random failure period and the wear-out failure period.]
Figure 2.2. Bathtub Curve.
After the random failure period, the wear-out failure period occurs. During this
time, semiconductor devices reach the end of their lifetime and eventually wear out. This
is due to continued use over time with varying temperature changes [7]. According to
recent reliability engineering research, however, failures due to wear-out can possibly be
eliminated over a product's anticipated lifetime. So the majority of failures occur during
early life, or within the infant mortality phase [3,4]. The lifetime of a semiconductor
device can be related to the lifetime of humans. There is an infant mortality rate, where
babies are more delicate and more susceptible to sudden death or disease. During one's
lifetime there is a lower chance of mortality due to more consistent health and less
fragility. At the end of one's lifetime, a human will grow old and once again be weaker
and more susceptible to disease and deterioration.
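The three periods of the bathtub curve can be illustrated numerically with Weibull hazard functions, a common textbook model that is not taken from this thesis. A Weibull shape below one gives a decreasing failure rate (infant mortality), a shape of one gives a constant rate (random failures), and a shape above one gives an increasing rate (wear-out). The shape and scale values below are arbitrary assumptions chosen only to produce the characteristic curve.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t: float) -> float:
    """Sum of three hazards, one per period (all parameters are assumed)."""
    infant = weibull_hazard(t, shape=0.5, scale=1_000.0)          # decreasing
    random_failures = weibull_hazard(t, shape=1.0, scale=50_000.0)  # constant
    wear_out = weibull_hazard(t, shape=5.0, scale=100_000.0)      # increasing
    return infant + random_failures + wear_out


# Failure rate early in life, in mid-life, and late in life (hours).
for hours in (10.0, 10_000.0, 200_000.0):
    print(hours, bathtub_hazard(hours))
```

In this picture, burn-in moves a device along the time axis past the steep early portion of the curve before it is shipped.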
After burn-in, the ICs are tested to make sure all the early failures have occurred
and the devices are truly in the random failure rate period. This is done at post burn-in,
at both high and low temperatures in order to investigate temperature stability. Any
failures here indicate a reliability issue, since the devices have already gone through a
predetermined amount of burn-in, which should mean they are out of the infant mortality
phase. At post burn-in, devices are also speed sorted, where units are grouped according
to their performance speed. This is done to ensure devices are sorted within the correct
speed range as specified by the customer. Then, a sample of material goes through a
quality assurance test under relaxed conditions. This is to make sure the product has
maintained quality throughout the test process and to assure the customer is receiving
quality parts in line with specification [7].
Components of Testing
Each flow of testing has certain components necessary in order to get products
tested efficiently while saving time and maximizing throughput. The components
of an ATE tester are the workstation, the mainframe, and the test head. These
components will be described briefly [1].
The test head contains the most sensitive electronics. It contains the circuits that
require the closest proximity to the device under test, or DUT. The test head contains the
device interface board, or DIB, which forms the electrical interface between the
automatic test equipment, ATE, and the DUT. The DIB is custom to the particular
device, and provides a temporary socketed electrical connection necessary in order to test
the part [1].
The workstation is the user interface to the tester. From here, the test engineer
can debug test programs using the software provided from the ATE vendor. The
workstation is very user friendly, and production personnel can control day to day
operation of the tester as it tests devices under mass production [1].
The last component of testing is the mainframe. The mainframe contains the
power supplies, measurement instruments, and one or more computers that control the
instruments as the test program is executed. The mainframe may contain a manipulator
to position the test head properly in order to calibrate its settings. It may also contain a
refrigeration unit which will provide cooled liquid in order to regulate the temperature of
the test head electronics [1].
Testing Global and Local Defects
Current test methods for ICs can be divided into two main categories, to detect
global or local defects. Several different structures are used in order to detect these types
of defect. These can include process control monitors (PCMs), parameter monitoring,
and ring oscillators to test global defects, and in-line parametric monitoring and gate oxide monitors to detect local defects.
Two methods to detect global defects utilize process control monitors, or PCMs,
and parameter monitoring. PCMs are specifically designed test modules that are used to
measure low-level parameters. They consist of very basic electrical structures such as
single transistors, single lines of conducting material, and chains of via contacts. These
structures can be placed on the chip or within the scribe lines, which are located between
the chips where they are sawed apart. The process quality can be checked at some stages
throughout the process by carrying out in-line measurements on these devices. These
devices can contain a set of patterns that are representative of the structures in the actual
product and can be used to emulate the performance of the circuit [5].
Parameter monitoring is used to translate low level parameters into higher level
behavior that can be monitored throughout the process. A performance related test of a
parameter monitor design can show the presence of some parametric values. Most global
defects cause a parametric failure that can be detected in this way. The ring oscillator is a
type of circuit that can be used for this type of monitoring [5]. A ring oscillator consists
of an odd number of inverters placed in series whose output is connected back to
the input. A simple three-stage ring oscillator is shown in Figure 2.3. The resulting
oscillation frequency from the ring oscillator depends on the parametric characteristics of
all the components in the circuit, which makes it a good candidate for emulating circuit
performance [5].
Figure 2.3. Three-Ring Oscillator.
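As a rough illustration of why the oscillation frequency tracks process parameters, a first-order textbook model can be used. This model is an assumption added here for clarity and is not taken from [5]. With N inverter stages and an average per-stage propagation delay t_p (both symbols are introduced only for this sketch), a transition must travel around the ring twice per period, so:

```latex
f_{osc} \approx \frac{1}{2 \, N \, t_p}
```

Under this model, for the three-stage case (N = 3) a global process shift that raises t_p by 10% lowers the oscillation frequency by roughly 9%, which is the kind of parametric movement a ring oscillator monitor is meant to expose.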
Local defects can be measured with a variety of methods. Local defects can
include particles on the top of wafers that lead to problems with processing and electrical
behavior of a product. Local defects can also be scratches or other inconsistencies which
affect a portion of a wafer. Defects in the gate oxide and interconnect layers form the
vast majority of all defects. These defects can be assessed by use of in-line and gate oxide monitoring [5].
In-line monitoring serves as one of the most important techniques in obtaining
high production yield and good product quality. During various stages of the process,
after known sensitive and critical process steps are taken, the outcome is inspected. Two
techniques used for this inspection are surfscan and image evaluation. Surfscan is a
technique in which a bundle of light is applied and the reflections are evaluated in order
to determine the number of particles on the surface of a wafer. This technique is mainly
used for inspection after layer deposition equipment is used. While surfscan is ideal for
unpatterned wafers, image evaluation uses manual or automated image inspection
systems in order to check the occurrence of local defects on patterned wafers. By
applying this method at critical points in the process, the current status of part of the
processing line can be monitored effectively. For example, this technique would be
useful after polysilicon crystalline layers are deposited and after deposition of each metal
layer [5].
Gate oxide monitors are used in order to investigate the gate oxide integrity and
performance. The formation of the gate oxide layer is a very critical and sensitive step,
and is susceptible to contamination and process disturbances. The thickness of this layer
is the smallest dimension in the complete process for CMOS technologies. If shorts exist
between the channel and the gate of the transistor through the gate oxide, parametric and
functional faults may result. Simple test structures which form the gate oxide monitors
can be used to detect contamination problems of this sort [5].
Test Considerations
In order to fabricate, package, test, and ship parts, and collect revenue from the
whole process, costs must be observed in each area. One way to cut down on costs and
increase profitability is to observe test economics. Profitability can be defined as the
difference between the revenues generated by a company's products and the costs
associated with developing, manufacturing, and selling them. One way to increase
profitability is to decrease the time to market. A delay in the time to market will usually
result in a substantially lower profit margin over the product's shortened life span. A
delay may also result in lost business if a competitor's solution is designed into the
customer's system [1].
Another way to monitor costs is to develop an effective yield strategy. Yield is
defined as the ratio of good chips per wafer to the total number of chips per wafer. A
good chip is one that has passed all the parametric and functional tests that are specified
for a product [2].
Test reduction is a very important area to consider when looking to reduce cost.
Testing of devices cannot be eliminated completely as quality must be assessed before
shipment to the customer. However, it can be noted that once a device stabilizes into
production, test reduction methods can be experimented with and possibly used to reduce
test. It would be desirable to implement reliable and efficient testing, while utilizing a
method that reduces or eliminates certain aspects of testing. The next chapter will
explore the possibility of reducing burn-in, a very expensive part of final test.
CHAPTER III
DEVELOPMENT OF METHODS FOR NEIGHBORHOOD ANALYSIS
Burn-In Reduction
Test is a necessary measure in order to reduce customer returns, increase revenue,
and to ensure that products shipped to the customers meet specification. Reduction of
test is desired, however, in order to maximize profit by decreasing cost and time to
market. Burn-in is an effective tool in screening out devices that have low reliability. It
is a prime candidate for reduction or elimination, however, because it is expensive
due to the high cost of equipment and labor required to achieve results. Time is another
factor, because time to market is delayed when all devices require burn-in. Burn-in is
also a destructive test, where failing devices represent revenues that have been lost [8].
One approach currently used in the semiconductor industry to reduce bum-in is to
use an early failure rate, or EFR. When a new device is ramping to be released into
production, has a design revision, or has a fabrication process revision, a sample goes
through an EFR. This involves defining a sample of devices, loading them into burn-in
and burning them in for a maximum amount of time at a specified voltage. The
maximum time and voltage settings are determined by the specification of the device.
During the maximum amount of time, the devices are unloaded and tested at high and
low temperatures intermittently throughout the burn-in process, possibly three or four
times. During the high and low temperature testing, there will probably be some failing
devices. Semiconductor ICs are sensitive to the high temperatures and increased voltage
settings of the burn-in ovens, and should reach infant mortality quicker using this
method. The devices will reach a point where they are in the random failure phase and
failures should decrease and level out. The time it takes to reach this point is recorded
and used for burn-in once the devices reach production. Products sold to the customer
should be in the random failure phase at this point, so there should be no additional infant
mortalities. EFR studies are used to determine burn-in for ramping devices, and devices
that are in production. This can be a good choice for burn-in reduction as products
mature, due to the resolution of design and process marginalities and problems. Also, as
test program changes are made and devices are debugged, products tend to become better
quality and have higher yield, thus decreasing infant mortality. As EFR studies continue,
burn-in time may see a decrease [7].
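The idea of recording the time at which failures level out can be sketched in code. The following is only a minimal illustration under stated assumptions: the function name, the rate threshold, and the readout numbers are all hypothetical, and the source does not say how the leveling-out point is actually judged in practice.

```python
def settled_burn_in_hours(readout_hours, cumulative_failures, sample_size, max_rate):
    """Return the readout time after which the interval failure rate stays at or
    below max_rate (failures per surviving device per hour), or None if the
    rate never settles. All inputs are hypothetical EFR readout data."""
    rates = []
    prev_t, prev_f = 0.0, 0
    for t, f in zip(readout_hours, cumulative_failures):
        survivors = sample_size - prev_f          # devices still under stress
        rates.append((f - prev_f) / (survivors * (t - prev_t)))
        prev_t, prev_f = t, f

    # Scan from the last interval backwards to find where the quiet tail starts.
    start = None
    for i in range(len(rates) - 1, -1, -1):
        if rates[i] > max_rate:
            break
        start = i
    if start is None:
        return None
    # Interval i begins at the previous readout (time 0 for the first interval).
    return readout_hours[start - 1] if start > 0 else 0.0


# Illustrative numbers only, not data from this study.
hours = [6, 12, 24, 48]
failures = [20, 26, 27, 27]
print(settled_burn_in_hours(hours, failures, sample_size=1000, max_rate=1e-4))
```

With these made-up readouts the failure rate drops below the threshold from the 12 hour readout onward, so 12 hours would be the candidate burn-in time. The sketch assumes some devices survive every interval; a real study would also need a statistical confidence criterion.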
Industry Evaluations and Methods for Burn-In Reduction
Other methods have been implemented in order to reduce or eliminate burn-in. In
1999, Intel investigated the results of multiple correlations between reliability and yield
on a die level basis. This work utilized a microprocessor with 0.25 µm technology and a
five metal layer CMOS logic process. A one million unit sample size was used. It was
found that reliability defect density was proportional to yield defect density [10].
In 2001, Intel explored the optimal methods of measuring sort defect density at
the die level. This study used 80 million units with 0.18 µm technology and a 6-layer
CMOS logic process. Using unit level predicted yield (ULPY), the company found a
strong correlation to burn-in failures [9].
A master's thesis written by K. Black looked at die sorting in conjunction with
parametrics in order to achieve lower burn-in. The parametrics used were I-Drive, gate
oxide integrity (GOI), and IDDQ characteristics. If the gate width-to-length ratio
decreases, the I-drive measurement decreases as does the speed of the device. This
indicates an unstable fabrication process. Low GOI measurements indicate high oxide
breakdown time when a high voltage is applied through the gate oxide. This indicates
wafer defects such as oxide impurities. Finally, high IDDQ generally means a short exists
in the interior of the device. These measurements are taken by applying a voltage and
measuring the subsequent current. Since parametrics can be used to discover unstable
devices, they can be used in conjunction with die level sorting based on neighboring die
to reduce burn-in. A die level sorting algorithm was developed for such devices [6].
Neighborhood Analysis Methods
The application of die level sorting utilizing neighboring die from the wafer level
is desirable on wafers with large die. A microprocessor with a larger area than studied
previously was used in this study to investigate the possibility of die level sorting based
on the yield of neighboring die.
Die level sorting is where only certain die are scrapped and good die are
processed based on their functionality at wafer level tests. This is in contrast to wafer
level sorting, where poor yielding wafers are scrapped due to a majority of zonal failures.
Wafers are divided into a set number of zones and scrapping is based on the yield of
these zones. If a large number of zones have poor yield, the entire wafer is scrapped.
High yielding wafers are tested and burned-in. Scrapping entire wafers based on zones
can be costly since good die on those wafers are thrown away as well. Die level
sorting can save money when good die are processed with confidence that they will retain
reliability, even if there are zones on the wafer they came from which had poor yield. It
may be possible to determine final test yield on a larger device based on neighboring die
yield at an earlier test point, while the die are still in wafer form. It may also be feasible
to downgrade devices based on neighboring die yield and require more burn-in than
higher yielding die.
By tracking data throughout the fabrication and testing process, possible trends
can be found in order to support die level sorting. Local yield can be investigated on a
wafer and has been shown to predict passing and failing regions on a wafer [8]. By
observing local yield on a wafer, individual die can be considered for scrapping,
continued processing, as well as reduced burn-in. It would be desirable to explore a
method that will support a scrapping specification and/or burn-in reduction based on
confidence that such die would fail or pass final test.
The purpose of this project was to be able to predict final test fallout of transistor
related failures by understanding if any correlation exists between neighboring die and
reference die at laser and multiprobe. It was also necessary to observe all failures
rather than only transistor related failures. Empirical data was used to determine if
continued processing of potentially good die and scrapping of poor die at the wafer level was
in order. Also, die with a high predicted yield were considered for possible decreased
burn-in.
Several steps were taken in order to determine if these goals were possible. To
drive this experiment, a large, seven metal layer microprocessor device was used.
Twenty lots comprised of twenty-four wafers each were used. All failures from final test
were isolated as well as the transistor related failures of interest. These transistor related
failures were L2Cache (level two cache), high Vdd SRAM (static random access
memory) failures, and high Vdd functional failures. L2Cache failures are those devices
that have failing bits of memory caused by transistor fabrication issues. High Vdd
SRAM failures are memory failures that fall out due to high Vdd. A die that falls out
functionally due to high Vdd is binned out as a high Vdd functional failure. These
failures represent a design or process marginality of interest [7].
All types of failures from final test were included as another mode of data
collection. Including all types of failures from final test compared with all good units
helps to prevent bias from being introduced. This bias is due to the nature of the test
programs. In a test program, the shortest and least complicated tests are at the beginning,
and the longer ones are located at the end of the program. This is done because once a
part fails, the test program is exited for that part. The tester then moves on to the next
part, and starts over. This is done to expedite the testing process so that if a part fails
early on, it has only gone through short tests rather than longer ones. So, if a neighbor of
a transistor related failure falls out of the test program at multiprobe for instance, at the
beginning, there is no way to tell whether it would have passed or failed the transistor
related failure bins of interest. It would be counted as a failure, but it may have actually
passed the appropriate test if tested with it. Therefore, the data would be biased at that
point. For this project, both transistor related failures and all failures were observed to
understand their comparison.
After the failing reference die were located from final test, data was collected
from their neighboring die at laser and multiprobe. Data from neighboring die at laser
and multiprobe were also collected from all good units. Good units are those that pass all
the way through final test. Good unit data was collected in order to compare to the
failing unit data.
Neighboring die are those that in theory affect a reference die's yield based on
proximity to the reference die and/or their relationship with the reference die. Because of
their proximity or relationship to the reference die, neighbors should have the same type
of process step performed on them, and should be very similar in parametric and
electrical characteristics. The neighboring die of interest are the nearest eight
surrounding neighbors, the nearest twenty-four surrounding neighbors, the radial
neighbors, and the same X, Y coordinate on different wafers.
The nearest eight neighbors are those die that immediately surround the reference
die, forming a square on the wafer. The same immediate neighbors are used for the
twenty-four nearest neighbors, however the next die further out forming another box are
used as well. Radial neighbors are those that lie in the same circular distance from the
center of the wafer. A reference die's neighbors in this instance are about the same
distance from the center as the reference die, where the reference die is included in its
particular circle. For the X, Y neighbors, all die that are in the same X, Y coordinate on
all the remaining twenty-three wafers in the same lot are considered neighbors. The same
yield number is used for each X, Y coordinate for each reference die in this group. These
neighbor definitions can be seen in Figure 3.1. Unit level predicted yield (ULPY) was
also used to determine if a better correlation can be found, as derived by Intel. This
method combines XY yield and local yield in order to form a single die level predictor
that is possibly more powerful than either one used individually [9]. Different
combinations of yield methods were explored in order to find the most efficient predictor.
[Figure panels: Radial (die about the same distance from the center as the reference die are its neighbors); Nearest Neighbors (immediately surrounding die: 8 surrounding die, 24 surrounding die); XY Yield (24 wafers in the same lot; neighbors are the die on the 23 other wafers with the same X, Y coordinate).]
Figure 3.1. Neighbor Definitions.
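The nearest neighbor and X, Y neighbor definitions can be sketched as coordinate lookups. This is a minimal sketch, assuming die are addressed by integer (x, y) grid positions and wafers are numbered 1 through 24 within a lot; the function names are hypothetical and not part of the original work.

```python
def nearest_neighbors(x, y, rings=1):
    """Grid positions in the square of `rings` boxes around (x, y):
    8 positions for rings=1, 24 positions for rings=2."""
    return [(x + dx, y + dy)
            for dy in range(-rings, rings + 1)
            for dx in range(-rings, rings + 1)
            if (dx, dy) != (0, 0)]


def xy_neighbors(wafer, x, y, wafers_per_lot=24):
    """The same (x, y) position on every other wafer in the lot."""
    return [(w, x, y) for w in range(1, wafers_per_lot + 1) if w != wafer]


print(len(nearest_neighbors(5, 5)))      # eight immediate neighbors
print(len(nearest_neighbors(5, 5, 2)))   # twenty-four neighbors
print(len(xy_neighbors(3, 5, 5)))        # twenty-three other wafers
```

Positions returned for a die near the wafer edge may not correspond to real die; how such missing die are treated is a separate choice discussed with the edge die consideration.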
In order to determine whether the good units had a better chance of passing at
final test compared with the failing units, a yield number needed to be derived based on
neighboring die for each reference die. Once yield numbers were derived for each
reference die, bad and good, based on all the neighboring methods, these numbers could
be compared with each other. The goal was to find a promising trend that can support a
new pass/scrap specification, and/or burn-in reduction for those die that have high
predicted yields. A promising trend would be one where good die have higher yield
numbers than failing die from final test. The methodology is outlined in Figure 3.2.
[Flowchart: 20 lots (~60,000 units) are split into good units from final test and failures from final test; each unit is assigned a predicted yield number (0-100) based on neighboring die yield at laser and multiprobe (MP); failing units are then compared with good units.]
Figure 3.2. Methodology for Neighborhood Analysis.
Deriving Yield Numbers
In order to make the tracking of data possible, there is a unique die identification
number associated with each device. This die ID is located on-chip within each die. This
number contains the die lot number, the wafer number, and its X, Y coordinates. The die
ID number is located within on-chip fuses that are blown at multiprobe with the unique
number. This makes wafer mapping possible, and is a good way to track failures and
good units with their bins throughout the fabrication and testing processes. This also
enables tracking of all die information giving the ability to predict yield based on
neighbors [7].
Using each device's unique die ID, data was collected from laser and multiprobe
on all neighboring die for failing devices and good units from final test. Yield numbers
were derived for each die in the project. Different methods were used to derive yield
numbers for each neighboring definition, including nearest eight and twenty-four
neighbors, radial neighbors, XY yield, and ULPY.
To find yield numbers for the nearest eight neighbors, the distance from the
reference die needed to be included in the derivation. The closer the neighbor is, the
more effect it should have on the reference die. The further away it is, the less likely it
will have as strong an effect as a die that is closer. Distance from the reference die was
calculated using measured distance for X and Y directions. Figure 3.3 indicates X and Y
direction measurements.
The Pythagorean Theorem was used to find the distance from the reference to
diagonal die using measurements from X and Y distances. Figure 3.4 shows the
orientation of neighboring die to the reference die, including references to neighbors in
X, Y, and Z directions.
[Figure: a single die with its height (Y) and width (X) dimensions marked.]
Figure 3.3. Height and Width Measurements for a Die.
Figure 3.4. Orientation of Neighboring Die to Reference Die.
Equation 3.1 shows how the Pythagorean Theorem was used to calculate distance
Z, where X and Y were the respective measured distances. The distance from the
reference die to neighboring die in the X and Y directions is simply the width and height
of the reference die itself, respectively. This is due to the fact that the measurement from
the center of the reference die to the center of the neighboring die is the same as the width
and height of the reference die.
\[ \text{Distance } Z = \sqrt{X^2 + Y^2} \tag{3.1} \]
Now that distances have been identified in all three directions from the reference
die to its nearest eight neighbors, an equation needed to be found and used in order to
calculate predicted yield numbers. This equation was used to find the weighted yield of
each neighbor. Using the yield of all eight neighbors and including the distance from the
reference die, an appropriate weighted yield number could be calculated for each
neighbor. Equation 3.2 shows how weighted yield numbers were created.
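\[ \text{weight}_n = \frac{\dfrac{\text{yield}_n}{\sqrt{X_n^2 + Y_n^2}}}{\displaystyle\sum_{i=1}^{8} \dfrac{1}{\sqrt{X_i^2 + Y_i^2}}} \times 100 \tag{3.2} \]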
In Equation 3.2, n represents the individual neighbors, numbered 1 through 8.
The yield in the numerator is the yield of the neighboring die from laser or multiprobe. If
the neighbor was good at laser or multiprobe, it would receive a value of 1 for yield. If it
was bad, it would receive a value of 0. This yield number is divided by the distance that
neighbor is from the reference die, thus accounting for its impact on the reference die due
to the proximity to it. The distance is presented using the Pythagorean Theorem, which
will calculate the distance for each neighbor from the reference die, no matter which
neighbor it is. This number is now divided by the sum of all eight neighbors when they
are good (yields = 1) divided by their respective distances from the reference die. This
denominator represents the total yield possible. In other words, the denominator
represents perfect yield for all neighbors, so yields of 1 are divided by all eight distances
and summed. When each neighbor is divided by it one at a time, it will give the portion
of the whole that a particular neighbor contributes toward the total predicted yield for the
reference die. Finally, this weighted quantity is multiplied by 100, in order to have a
predicted yield percentage between 0 and 100. The weighted yields from each neighbor
are added together in order to get the predicted yield for the reference die. This means
each reference has a predicted yield value that is the summation of two die in the X
direction, two die in the Y direction, and four die in the Z direction. When all eight
neighbors are good, the total weighted yield will be 100. This is seen in Equation 3.3. In
theory, a die with a predicted yield of 100 has a 100% probability of passing.
\[ \text{Predicted Yield Number} = 2(X) + 2(Y) + 4(Z) = 100 \tag{3.3} \]
In order to better understand the predicted yield derivation for nearest neighbors,
an example using normalized quantities is used. For this purpose, the distance from the
reference die to the X neighbors is 1, and the distance to the Y neighbors is 1. Using the
Pythagorean Theorem, the distance to neighboring die in the Z direction is √2, as shown
in Equation 3.4. Orientation and values for neighboring die for this normalized example
are shown in Figure 3.5.
\[ \text{Distance to neighboring die in Z direction} = \sqrt{1^2 + 1^2} = \sqrt{2} \tag{3.4} \]
[Figure annotation: distance in Z direction is √2.]
Figure 3.5. Orientation and Values for Normalized Example.
The normalized values for distances for the eight neighbors are used in Equation
3.2 in order to find the weighted yield contribution each neighbor gives to the reference
die. The equations used to find the weighted yield values for die in the X, Y, and Z
directions are shown in Equations 3.5, 3.6, and 3.7, respectively.
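\[ \text{X Direction} = \frac{\dfrac{1}{1}}{\dfrac{4}{\sqrt{2}} + \dfrac{2}{1} + \dfrac{2}{1}} \times 100 = \frac{100}{6.828} = 14.645 \tag{3.5} \]
\[ \text{Y Direction} = \frac{\dfrac{1}{1}}{\dfrac{4}{\sqrt{2}} + \dfrac{2}{1} + \dfrac{2}{1}} \times 100 = \frac{100}{6.828} = 14.645 \tag{3.6} \]
\[ \text{Z Direction} = \frac{\dfrac{1}{\sqrt{2}}}{\dfrac{4}{\sqrt{2}} + \dfrac{2}{1} + \dfrac{2}{1}} \times 100 = \frac{70.711}{6.828} = 10.355 \tag{3.7} \]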
The weighted numbers are added to the total predicted yield value for a reference
die when a neighbor is good. If a neighbor is bad, it will contribute a value of 0. The
weighted yield values for each neighbor are added up so that there are two neighbor
values in the X direction, two neighbor values in the Y direction, and four neighbor
values in the Z direction, as seen in Equation 3.8. If all eight neighbors are good in this
case, the predicted yield will be 100, as seen in Equation 3.9.
\[ X + X + Y + Y + Z + Z + Z + Z = \text{Weighted Yield Value} \tag{3.8} \]
\[ 14.645 + 14.645 + 14.645 + 14.645 + 10.355 + 10.355 + 10.355 + 10.355 = 100 \tag{3.9} \]
The weighted numbers derived for each neighbor are used throughout the rest of
the 8 nearest neighbor analysis, rather than deriving them individually every time. For
example, if neighbor X is good, it will contribute a value of 14.645 to the summation for
the predicted yield for the reference die. If neighbor X is bad, it will be summed as a 0 to
the predicted yield. This method is used for analysis involving laser and multiprobe
neighbors.
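The weighted yield calculation for the nearest eight neighbors can be checked numerically. The sketch below follows Equation 3.2 as described in the text, using the normalized example (die width and height of 1); the function names and the offset-keyed dictionaries are illustrative assumptions rather than the tooling used in the study.

```python
import math


def neighbor_weights(width=1.0, height=1.0):
    """Weight (0-100 scale) each of the eight neighbors contributes when it is
    good: inverse distance divided by the sum of all eight inverse distances."""
    offsets = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dx, dy) != (0, 0)]
    inverse = {o: 1.0 / math.hypot(o[0] * width, o[1] * height) for o in offsets}
    total = sum(inverse.values())            # 6.828 in the normalized example
    return {o: 100.0 * v / total for o, v in inverse.items()}


def predicted_yield(good, width=1.0, height=1.0):
    """`good` maps each neighbor offset to 1 (passed) or 0 (failed)."""
    weights = neighbor_weights(width, height)
    return sum(weights[o] * good[o] for o in weights)


weights = neighbor_weights()
print(round(weights[(1, 0)], 3), round(weights[(1, 1)], 3))

all_good = {o: 1 for o in weights}
one_diagonal_bad = dict(all_good)
one_diagonal_bad[(1, 1)] = 0
print(round(predicted_yield(all_good), 3))
print(round(predicted_yield(one_diagonal_bad), 3))
```

In the normalized case an X or Y neighbor carries a weight of about 14.645 and a diagonal (Z) neighbor about 10.355, so a reference die with a single bad diagonal neighbor receives a predicted yield of about 89.645.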
Determining the yield numbers for the twenty-four surrounding neighbors was
done in a similar way, except that the distance for all twenty-four immediately
surrounding neighbors was calculated as well. The Pythagorean Theorem was used for
this purpose as well. Figure 3.6 depicts a grid in which a reference die has twenty-four
neighbors.
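[Figure: five-by-five grid centered on the reference die (Ref); the inner ring of neighbors is labeled X, Y, and Z, and the outer ring is labeled A, B, C, D, and E according to distance from the reference die.]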
Figure 3.6. Reference Die with Twenty-Four Nearest Neighbors.
The distances derived for the nearest eight neighbors could be reused for the inner
ring of the twenty-four neighbors. For the outer neighbors at distances A, B,
C, D, and E, Equations 3.10 through 3.14 could be used. The Pythagorean Theorem was
used for distances C, D, and E.
\[ A = 2X \tag{3.10} \]
\[ B = 2Y \tag{3.11} \]
\[ C = \sqrt{A^2 + B^2} \tag{3.12} \]
\[ D = \sqrt{Y^2 + A^2} \tag{3.13} \]
\[ E = \sqrt{B^2 + X^2} \tag{3.14} \]
Equation 3.2 used for the calculation of the nearest eight neighbors is used again
in order to find the weight of the nearest twenty-four neighbors. Once again, if all
neighbors are good, the predicted yield would add up to 100, as shown in Equation 3.15.
\[ 2X + 2Y + 4Z + 2A + 2B + 4C + 4D + 4E = 100 \tag{3.15} \]
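The same weighting extends to the twenty-four neighbor case by including the outer ring at distances A through E. The sketch below generalizes the inverse-distance weights to any number of rings; the function name `ring_weights` and the normalized die size are assumptions made for illustration.

```python
import math


def ring_weights(rings, width=1.0, height=1.0):
    """Equation 3.2 style weights (0-100 scale) for every die position within
    `rings` boxes of the reference die: 8 positions for rings=1, 24 for rings=2."""
    offsets = [(dx, dy)
               for dy in range(-rings, rings + 1)
               for dx in range(-rings, rings + 1)
               if (dx, dy) != (0, 0)]
    inverse = {o: 1.0 / math.hypot(o[0] * width, o[1] * height) for o in offsets}
    total = sum(inverse.values())
    return {o: 100.0 * v / total for o, v in inverse.items()}


weights24 = ring_weights(2)
print(len(weights24), round(sum(weights24.values()), 6))
```

Here offset (2, 0) corresponds to distance A, (0, 2) to B, (2, 2) to C, (2, 1) to D, and (1, 2) to E. Because the weights are normalized by the sum of all twenty-four inverse distances, they add up to 100 when every neighbor is good, matching Equation 3.15.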
For the next neighbor definition, XY yield, a simple arithmetic mean is used.
Each XY coordinate has twenty-three neighbors on other wafers in the same lot at that
same coordinate. The yield number calculated is used for each XY position within each
lot. If a neighboring die is good, it receives a yield number of 100, whereas if it is bad it
receives a 0 when calculating the mean. The sum of these yields is divided by twenty-three, the total number of neighboring die at that XY coordinate in a lot, as seen in
Equation 3.16.
\[ \text{XY Yield} = \frac{\displaystyle\sum_{i=1}^{23} \text{Yield}_i}{23} \tag{3.16} \]
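A short sketch of the XY yield mean follows. It assumes pass/fail results are stored in a dictionary keyed by (wafer, x, y), which is a hypothetical data layout; it also divides by the number of other wafers actually present, which equals twenty-three for a complete lot.

```python
def xy_yield(lot_results, wafer, x, y):
    """Arithmetic mean (0-100) of pass/fail results at position (x, y) on every
    other wafer in the lot. lot_results maps (wafer, x, y) -> True or False."""
    others = [passed for (w, xx, yy), passed in lot_results.items()
              if w != wafer and (xx, yy) == (x, y)]
    return 100.0 * sum(others) / len(others)


# Illustrative lot: position (3, 4) fails on wafers 1-6 and passes on wafers 7-24.
lot = {(w, 3, 4): w > 6 for w in range(1, 25)}
print(round(xy_yield(lot, 1, 3, 4), 2))    # reference die on a failing wafer
print(round(xy_yield(lot, 10, 3, 4), 2))   # reference die on a passing wafer
```

For the reference die on wafer 1, eighteen of the twenty-three neighbors pass, giving about 78.26; for wafer 10, seventeen of twenty-three pass, giving about 73.91.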
In order to find predicted yield numbers for radial analysis, first the distance from
the center of a wafer to each die on the wafer needed to be found. After identifying the
center of the wafer by direct measurement, the Pythagorean Theorem was used to find the
distances to each die on the wafer. Then eight groups were identified that had similar
distances from the center of the wafer. These eight groups form circles from the center.
Any die that lies within a circle is considered a neighbor to another die on the circle.
Figure 3.7 shows the distances calculated from the center and the group in which each
radial circle lies.
Figure 3.7. Radial Distances (mm) and Groups.
The yield number for radial analysis was calculated depending on which group a
reference die belongs to. The yields of all neighbors were determined, summed, and
divided by the number of members in each group. This is a simple averaging method
where X is the number of neighbors in a group, as shown in Equation 3.17.
\[ \text{Radial Yield} = \frac{\displaystyle\sum_{i=1}^{X} \text{Yield}_i}{X} \tag{3.17} \]
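The radial averaging can be sketched as follows. How the eight groups were actually chosen is not detailed beyond "similar distances," so this sketch assumes equal-width distance bands from the wafer center and excludes the reference die from its own average; both choices, and all names, are assumptions for illustration.

```python
import math


def radial_group(x, y, center, max_radius, n_groups=8):
    """Index of the equal-width distance band (an assumed grouping rule)."""
    r = math.hypot(x - center[0], y - center[1])
    return min(int(r / max_radius * n_groups), n_groups - 1)


def radial_yield(ref, passed, center, max_radius, n_groups=8):
    """Mean yield (0-100) of the other die in the reference die's band, or
    None if the band holds no other die. passed maps (x, y) -> True/False."""
    group = radial_group(ref[0], ref[1], center, max_radius, n_groups)
    members = [ok for (x, y), ok in passed.items()
               if (x, y) != ref
               and radial_group(x, y, center, max_radius, n_groups) == group]
    if not members:
        return None
    return 100.0 * sum(members) / len(members)


# Illustrative die map: four die at radius 1 and one die at radius 3.
die_map = {(1, 0): True, (0, 1): False, (-1, 0): True, (0, -1): True, (3, 0): False}
print(radial_yield((1, 0), die_map, center=(0, 0), max_radius=4, n_groups=4))
```

In this toy map the reference die at (1, 0) shares its band with three other die, two of which pass, so its radial yield is about 66.7.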
For Intel's unit level predicted yield, or ULPY, a geometric mean was used. Two
quantities were used, XY yield and local yield, or nearest neighbor yield. These
quantities were combined in order to create a more powerful predictor than either one can
offer individually. Rather than using an arithmetic mean which would not sufficiently
penalize die with very low predicted yields, the geometric mean was used. A number
between 0 and 100 is calculated which gives a more accurate prediction of whether or not
a die will pass. This is shown in Equation 3.18.
\[ \text{ULPY} = \sqrt{\text{LocalYield} \times \text{XYYield}} \tag{3.18} \]
Intel was able to show through empirical studies that weaker die were more successfully
predicted to fail using ULPY than using the arithmetic mean [9]. For this project,
both eight and twenty-four nearest neighbors were used as local yield along with XY
yield.
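A small numeric comparison shows why the geometric mean penalizes a low component more than the arithmetic mean does. The function name `ulpy` and the sample values are illustrative only.

```python
import math


def ulpy(local_yield, xy_yield):
    """Geometric mean of local yield and XY yield, both on a 0-100 scale."""
    return math.sqrt(local_yield * xy_yield)


local, xy = 100.0, 4.0            # illustrative values, not study data
print(ulpy(local, xy))            # geometric mean
print((local + xy) / 2)           # arithmetic mean, for comparison
```

With a perfect local yield but an XY yield of only 4, the arithmetic mean is 52 while the geometric mean is 20, and if either component is 0 the geometric mean is 0 regardless of the other.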
An important aspect to consider when looking at neighborhood analysis is the
consideration of edge die. Edge die are die that are located on the edge of a wafer and do
not have the full eight or twenty-four surrounding neighbors. Figure 3.8 shows edge die
missing some of their nearest neighbors.
One way to approach this problem is to consider that edge die are more likely to
fall out and that the absence of die surrounding them should not increase their predicted
yield. In this regard, the missing die should be counted with an effective yield of 0
when making predicted yield calculations. Another way to approach the issue is to
understand that missing die cannot increase the predicted yield, but are not there to
impact the die in one way or another. Therefore, if only five out of eight surrounding die are
present for an edge reference die for example, only those five die should be included in
the yield number derivation. This impacted the nearest eight and nearest twenty-four
neighboring definitions. For this project, both approaches to edge die were considered
and compared.
[Figure: two reference die (Ref) at the wafer edge, each missing some of its nearest neighbors.]
Figure 3.8. Edge Die Consideration.
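The two edge die treatments can be compared on a single example. This sketch uses the normalized eight-neighbor weighting and interprets "missing not used" as renormalizing over only the neighbors that are present; that interpretation, like the function and variable names, is an assumption made for illustration.

```python
import math

OFFSETS = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]


def edge_aware_yield(neighbors, missing_as_zero):
    """Predicted yield (0-100) for a reference die whose `neighbors` dict maps
    the offsets of die that exist on the wafer to True/False; offsets absent
    from the dict are missing die. Assumes at least one neighbor is present."""
    inverse = {o: 1.0 / math.hypot(o[0], o[1]) for o in OFFSETS}
    achieved = sum(inverse[o] for o, ok in neighbors.items() if ok)
    if missing_as_zero:
        possible = sum(inverse.values())              # all eight positions count
    else:
        possible = sum(inverse[o] for o in neighbors)  # only die that exist count
    return 100.0 * achieved / possible


# Corner die with only three neighbors present, all of them good.
corner = {(1, 0): True, (0, 1): True, (1, 1): True}
print(round(edge_aware_yield(corner, missing_as_zero=True), 3))
print(round(edge_aware_yield(corner, missing_as_zero=False), 3))
```

For this corner die, counting missing die as 0 gives a predicted yield of about 39.645, while omitting them gives 100, which illustrates how strongly the choice of treatment can move edge die between prediction groups.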
CHAPTER IV
NEIGHBORHOOD ANALYSIS RESULTS
Introduction
This chapter will concentrate on the results of the neighborhood analysis
technique as described in the previous chapter. The neighbor definitions used were
developed by Intel, including local yield (eight and twenty-four nearest neighbors), radial
yield, XY yield, and ULPY (XY yield along with eight and twenty-four nearest
neighbors). The charts with resulting data to be shown are indicated in Figure 4.1.
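All Failures From Final Test
            | Neighbors From Laser            | Neighbors From Multiprobe
Method      | Missing = 0 | Missing Not Used  | Missing = 0 | Missing Not Used
8 NNs       | shown       | shown             | shown       | shown
24 NNs      | shown       | shown             | shown       | shown
Radial      | N/A         | shown             | N/A         | shown
XY          | N/A         | shown             | N/A         | shown
ULPY - 8    | N/A         | shown             | N/A         | shown
ULPY - 24   | N/A         | shown             | N/A         | shown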
Figure 4.1. Charts with Resulting Data Shown in Chapter IV.
In Figure 4.1, all failures refer to all final test failures from every log point in the
test flow. This includes pre burn-in, post burn-in, and quality assurance. "Missing = 0"
and "missing not used" refer to missing edge die. Charts that show data from nearest
neighbor calculations show both approaches to treating edge die. The first method is
calculating predicted yields when missing edge die are counted as 0 (missing = 0). This
calculation did not apply to radial analysis, XY yield, ULPY 8, and ULPY 24 because
these calculations involved die that remain on the wafer and not missing edge die. The
second way to treat missing edge die is to omit them from calculations. When a chart
says "missing not used," this means that missing edge die were not counted toward the
predicted yield number. This approach was used for every method of calculating
predicted yield, since every method involved die on the wafer. Nearest neighbor
calculations were the only methods that made the distinction between ways to treat
missing die.
The predicted yield numbers for each reference die from final test for each
method were calculated. The reference die used are good units and failing units from
final test, from all twenty lots, which was about 14,000 units. There were about 3,000
bad die and about 11,000 good die. All die from the 20 lots, about 63,000 units, were
used as wafer level neighbors in calculating predicted yields from laser and multiprobe.
The predicted yields for each reference die were grouped into tens. This means
predicted yields from 0 to 10, 10 to 20, 20 to 30, etc., were grouped together, giving
ten groups of ten. For each prediction method, for laser and multiprobe, the percentages
of good die and failing die in each group of ten were calculated. In each group, the
number of good die was added to the number of failing die to give the group total. The
number of good die was then divided by the total to determine the percentage of good die
in that particular predicted yield group, and the number of failing die was divided by the
total to determine the percentage of failing die. This showed whether good or failing die
dominated the group. In other words, the two percentages (good and failing die) at each
predicted yield group of ten added up to 100%.
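The grouping and percentage calculation described above can be sketched as follows. This is a minimal illustration with made-up data. The function names are hypothetical, and the handling of boundary values (an exact multiple of ten goes to the upper group, except 100, which stays in 90-100) is an assumption, since the text does not state it.

```python
from collections import Counter


def yield_group(predicted_yield: float) -> int:
    """Index 0..9 of the group of ten; a predicted yield of 100 falls in 90-100."""
    return min(int(predicted_yield // 10), 9)


def group_percentages(records):
    """records: iterable of (predicted_yield, passed_final_test) pairs.

    Returns {group_label: (percent_good, percent_failing)} for each non-empty group.
    """
    good, bad = Counter(), Counter()
    for predicted_yield, passed in records:
        (good if passed else bad)[yield_group(predicted_yield)] += 1
    result = {}
    for g in range(10):
        total = good[g] + bad[g]
        if total == 0:
            continue  # no die in this group, so nothing to chart
        label = f"{10 * g}-{10 * g + 10}"
        result[label] = (100.0 * good[g] / total, 100.0 * bad[g] / total)
    return result


# Made-up reference die: (predicted yield, passed final test?)
sample = [(5, False), (8, False), (2, False), (9, True),
          (95, True), (100, True), (91, True), (92, False)]
percentages = group_percentages(sample)
```

Because each group is normalized by its own total, the two percentages in a group always sum to 100%, which also means the charts hide how many die each group actually contains.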
The trend worth noting in the data is whether there are higher percentages of good or
failing die at each predicted yield group. A good trend that would support the
goals of this project would be a substantial amount of failing die at low predicted yields,
with no good die at those points. This would support throwing away die with low
predicted yields before they are processed further and packaged. Another good trend
would be to see a high amount of good die with higher predicted yields, with no failing
die present. This would support reduced burn-in of die with high predicted yields, based
on the confidence that there would be no failing die present.
Neighborhood Analysis for Local Yield
Figure 4.2 shows the results for eight nearest neighbors at laser, where missing
edge die were counted as zero. In this case, there were more failing die at predicted
yields of 0-10; however, there were also good die present at the same predicted yields.
Even though there were more failing die, this does not mean all die with predicted yields
between 0 and 10 could be thrown away, or scrapped, because too many good die
would also be thrown away. The same is true for higher predicted yields. Even though
there were more good die at high predicted yields, there remained failing die with the
same yield prediction. Burn-in cannot be reduced for these predicted yields because
failing die would receive the same reduced burn-in. The failures might not be identified
after burn-in, and failing units could possibly be shipped.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.2. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Laser.
Figure 4.3 shows results for eight nearest neighbors at multiprobe, where missing
edge die were counted as zero. So, if there were eight nearest neighbors and the
reference die from final test was on the edge of the wafer, the missing die at the edge
were counted as zero. The calculation summed the neighbor yields, with the missing edge
die counted as zero, and divided the sum by the total possible yield. This
would downgrade the edge die accordingly, as edge die are usually less reliable in
semiconductor devices.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.3. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 8 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe.
The following charts depict the data taken from the same nearest eight neighbors,
except that the missing edge die were omitted rather than counted as zero. Omitting
missing die means that if there were only five intact neighbors on a wafer when using the
eight nearest neighbor method, only those five neighbors were used in the calculation.
This was to show whether there was any difference between a natural downgrade of edge
die, where the missing die are neglected, and a more direct downgrade, where the missing
die are counted as zero. The data taken using eight nearest neighbors from laser is shown
in Figure 4.4.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.4. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Laser.
The data observed using neighbors from multiprobe at wafer level can be seen in
Figure 4.5. Again, the eight nearest die were considered neighbors, and the
missing edge die were omitted from the calculation. It should be noted that there were
more die with predicted values of 0 in multiprobe calculations. This is not apparent from
these charts, as only the percentage of die at each predicted yield group was considered,
and not actual raw numbers. More predicted values of 0 at multiprobe are due to the fact
that failing die from laser were counted as 0 at multiprobe as well. This is because only
the good and repairable die from laser are tested at multiprobe, and all other failing die
are counted as 0. There would therefore be more failing die on the wafer, and a higher
chance of having no passing neighbors.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.5. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 8 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe.
The twenty-four nearest neighbors were analyzed in the same way as the eight
nearest neighbors, where both missing edge die considerations were calculated. First, the
data for the twenty-four nearest neighbors from laser is shown in Figure 4.6, where
missing edge die were counted as zero. Failing die had a higher percentage of units at
predicted yields between 0 and 20. Good die dominated the percentages at all other
predicted yields.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.6. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Laser.
There was no data at predicted yields between 90 and 100 for nearest twenty-four
neighbors at multiprobe, as shown in Figure 4.7. This is similar to the previous
discussion explaining why there were more predicted yields of 0 for all die when
multiprobe neighbors were considered. Predicted yields were in general lower for
data using multiprobe neighbors due to fallout from laser. That is why there
were not many higher predicted yield values. Also, good die had a higher percentage of
units than failing die at all predicted yield groups.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.7. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 24 Nearest Neighbors (Missing Edge Die = 0) at Multiprobe.
Figure 4.8 shows the nearest twenty-four neighbors at laser
where missing edge die were omitted. The actual number of units with yield numbers
between 0-10 and 10-20 was small, so their difference is not as substantial as it seems.
In other words, the fact that there were more good units at predicted yields of 0-10 and
fewer good units at 10-20 does not signify a problem when considering actual
numbers of units. More units were considered as predicted yield values increased.
Between predicted yields of 20 and 100, good die had higher percentages of units than
failing die.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.8. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Laser.
Figure 4.9 shows the nearest twenty-four neighbors from multiprobe when
missing edge die were omitted. Once again, there were no predicted yields from 90 to
100 for any die. Also, good die had higher percentages than failing die at every
predicted yield group of ten.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.9. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, 24 Nearest Neighbors (Missing Edge Die Omitted) at Multiprobe.
Neighborhood Analysis Using Averaging Methods
The neighborhood analysis techniques that use averaging methods included XY
yield and radial analysis. Since these methods utilized all die on the wafer, there was no
concern regarding missing edge die. All die involved remained on the wafer in question,
so there was no need to treat missing edge die.
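As a rough illustration of why the averaging methods never encounter missing die, the sketch below computes both yields only from die that exist in the wafer maps. The definitions used here are assumptions made for the example: XY yield is taken as the pass rate of the die at one coordinate across a set of wafers, and radial yield as the pass rate of the die on one wafer that lie at the same rounded distance from the wafer center. The function names and data layout are hypothetical.

```python
import math


def xy_yield(wafers, coord):
    """Percent of wafers whose die at `coord` passed (assumed XY yield definition)."""
    results = [wafer[coord] for wafer in wafers if coord in wafer]
    if not results:
        return None
    return 100.0 * sum(results) / len(results)


def radial_yield(wafer, ref, center=(0, 0)):
    """Percent passing among die in the same ring as `ref` (assumed radial definition)."""
    def ring(coord):
        # Ring index: distance from the wafer center, rounded to whole die.
        return round(math.hypot(coord[0] - center[0], coord[1] - center[1]))

    target = ring(ref)
    results = [passed for coord, passed in wafer.items() if ring(coord) == target]
    if not results:
        return None
    return 100.0 * sum(results) / len(results)


# Made-up data: four wafers sharing coordinate (1, 0), and one small wafer map.
lot = [{(1, 0): True}, {(1, 0): False}, {(1, 0): True}, {(1, 0): True}]
one_wafer = {(0, 0): True, (1, 0): True, (0, 1): False,
             (-1, 0): True, (0, -1): True, (2, 0): False}
xy_example = xy_yield(lot, (1, 0))
radial_example = radial_yield(one_wafer, (1, 0))
```

Both functions iterate only over coordinates present in the maps, so there is no neighbor slot that could be empty and no need to choose between the two edge die treatments.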
The first data shown is XY yield at laser in Figure 4.10. This method seemed to
chart the best so far, with failing die dominating the predicted yield groups up to a
predicted yield of 30. Good die were represented with more die than failing die at
predicted yields from 30 to 100. This data was also good because, for good die, the
percentage of die at each group increased with increasing predicted yield numbers.
Failing die percentages decreased with increased predicted yield numbers. This is what
would be expected for XY yield, since each die at a given XY coordinate should be treated
the same in the several different processes used to fabricate the chips.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.10. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, XY Yield at Laser.
XY yield using neighbors from multiprobe is shown in Figure 4.11. There were
no die at predicted yields from 80 to 100, since neighbors at multiprobe have seen fallout
previously at laser tests. There was no dominance of failing die at low predicted yields.
Good die had higher percentages of die at each predicted yield group. However, the good
die percentages did increase with increasing predicted yields, and failing die percentages
did decrease with increasing predicted yields.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.11. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, XY Yield at Multiprobe.
The data for radial analysis follows, starting with neighbors from laser in Figure
4.12. There were the same number of good die and failing die at predicted yields from
0-10. Good die percentages were higher than failing die percentages from 10 to 100. Once
again, this method seemed to follow the same trend, where good die percentages
increased with increasing predicted yields and failing die percentages decreased with
increasing predicted yields. This would make sense, since the different process steps
used to fabricate ICs may have affected die the same distance from the center of the
wafer.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.12. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, Radial at Laser.
Radial neighbors from multiprobe are shown in Figure 4.13. There was no
dominance of failing die at any of the predicted yield groups.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.13. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, Radial at Multiprobe.
Neighborhood Analysis Using the Geometric Mean
The geometric mean was used to calculate Intel's proposed method of unit level
predicted yield, ULPY. ULPY is calculated using both eight and twenty-four
neighboring methods along with XY yield. This method neglects missing edge die:
because XY yield uses only die present on the wafer, only die present on the wafer were
used in the local yield calculations that feed into ULPY.
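A minimal sketch of the geometric mean step is given below. It assumes ULPY is the plain, unweighted geometric mean of one local yield value and one XY yield value, both in percent; the exact formulation in [9] is not reproduced here, and the function name is hypothetical.

```python
import math


def ulpy(local_yield_pct: float, xy_yield_pct: float) -> float:
    """Unweighted geometric mean of two yields given in percent (assumed form of ULPY)."""
    return math.sqrt(local_yield_pct * xy_yield_pct)


# Made-up inputs: a weak neighborhood (40%) at a strong XY location (90%).
geometric = ulpy(40.0, 90.0)
arithmetic = (40.0 + 90.0) / 2
# A die with no passing neighbors is pulled to zero regardless of its XY yield.
no_passing_neighbors = ulpy(0.0, 90.0)
```

The geometric mean is never larger than the arithmetic mean, so a single low component pulls the prediction down more strongly, which is one reason a geometric mean may separate weak die better than simple averaging.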
The data calculated for ULPY using XY yield and 8 nearest neighbors from laser
is shown in Figure 4.14. Failing die percentages dominated good die at predicted yields
of 0-10 and 20-30. Good die percentages were higher at every other predicted yield
group.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.14. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Laser.
Figure 4.15 shows data using ULPY at multiprobe, using XY yield and 8 nearest
neighbors. Since this data used neighbors from multiprobe, there were no die predicted
with yields between 80 and 100.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.15. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, ULPY (XY Yield and 8 Nearest Neighbors) at Multiprobe.
The data for ULPY using twenty-four nearest neighbors and XY yield from laser
is shown in Figure 4.16. This data fell differently than the others, since there was a huge
number of die with predicted yields between 0 and 10. The number of die slowly
started increasing at predicted yields of 10-60, topped out somewhere around 70, and
started to decrease again toward 100.
[Chart omitted; legend: Fails, Good; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.16. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Laser.
The data for ULPY using twenty-four surrounding neighbors and XY yield from
multiprobe is shown in Figure 4.17. This data looks like other charts using multiprobe
neighbors, where no die existed at predicted yields from 80 through 100. This is the
only data where there was 100% good die at the highest predicted yields, which were
between 70 and 80. This seemed to support reduced burn-in for those good die with high
predicted yields since no failing die were present. However, the actual number of die in
this predicted yield group was so small that reduced burn-in could not
be supported.
[Chart omitted; x-axis: Predicted Yield in groups of ten, 0-10 through 90-100.]
Figure 4.17. Percent of Failures versus Good out of Total Die at Each Predicted
Yield Group, ULPY (XY Yield and 24 Nearest Neighbors) at Multiprobe.
Large Die Size Consideration
A possible explanation for finding only a good trend, rather than a concrete burn-in
reduction based on data, might be the large die size of the device used in this
investigation. A large die has fewer die per wafer as compared to a small die on the same
size wafer. This results in a yield decrease for the large die. In order to compare this
experiment with others done on smaller die, the wafer would have to be X times bigger,
where X is the scale from the small die size to the larger die size. This would make the
die per wafer equal, and would also increase yield for the large die wafers.
In order to better understand how yield is affected by die size, defect density
should first be defined. Defect density is the total number of defects divided by the total
area of the wafer, for each wafer. For a die with area A and yield Y, a defect density
value D is calculated using Equation 4.1 [11].
D = (1 - Y) / A
(4.1)
This equation is useful when the defect density value and the area are
known, in order to understand yield differences with varying die areas. The following
example demonstrates this effect. Suppose there are two 8-inch wafers, containing 100
and 1,000 die/wafer, respectively. The total area of each wafer is then 50.3 in^2. The
individual die areas are
50.3/100 = 0.503 in^2
(4.2)
and
50.3/1,000 = 0.0503 in^2.
(4.3)
Suppose the defect density is 1 defect/in^2. When calculating yield for each
wafer, the form of Equation 4.4 is used, as modified from Equation 4.1.
Y = 1 - DA
(4.4)
After plugging in the values for D and A in Equation 4.4, the yield for the wafer
with 100 larger die is 49.7%, while the yield for the wafer with 1,000 smaller die is
95.0%. This shows clearly that larger die naturally have lower yield than smaller die
when using the same process on the same size wafer.
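The arithmetic of this example can be checked with a short script. It is only a sketch of Equation 4.4 applied to the numbers quoted above; the function name is hypothetical, and flooring the result at zero is an added assumption, since the linear model would otherwise go negative when D times A exceeds 1.

```python
import math


def die_yield(defect_density: float, die_area: float) -> float:
    """Y = 1 - D*A (Equation 4.4), floored at zero (the floor is an added assumption)."""
    return max(0.0, 1.0 - defect_density * die_area)


# Area of an 8-inch wafer, rounded to one decimal as in the text: 50.3 in^2.
wafer_area = round(math.pi * (8 / 2) ** 2, 1)
defect_density = 1.0  # defects per square inch

yield_percent = {}
for die_per_wafer in (100, 1000):
    die_area = wafer_area / die_per_wafer  # ignores partial die at the wafer edge
    yield_percent[die_per_wafer] = round(100 * die_yield(defect_density, die_area), 1)
```

With these inputs the script gives 49.7% for the wafer with 100 die and 95.0% for the wafer with 1,000 die, matching the values stated above.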
With lower yield, it will be more difficult to find a definite correlation between laser
and multiprobe neighbor yields and final test yield. What has been found is a trend, where
good die have higher predicted yields than bad die, and failing die have a higher
percentage of low predicted yields than good die. Chapter V expands on this conclusion.
CHAPTER V
NEIGHBORHOOD ANALYSIS CONCLUSIONS
The ultimate goals of this thesis were to find if burn-in could be reduced for die
that had high predicted yields and if further unnecessary processing and packaging could
be eliminated for die that had low predicted yields. The charts presented in Chapter IV,
featuring several different methods of calculating predicted yields based on wafer level
neighbors, showed a general trend. This trend indicated that a higher percentage of good
die were likely to have higher predicted yields than bad die, and that failing die made up
a higher percentage of the die at lower predicted yields. This supported the idea that
good die should have a higher number of good neighboring die and bad die would have more
failing neighbors. Unfortunately, there were not enough good die with substantially high
predicted yields where failing die did not exist. For example, there was no method that
resulted in 15% of total die with a predicted yield of 100 and 0% bad die with a
predicted yield of 100. This situation would support a burn-in specification change, where
the die with predicted yield numbers of 100 would undergo reduced burn-in based on the
confidence that they would pass final test, more specifically post burn-in.
In addition, there was no case where failing die had very low predicted yields
where good die were not present with the same low yield prediction. This means there
was no chance of scrapping low yielding die, thus saving money by not processing bad
die through final test and scrapping them once they fail. In most cases, there were good
and bad die associated with every predicted yield number.
One possible explanation of these results is the fact that a large die has been used.
Larger die have a better chance of having low yielding wafers when compared with
smaller die. Two wafers of the same size with the same process, and thus the same defect
density, but with different numbers of die per wafer will have different yields, where the
wafer with larger die will have lower yield. This means finding dramatic results from
neighborhood analysis will not be possible. A smaller die would have more promising
results, with neighbors in closer proximity, catching more defects on the wafer. A large
die, for example, could have three defects within its entirety and be considered a single
failure, whereas four small die covering the same area could each contain at most one of
those defects, giving three failures out of four die. This would catch more defects, and
cause the results of a neighborhood analysis to be more efficient and accurate.
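The three-defect example can be made concrete with a small sketch. The defect positions, the 2 x 2 region, and the function name are all made up for illustration; the only point is to count how many die in the same area contain at least one defect at two different die sizes.

```python
def failing_die(defects, die_size, region_size=2.0):
    """Return (failing die, total die) for a square region cut into square die."""
    per_side = int(round(region_size / die_size))
    hit = set()
    for x, y in defects:
        # Clamp so a defect exactly on the far boundary still maps to a real die.
        col = min(int(x // die_size), per_side - 1)
        row = min(int(y // die_size), per_side - 1)
        hit.add((col, row))
    return len(hit), per_side * per_side


# Three hypothetical defects inside a 2 x 2 region (arbitrary units).
defect_positions = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5)]
large = failing_die(defect_positions, die_size=2.0)  # one large die covers the region
small = failing_die(defect_positions, die_size=1.0)  # four small die cover the region
```

Here the single large die registers one failure, while the four small die register three failures and one pass, so the smaller die give a finer picture of where the defects sit.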
Another possible factor to consider is sample size. While other industry
applications have looked at millions of die, this study considered about 60,000. In
addition to large die size, a smaller sample may have hurt the results to where no definite
burn-in reduction method could be found.
In conclusion, while the method within this thesis has been proven effective on
other devices, it could not be used to reduce burn-in for die with high predicted yields, or
used to scrap die with low predicted yields. Good and bad die both followed the same
trend within predicted yield numbers, where good die did have higher predicted yields
than bad die. However, too many bad die followed the same trend, and too many had high
predicted yields as well. This does not provide enough confidence to
suggest a specification change as to burn-in time. This method may be effective when a
larger sample is used on large die, or when a larger wafer size is attained for this product.
Other methods should be considered along with neighborhood analysis for large
die in the future. Parametric data could be considered along with this method, as this data
has been used to find defective process conditions and design marginalities for large die.
Neighborhood analysis, along with parametric data and a larger wafer, should find its
place in burn-in reduction methods in the future for large die, based on the promising
trend found through this thesis.
REFERENCES
1. Burns, Mark and Roberts, Gordon W. An Introduction to Mixed-Signal IC Test
and Measurement. New York, New York: Oxford University Press, 2001.
2. Wolf, S. Microchip Manufacturing. South Beach, California: Lattice Press,
2004.
3. Barnett, Thomas S., Singh, Adit D., Nelson, Victor P. "Extending Integrated-Circuit
Yield-Models to Estimate Early-Life Reliability." IEEE Transactions on
Reliability Vol. 52, No. 3, September 2003.
4. Barnett, Thomas S., Singh, Adit D., Nelson, Victor P. "Burn-In Failures and
Local Region Yield: An Integrated Yield-Reliability Model." Proceedings of the
19th IEEE VLSI Test Symposium p. 0326, March 29 - April 03, 2001.
5. Pradhan, Dhiraj K. Integrated Circuit Manufacturability The Art of Process and
Design Integration. Ed. Jose Pineda de Gyvez. New York, New York: Institute
of Electrical and Electronics Engineers, Inc., 1999.
6. Black, Kelley A. "Die Level Sorting of an Integrated Circuit." Master's Thesis,
Texas Tech University, Lubbock, TX, 2000.
7. Information obtained through industry sponsor.
8. Sabade, Sagar S., Walker, Duncan M. "Evaluation of Effectiveness of Median of
Absolute Deviations Outlier Rejection-based IDDQ Testing for Burn-in
Reduction." Proceedings of the 20th IEEE VLSI Test Symposium p. 0081, April
28 - May 02, 2002.
9. Miller, Russell B., Riordan, Walter C. "Unit Level Predicted Yield: a Method of
Identifying High Defect Density Die at Wafer Sort." ITC International Test
Conference Paper 40.3, p. 1118, 2001.
10. Riordan, Walter C., Miller, Russell, Sherman, John M., Hicks, Jeffrey.
"Microprocessor Reliability Performance as a Function of Die Location for a
0.25 µm, Five Layer Metal CMOS Logic Process." 37th Annual International
Reliability Physics Symposium p. 1, 1999.
11. Hess, Christopher, Weiland, Larg H. "Wafer Level Defect Density Distribution
Using Checkerboard Test Structures." IEEE 1998 Int. Conference on
Microelectronic Test Structures Vol. 11, March 1998.
12. <http://dictionary.reference.com/>
13. Streetman, Ben G., Banerjee, Sanjay. Solid State Electronic Devices. Upper
Saddle River, New Jersey: Prentice Hall, Inc., 2000.
PERMISSION TO COPY
In presenting this thesis in partial fulfillment of the requirements for a master's
degree at Texas Tech University or Texas Tech University Health Sciences Center, I
agree that the Library and my major department shall make it freely available for
research purposes. Permission to copy this thesis for scholarly purposes may be
granted by the Director of the Library or my major professor. It is understood that any
copying or publication of this thesis for financial gain shall not be allowed without my
further written permission and that any user may be liable for copyright infringement.
Agree (Permission is granted.)
Student Signature
Date
Disagree (Permission is not granted.)
Student Signature
Date