2001 Systems Engineering Capstone Conference • University of Virginia
AN AUTOMATED DEFECT ANALYSIS SYSTEM
(D.O.U.G.I.E.)
FOR DOMINION SEMICONDUCTOR
Student team: Saiful Amin, Tim Bagnall, Jeff Binggeli, Jeff Bridgman
Faculty Advisors: Prof. Christina Mastrangelo and K. P. White,
Department of Systems Engineering
Client Advisors: Rick Newcomer and Mark Spinelli
Dominion Semiconductor
Manassas, VA
E-mail: rnewcome@dominionsc.com,
KEYWORDS: Semiconductor manufacturing, defect
analysis, clustering, data filtration
ABSTRACT
The goal of this project was to help Dominion
Semiconductor (DSC) automate its defect pattern
recognition and classification process in order to help
increase yield. The current system used by DSC is
based on experience and intuition, while the system
being developed by the Capstone team is an analytical
system for use in an automated environment. The
system developed by the team involves three phases:
test of randomness, data filtration, and cluster analysis.
A number of algorithms were evaluated for each phase.
For Phase 1, the Holgate N and Skellam-Moore spatial
statistical tests have demonstrated the greatest potential
for testing for randomness. A ‘mode-seeking’
algorithm proved to be the best at filtering out random
defects. Of the cluster analysis algorithms evaluated in
this project, the single-linkage approach was the most
accurate. An automated, fully integrated system would benefit DSC by increasing the efficiency of defect identification, which in turn would increase die/sort yield and revenue.
INTRODUCTION
The process of developing a complete silicon wafer
with a set of chips is extremely complicated. On
average, Dominion completes the three-month
production process of about 1000 wafers a day with
close to 300 chips (or die) on each wafer (Spinelli
2000). Furthermore, throughout the three-month
duration of processing a chip, a wafer completes about
350 processing steps (Spinelli 2000). In a process this exhaustive, which demands precision at a microscopic scale, the purity of the fabrication environment is crucial: the circuitry on a chip can be destroyed by a particle 1/10,000 the diameter of a human hair (Van Zant 1997). In addition, countless chemicals and various forms of bacteria can plague wafer production throughout processing.
Die/sort yield is the percentage of functional die
with respect to the total number of die initially
contained on the wafer. This yield is the primary
quantitative method for evaluating the value of a silicon
wafer and the effectiveness of the production process.
Dominion stores all of this information in their
databases to calculate the yield of each new line of
chips that they produce. Generally, at the initial stages of processor chip production, the die/sort yield hovers near 60% – 70% (Spinelli 2000). At the beginning of a product line, then, silicon wafer production is extremely inefficient: on a 300-die wafer at 65% yield, for example, roughly 105 die are lost, and Dominion loses almost a third of its product.
With each wafer potentially worth $20,000 – $40,000, superior wafer output translates directly into a profitable product. Increasing die/sort yield would allow Dominion to sell its wafers at a higher market value, which would in turn lead to increased profits.
Currently, manual inspection is the primary method of identifying defective die on silicon wafers. An individual at Dominion reviews one lot of picmaps (Figure 1), a visual representation of the defective die on a wafer, and singles out noticeable errors for later inspection and study.

Figure 1 – Picmap

Although this process has proven to be relatively successful, there are opportunities for improvement. First, manual inspection carries the inherent risk of human error, which could lead to false classification, or a lack of classification, of defect clusters. Second, not all of the wafers with low yield are inspected, so certain recurring errors might be overlooked.
This project improves on the manual process by attempting to automate the entire sequence of defect detection and classification. The remainder of this document discusses the methodology and the three phases of our approach to improving this process.
AUTOMATED SYSTEM DESIGN
Phase One: Screening for Randomness
The purpose of Phase 1 is to serve as a screening
process. In a typical group of wafers, some of the
wafers are going to exhibit obvious clusters of defects,
and other wafers will have defects that appear random
in nature. While the wafers with clusters and other non-random patterns of defects can aid Dominion in pinpointing manufacturing problems, the wafers with random defects are of little use in quality control. Ideally, therefore, the automated system should analyze only the wafers that do indeed have defect clusters; only those wafers would advance to Phases 2 and 3 of the automated system.
The screening process increases the efficiency of the
automated system, as the substantial computational
resources involved in cluster analysis will not be
expended on the wafers determined to contain random,
unusable information.
Phase 1 is comprised of a test of randomness, a
statistical test that determines whether or not points in
an area are spatially random. Through research, several
tests of randomness were identified – the Skellam-Moore, Hopkins F, Hopkins N, Holgate F, and Holgate N tests (Ripley 1981). Each test has a statistic that is
calculated, based on a mathematical analysis of the
spatial area. Random samples of points on the wafer
are used in the calculations; therefore, the statistic can
take on a range of values for any specific wafer. This
statistic is compared to a threshold value, which is
dependent on several factors, such as the significance
level and statistical distribution involved. Determining
if randomness exists on a wafer depends on whether or
not the calculated statistic is above or below the
threshold.
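The precise Skellam-Moore and Holgate statistics follow Ripley (1981) and are not reproduced here. As an illustration of the general pattern the tests share – draw random sample points, compute a distance-based statistic, and compare it to a threshold – the following Python sketch implements a Hopkins-style nearest-neighbor test. The function names, the square sampling window, and the default of 60 samples are our own illustrative assumptions, not part of the team's system.

```python
import math
import random

def nearest_dist(p, points):
    # Distance from p to the closest point in `points`, excluding p itself.
    return min(math.dist(p, q) for q in points if q != p)

def spatial_randomness_statistic(defects, n_samples=60, half_width=100.0):
    """Hopkins-style statistic (illustrative; not the Skellam-Moore or
    Holgate formula). Compares distances from random probe locations to
    the nearest defect against nearest-neighbor distances among the
    defects themselves. Values near 0.5 suggest spatial randomness;
    values near 1.0 suggest clustering. The wafer is approximated here
    by a square region."""
    probes = [(random.uniform(-half_width, half_width),
               random.uniform(-half_width, half_width))
              for _ in range(n_samples)]
    marked = random.sample(defects, min(n_samples, len(defects)))

    u = sum(nearest_dist(p, defects) ** 2 for p in probes)   # probe-to-defect
    w = sum(nearest_dist(d, defects) ** 2 for d in marked)   # defect-to-defect
    return u / (u + w)

# Phase 1 screening with a user-chosen empirical limit: if the statistic
# falls below the threshold, treat the wafer's defects as random and
# screen the wafer out.
```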
After initial testing, it was discovered that none of the tests of randomness performed in a manner that would be beneficial to Phase 1 of our automated system. Of the five tests of randomness, no method was able to effectively distinguish between wafers with random defects and wafers with clusters of defects. However, after an examination of the data produced during evaluation, it appeared that a few of the methods would perform with some accuracy if the statistical thresholds were replaced with empirical limits – thresholds chosen by the user based on observed, acceptable levels of defects. The use of empirical limits proved to be advantageous, especially when applying the Skellam-Moore and Holgate N tests to wafers. These two tests of randomness are revisited in the discussion of the evaluation of our automated system.

Phase Two: Data Filtration

After removing the wafers that consist primarily of random defects, the next phase filters irrelevant data out of the remaining set of wafers. The purpose of the valley-seeking technique and the mode-seeking procedure is to provide clearer picmaps. In addition to being visually clearer, the two methods also focus on retaining relevant data so that cluster analysis can concentrate on meaningful defect patterns.
Valley-seeking Algorithm:

The valley-seeking algorithm, with the term valley referring to the parts of the wafer with a low frequency of defects, removes the isolated defect points that may be found on a wafer. The approach selects a non-working die (XB) from a picmap and inspects the surrounding eight die. Each adjacent die is classified as either a good die (XG) or a bad die (XB), as seen in Figure 2 (Diday 1994). Based on the number of XB and XG die and their locations, the algorithm evaluates whether the original die X should be eliminated (note: X is always a bad die, because the algorithm inspects only dysfunctional die). For the most part, the valley-seeking technique discards the point if there is no connectivity with other XB points (Diday 1994); connectivity means one or more XBs adjacent to X.

Figure 2 – 3x3 Grid
The valley-seeking method calls for the removal of die with no connectivity, which is analogous to removing a random defect (Figure 3). This produces a clearer picture and aids the clustering process by minimizing the number of isolated data points. Keeping die with at least one connection also enables the detection of scratches.

Figure 3 – 3x3 Grid
Mode-seeking Algorithm:

The mode-seeking procedure searches for a bad die X and then performs a relationship check on the 3x3 frame that contains it. The location of X is preserved if and only if a good portion of the surrounding die are XB and have a high level of connectivity (Diday 1994). This process helps to establish the legitimacy of large clusters in the data set, so that a visual inspection of the wafer can focus mainly on the major errors.

Figure 4 – 3x3 Grid

Considering that the primary purpose of the mode-seeking procedure is to focus on the larger, more defined clusters, the decision was made to classify wafers using a minimum of two connections. Setting a reasonably high connection minimum does away with many of the small clusters, so clustering can tackle the larger clusters first and focus on the problems causing the loss of the greatest number of die. It also simplifies the image of the picmap so that it is easier to comprehend visually.
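The mode-seeking pass can be sketched the same way, with the connection threshold raised to the two-connection minimum the team chose; again this is an illustrative sketch, not the production implementation.

```python
from typing import Set, Tuple

Die = Tuple[int, int]  # (x, y) grid position of a die on the wafer

def bad_neighbors(die: Die, bad: Set[Die]) -> int:
    # Count defective die among the eight neighbors in the 3x3 frame.
    x, y = die
    return sum((x + dx, y + dy) in bad
               for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               if (dx, dy) != (0, 0))

def mode_filter(bad: Set[Die], min_connections: int = 2) -> Set[Die]:
    # Keep a bad die only when enough of its 3x3 neighborhood is also
    # defective, isolating the larger, denser clusters (Figure 4).
    return {die for die in bad if bad_neighbors(die, bad) >= min_connections}
```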
Phase Three: Cluster Analysis
Once a wafer has been filtered, the defects will then
be clustered using a cluster analysis algorithm. Cluster
analysis is a pattern recognition technique used to
separate data into clusters without prior knowledge of
the composition of the clusters. The selected cluster
analysis algorithm will transform the set of wafer defect
points into a sequence of nested partitions by using
measurements of the distances between the defects.
Figure 5 shows a wafer that has been clustered. Notice
that the defects at the top of the wafer were assigned to
one cluster.
Figure 5 – Clustered Wafer
There are two main types of clustering algorithms,
hierarchical and non-hierarchical. In hierarchical
clustering each data point starts as a single cluster and
these clusters are merged together as the algorithm
progresses. When a hierarchical method assigns a data point to a cluster, that point remains in that cluster for good; a merge cannot be undone, even if it becomes clear at a later stage that other clusters would have merged in a better way. Non-hierarchical algorithms work in a similar fashion; however, they allow clusters to change and evolve if a data point is found to fit better in another cluster. Another difference is that in non-hierarchical clustering the number of clusters in the data must be known before starting.
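The team performed the clustering in Minitab. For readers who want to reproduce the idea, an equivalent hierarchical run can be sketched with SciPy, here using single linkage, the method the team found most accurate; the coordinates below are made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# x/y coordinates of the defects on one filtered wafer (illustrative data)
defect_xy = np.array([[1.0, 2.0], [1.2, 2.1], [1.1, 1.9],   # one tight cluster
                      [8.0, 9.0], [8.2, 9.1]])              # a second cluster

# Single linkage merges the two clusters whose *closest* members are nearest.
Z = linkage(defect_xy, method="single")

# Cut the hierarchy into a chosen number of clusters (here, two).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 1 2 2]
```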
The clusters found using cluster analysis will be
used to classify defect patterns into signatures. For
example, the cluster found in Figure 5 would be
classified as a top edge defect. The identification of a
signature will allow DSC to fix the cause of the defect
pattern. The clusters will also be used to identify new
signatures. By using cluster analysis, smaller clusters
may be discovered that would go unnoticed using the
current system. If the same small cluster is found
across many wafers or many lots, this could be a signal
of a new signature. The identification and classification
of signatures will help DSC improve yield over the
lifetime of a product.
EXPERIMENT DESIGN AND RESULTS
After using Visual Basic to implement the three
phases in an automated system named D.O.U.G.I.E.,
the next step was to evaluate the performance of the
three phases on wafer data.
Data Exploration
There were two primary sets of data for this project. The first came from Dominion Semiconductor's retired line of Gemini 128 Mb DRAM chips. Most of this data came from the production period of September to December 1999, during the initial stages of production of the Gemini wafer, which means that many of the wafers were of low yield and filled with errors. The data included attributes such as lot number, wafer number, x- and y-coordinates of errors, pass/fail value, and the type of failure that occurred for each die.
The second set of data was simulated with a specific
collection of defect scenarios in mind. This artificial lot
was comprised of fourteen wafers with random error
and fifteen wafers with defined defects. Of the fifteen
that were not random, three wafers were assigned to
exhibit each of the five following well-known defect
patterns: vertical scratch, diagonal scratch, center
defect, outer radial defect, and edge defect. Figures 6
and 7 are examples of two of the wafers with simulated
random defects, and Figures 8 and 9 are examples of
two of the wafers simulated with well-known defect
clusters.
Figure 6 – Random Defects
Figure 7 – Random Defects
Figure 8 – Center Defect
Figure 9 – Radial Defect
The primary purpose of the artificial lot was to test
the effectiveness of our approaches on wafers where it
was known exactly how our methods should behave
when functioning properly. For example, in an experiment on a wafer with a center defect (Figure 8), Phase 1 should not screen out the wafer because the defects are not random, Phase 2 should not filter out the defects that belong to the center cluster, and Phase 3 should assign all the defects in the center to one cluster.
Phase One Evaluation
The Skellam-Moore and Holgate N tests of
randomness were the two methods that had the potential
to be used to implement the screening process in Phase
1. Both the Skellam-Moore and Holgate N behave
differently when varying the number of spatial samples
included in the test. A sample is a point on the wafer
that is chosen randomly, and each point is the basis for
all mathematical computations involved in the test.
Through testing, it was determined that the Holgate N
was effective at one level of sampling – 60 samples.
The Skellam-Moore had multiple levels of sampling
that produced promising behavior. Three levels – 30,
60, and 90 samples – were examined in the final
evaluation of the Skellam-Moore test.
The application of the Holgate N and Skellam-Moore tests to wafers presented two outcomes that were
important to the evaluation of Phase 1, the screening
process.
The first, the “good” result, occurs when the wafer in question
has random defects and the test correctly determines
that randomness exists. This is important, because if
few wafers with random defects are identified, the
screening process becomes inefficient.
The second, the “bad” result, occurs when the wafer in question
has clusters of defects, but the test incorrectly
determines that randomness exists. This error is
detrimental to Dominion’s quality control, as the
screening process would discard wafers with important,
non-random information.
To evaluate the four methods – the Holgate N using 60 samples and the Skellam-Moore using 30, 60, and 90 samples – each was applied to the artificial test lot and
three of the lots with Gemini data. The test of
randomness was applied twenty times to every wafer in
the four lots, and the following information was
recorded and then averaged (for the twenty trials): the
number of wafers determined to have random defects –
and of these wafers – the number of wafers known to
have random defects (good results), and the number of
wafers with known clusters of defects (bad results).
From this data, the following performance metrics were
calculated: the average percentage of wafers with
random defects that are screened out (good), the
average percentage of wafers with defect clusters that
are screened out (bad), and the ratio of these good and
bad percentages.
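For concreteness, these three performance metrics reduce to a few lines (a sketch; the averaging over the twenty trials is assumed to have been done upstream, and the names are ours):

```python
def screening_metrics(random_screened: float, random_total: int,
                      clustered_screened: float, clustered_total: int):
    # Percentage of random-defect wafers correctly screened out (good),
    # percentage of clustered wafers incorrectly screened out (bad),
    # and the good-to-bad ratio used to compare the tests.
    good = 100.0 * random_screened / random_total
    bad = 100.0 * clustered_screened / clustered_total
    return good, bad, (good / bad if bad else float("inf"))
```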
Evaluation on the Artificial Lot

Since we created the wafers in our artificial test lot, it was known exactly which wafers had random defects and which had clusters. Evaluation was therefore straightforward – Phase 1 should screen out only the wafers with random defects. However, not all of the non-random defects were strictly clusters; there were six wafers with scratch defects, which appear as lines instead of clusters. It was understood that the tests of randomness were designed to be sensitive only to clusters, meaning that scratches and other cluster-free patterns might be indicated as random. In evaluation, the wafers with scratches were accounted for and the performance metrics were adjusted accordingly: if a wafer with a scratch was determined to have random defects, this was not counted as an error.

Test                         % of Wafers w/ Random    % of Wafers w/ Non-Random    Good-to-Bad
                             Defects Screened Out     Defects Screened Out         Ratio
                             (Good)                   (Bad)
Holgate N, 60 Samples                32.9                      1.1                   29.91
Skellam-Moore, 30 Samples            52.5                      5.6                    9.38
Skellam-Moore, 60 Samples            33.2                      1.7                   19.53
Skellam-Moore, 90 Samples            61.8                      5.0                   12.36

Figure 10 – Artificial Test Lot Results

The results shown above were very positive. For each of the four tests, between 32 and 62 percent of the wafers with random defects were successfully screened out, while only 1 to 6 percent of the wafers with defect clusters were incorrectly determined to be random. The Holgate N had the lowest error and the best good-to-bad ratio. The Skellam-Moore tests had higher errors, but they successfully screened out a larger percentage of the wafers with random defects.

Evaluation on the Gemini Lots

For the lots of actual Dominion wafer data, more was involved in knowing which wafers had random defects and which had clusters. Without the a priori knowledge we had for the wafers of our own creation, we had to rely on Dominion's interpretation of the wafers. Dominion is the expert when it comes to deciding whether wafers have random defects or clusters, so their classification of the wafers provided the basis for comparison with what the tests of randomness indicated. As in the experiment on the artificial test lot, which had wafers with scratches, the three Gemini lots had wafers with non-random patterns that would not be detected by the tests of randomness. Again, these wafers were not counted in the misclassification errors.

Test                         % of Wafers w/ Random    % of Wafers w/ Non-Random    Good-to-Bad
                             Defects Screened Out     Defects Screened Out         Ratio
                             (Good)                   (Bad)
Holgate N, 60 Samples                24.43                    12.33                  1.98
Skellam-Moore, 30 Samples            26.67                    21.57                  1.24
Skellam-Moore, 60 Samples            10.03                    10.73                  0.93
Skellam-Moore, 90 Samples            26.37                    18.73                  1.41

Figure 11 – Gemini Lots Results

The results were not as strong as in the experiment on the artificial test lot. Even so, the methods still appeared beneficial to the Phase 1 screening process. For example, the Holgate N successfully screened out 24% of the wafers with random defects, while the error was contained to 12% of the wafers with non-random defects. Out of ten wafers with clusters, only about one would be incorrectly discarded – not an intolerable loss.

Phase Two and Three Evaluation

Five hierarchical clustering algorithms were evaluated in this project: single linkage, complete linkage, average linkage, the centroid method, and Ward's method. These algorithms were run on fifteen simulated wafers without data filtration for an initial evaluation of the methods. The algorithms are to be run on the same simulated data after filtering with the mode-seeking procedure, as part of the testing of the integrated system. In addition, the algorithms will be run on real data before a final product is handed over to DSC. A combination of D.O.U.G.I.E., the tool developed by the team, and Minitab was used to test each algorithm. The X/Y coordinates of each defect on a wafer were output from D.O.U.G.I.E. and input into Minitab to perform the clustering. As Minitab starts, each defect is an individual cluster; these clusters are then merged step by step, according to the distance criterion of the algorithm, until all of the defects are grouped into one cluster.
Results
Figure 12 – Single-Linkage Dendrogram (observations on the x-axis, merge distance on the y-axis)
The initial testing on unfiltered data showed that single-linkage was the most accurate of the clustering algorithms tested. Single-linkage produced the fewest type I errors for each defect cluster tested; in fact, only one type I error occurred using single-linkage. At the same time, however, single-linkage produced more type II errors than the other algorithms. The other algorithms (complete linkage, average linkage, the centroid method, and Ward's method) had very similar results to one another, each with an unacceptable number of type I errors and very few type II errors. At the time of writing, integrated testing using filtered data has not been completed; however, it is expected that these results will hold for filtered data.
Stopping the Algorithm
Before D.O.U.G.I.E. could display the results of the clustering from Minitab, the number of clusters on a wafer had to be determined. By reviewing a wafer, it becomes apparent how many clusters should result. For example, if a wafer contains one defect cluster and ten random defects, then there should be roughly eleven clusters on the wafer. Because Minitab shows how many clusters remain after each merge decision, it provides a convenient means of stopping the algorithm once a chosen number of clusters is reached. Thus, each clustering method was run until the number of clusters specified for a given wafer was reached. In later versions of this system, the stopping procedure will be automated, based on set criteria such as the overall yield of the wafer.
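With SciPy standing in for Minitab, this stopping rule amounts to cutting the linkage tree at the chosen cluster count, and a future automated rule could cut at a merge distance instead; the sketch below makes both assumptions explicit, with stand-in data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

defect_xy = np.random.rand(40, 2)            # stand-in defect coordinates
Z = linkage(defect_xy, method="single")

# Manual stopping rule: one defect cluster plus ten random defects on the
# wafer suggests roughly eleven clusters.
labels = fcluster(Z, t=11, criterion="maxclust")

# A future automated rule might instead stop at a fixed merge distance:
labels_auto = fcluster(Z, t=0.15, criterion="distance")
```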
Evaluation Methods

The clustering methods were evaluated on how well they captured all of the defects belonging to each cluster scenario. In this process, the team recorded two types of error for each clustering algorithm. A type I error is a defect belonging to the cluster that was not captured by the algorithm; a type II error is a defect not belonging to the cluster that was captured by the algorithm. Minimizing type I errors was judged more important than minimizing type II errors, because the clustering algorithm must identify as much of a defect cluster as possible to aid in the classification of that cluster.
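A brief sketch of these error counts, assuming the true membership of each simulated cluster is known (the function and variable names are ours):

```python
from typing import Set, Tuple

def cluster_errors(true_members: Set[int], found_members: Set[int]) -> Tuple[int, int]:
    # Type I: defects that belong to the cluster but were not captured.
    # Type II: defects that were captured but do not belong to the cluster.
    type_one = len(true_members - found_members)
    type_two = len(found_members - true_members)
    return type_one, type_two
```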
CONCLUSIONS

In our automated system, the three-phase approach demonstrates promising performance. Phase 1, using the Holgate N and Skellam-Moore tests, was able to detect randomness on many of the wafers with random defects and, for the most part, screened them out without incorrectly discarding wafers with edge, center, and outer radial clusters. Phase 2 successfully filtered out the extraneous defects; the mode-seeking procedure was especially effective at removing most of the random defects and isolating the clusters on a wafer. Phase 3 is capable of analyzing the defects and assigning them membership in a cluster; the single-linkage method consistently formed the clusters that were expected to be identified. Phase 3 also demonstrated the advantage of synthesizing all three phases, as the cluster analysis was more effective when performed after Phases 1 and 2 were completed.

It is recommended that Dominion Semiconductor continue to pursue the three-phase approach – screening for randomness, data filtration, and cluster analysis. In the future, the automated system can be developed to the point where its implementation in operations would increase die/sort yield and revenue.
REFERENCES
Diday, Edwin, et al. (1994) New Approaches in Classification and Data Analysis. Springer-Verlag, New York.

Everitt, Brian S. (1993) Cluster Analysis. Arnold, New York.
Ripley, Brian D. (1981) Spatial Statistics. Wiley, New
York.
Spinelli, Mark (2000) PowerPoint presentation to the UVA Capstone team, 22 Sept. 2000.
Van Zant, P. (1997) Microchip Fabrication, 3rd ed.
McGraw-Hill, New York.
BIOGRAPHIES
Saiful Amin is a fourth-year Systems Engineering
major at the University of Virginia from Springfield,
VA. Within the Systems major, Saiful has a focus in
Computer Information Systems. His primary role in this project was to develop and implement the algorithms for the filtration process. Saiful has
accepted an IT consulting job with Accenture (formerly
known as Andersen Consulting) and will begin in
August.
Tim Bagnall is a fourth-year Systems Engineering
major from Springfield, VA. In addition to his Systems
major, he has also received a minor in Economics.
Tim’s primary role in the project was implementing and
evaluating cluster analysis.
Jeff Binggeli is a fourth-year Systems Engineering
major at the University of Virginia. Jeff is a member of
the varsity cross country and track and field teams at
the University. In addition to Systems Engineering,
Jeff is also working on a major in Economics.
Jeff Bridgman is a fourth-year Systems Engineering
major from Herndon, VA. His main responsibility in
the project was to implement and evaluate the
automated screening process that filters out the wafers
that are free of clusters of defects. Jeff has accepted a
position with Accenture and will begin in August.