A Non-Linear Neural Classifier and its Applications in Testing Analog/RF Circuits*

Haralampos Stratigopoulos & Yiorgos Makris
Departments of Electrical Engineering & Computer Science
YALE UNIVERSITY
* Partially supported by NSF, SRC, IBM & Intel
Test and Reliability @ YALE
http://eng.yale.edu/trela
Current Research Areas
• Analog/RF circuits
  – Machine learning-based approach for test cost reduction
  – Design of on-chip checkers and on-line test methods
  – Neuromorphic built-in self-test (BIST)
• Reliability
  – Concurrent error detection / correction methods (FSMs)
  – RTL concurrent error detection in microprocessor controllers
  – Design transformations for improved soft-error immunity (logic)
• Asynchronous circuits
  – Fault simulation, ATPG, test compaction for SI circuits
  – Test methods for high-speed pipelines (e.g. Mousetrap)
  – Concurrent error detection and soft-error mitigation
Outline
• Testing analog/RF circuits
• Current practice
• Limitations and challenges
• Machine learning-based testing
• General idea
• Regression vs. classification
• Testing via a non-linear neural classifier
• Construction and training
• Selection of measurements
• Test quality vs. test time trade-off
• Experimental results
• Analog/RF specification test compaction
• Neuromorphic BIST
• Summary
Analog/RF IC Test – Current Industry Practice
• Post-silicon production flow: wafer → chip → interface board → Automatic Test Equipment (ATE)
• Current practice is specification testing: the ATE applies a set of test configurations to the chip, measures its performance parameters, and compares them against the design specifications to reach a pass/fail decision
Limitations
Test Cost:
• Expensive ATE (multi-million dollar equipment)
• Specialized circuitry for stimuli generation and response measurement
Test Time:
• Multiple measurements and test configurations
• Switching and settling times
Alternatives?
• Fault-model based test – Never really caught on
• Machine learning-based (a.k.a. “alternate”) testing
– Regression (Variyam et al., TCAD’02)
– Classification (Pan et al., TCAS-II’99, Lindermeir et al., TCAD’99)
Machine Learning-Based Testing
General idea:
• Determine if a chip meets its specifications without
explicitly computing the performance parameters
and without assuming a prescribed fault model
How does it work?
• Infer whether the specifications are violated through a
few simpler/cheaper measurements and information
that is “learned” from a set of fully tested chips
Underlying assumption:
• Since chips are produced by the same manufacturing
process, the relation between measurements and
performance parameters can be statistically learned
Regression vs. Classification
Problem Definition:
• Specification tests (T1, …, Tk) yield the performance parameters, which are compared against the design specifications to produce the pass/fail decision
• The alternate tests (x1, …, xn) relate to the performance parameters through unknown, complex, non-linear functions (no closed form)
• Use machine learning to approximate these functions
Regression:
• Explicitly learn these functions (i.e. approximate f: x → T, the performance parameters)
Classification:
• Implicitly learn these functions (i.e. approximate f: x → Y, Y = {pass/fail})
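The distinction can be sketched on synthetic data. Everything numeric below is an illustrative assumption (the spec limit 0.5, the linear ground truth, the least-squares fits), not the models of the cited works: regression explicitly predicts a performance parameter T and compares it to the spec, while classification predicts pass/fail directly from the cheap measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # cheap alternate measurements
# Hypothetical performance parameter T with spec "T >= 0.5"
T = X @ np.array([0.4, -0.2, 0.1]) + 0.6 + 0.05 * rng.normal(size=200)
labels = T >= 0.5                                  # True = pass

Xb = np.c_[X, np.ones(len(X))]                     # add a bias column

# Regression: approximate f: x -> T, then compare the prediction to the spec.
W, *_ = np.linalg.lstsq(Xb, T, rcond=None)
pass_reg = (Xb @ W) >= 0.5

# Classification: fit a linear discriminant directly on +/-1 pass/fail labels
# (a least-squares stand-in for the neural classifier discussed later).
V, *_ = np.linalg.lstsq(Xb, np.where(labels, 1.0, -1.0), rcond=None)
pass_clf = (Xb @ V) >= 0.0

print(np.mean(pass_reg == labels), np.mean(pass_clf == labels))
```

Both routes reach similar accuracy here; the practical difference is that classification never needs to reconstruct T itself.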
Overview of Classification Approach
• LEARNING: a training set of chips undergoes both the simpler measurements (acquisition of measurement patterns) and the specification tests (pass/fail labels); the labeled patterns are projected on the measurement space and a separation boundary is learned
• TESTING: a new (untested) chip yields only a measurement pattern, which the trained classifier maps to a pass/fail label
[Figure: nominal and faulty measurement patterns in the (x1, x2) plane]
Order of Separation Boundary
[Figure: nominal and faulty patterns in the (x1, x2) plane, separated by a highly curved boundary]
• The boundary that best separates the nominal from the faulty
classes in the space of the simple measurements can be
highly non-linear
Previous Work: Linear Methods
(Lindermeir et al., TCAD’99, Pan et al., TCAS-II’99, Variyam et al., TCAD’00)
[Figure: hyper-planes b1–b6 jointly bounding the acceptance region between nominal and faulty patterns in the (x1, x2) plane]
• For each specification i, 1 ≤ i ≤ n, allocate a hyper-plane bi that optimally separates the patterns when labeled with regard to specification i
• The acceptance region is bounded by the intersection of these hyper-planes
• Decomposition into n two-class problems, solved individually
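The intersection-of-hyperplanes acceptance test can be sketched as follows; the three boundaries are made up for illustration and do not come from the cited works.

```python
import numpy as np

# One hyper-plane per specification: b_i(x) = w_i . x + c_i >= 0 means the
# pattern is on the passing side of specification i. A chip is accepted only
# if it satisfies every boundary (the intersection of the half-spaces).
boundaries = [
    (np.array([1.0, 0.0]), -0.7),   # hypothetical spec 1: x1 >= 0.7
    (np.array([-1.0, 0.0]), 1.6),   # hypothetical spec 2: x1 <= 1.6
    (np.array([0.0, 1.0]), 0.5),    # hypothetical spec 3: x2 >= -0.5
]

def accept(x):
    """Pass only if the pattern lies on the passing side of every hyper-plane."""
    return all(w @ x + c >= 0 for w, c in boundaries)

print(accept(np.array([1.0, 0.0])))   # -> True  (inside the intersection)
print(accept(np.array([2.0, 0.0])))   # -> False (violates spec 2)
```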
Our Solution: Non-Linear Neural Classifier
• Allocates a single boundary of arbitrary order
• No prior knowledge of boundary order is required
• Constructed using linear perceptrons only
• The topology is not fixed, but it grows (ontogeny)
until it matches the intrinsic complexity of the
separation problem
Linear Perceptron
• Connectivity: synapses with weights wi0, wi1, …, wid; inputs x0 ≡ 1 (bias), x1, …, xd
• Perceptron output: yi(x) = sign( Σ_{j=0}^{d} wij xj ), a threshold activation function taking the values +1 and -1
• Geometric interpretation: the hyper-plane Σ_{j=0}^{d} wij xj = 0 splits the pattern space, with yi(x) = +1 for nominal and yi(x) = -1 for faulty patterns
• Training adjusts the weights wij to minimize the classification error
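A minimal sketch of this unit on synthetic data: the perceptron computes y(x) = sign(Σ_j wj xj) with x0 = 1 as the bias input, and the classical perceptron update (an illustrative stand-in for the training rule used in the paper) nudges the weights on misclassified patterns.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
t = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)      # +1 = nominal, -1 = faulty
X += 0.3 * t[:, None] * np.array([1.0, 1.0])    # push classes apart (margin)

Xb = np.c_[np.ones(len(X)), X]                  # prepend the bias input x0 = 1
w = np.zeros(3)
for _ in range(100):                            # classical perceptron updates
    for x, label in zip(Xb, t):
        if label * (w @ x) <= 0:                # misclassified or on boundary
            w += label * x                      # nudge w toward the pattern

accuracy = np.mean(np.sign(Xb @ w) == t)
print(accuracy)
```

On linearly separable data such as this, the updates provably converge to a perfectly separating hyper-plane.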
Topology of Neural Classifier
• Pyramid structure
  – The first perceptron receives the pattern x ∈ R^d
  – Successive perceptrons also receive inputs from the preceding perceptrons and a parabolic feature x_{d+1} = Σ_{i=1}^{d} xi^2
• Every newly added layer assumes the role of the network output
• Non-linear boundary by training a sequence of linear perceptrons
Training and Outcome
• Weights of added layer are adjusted through the
thermal perceptron training algorithm
• Weights of preceding layers do not change
• Each perceptron separates its input space linearly
• The allocated boundary is non-linear in the original space x ∈ R^d
Theorem:
• The sequence of boundaries allocated by the neurons
converges to a boundary that perfectly separates the
two populations in the training set
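The growth-until-separation idea above can be sketched on a toy problem (nominal patterns inside the unit circle). Assumptions: plain perceptron updates stand in for the thermal perceptron training algorithm, and the sizes, margins, and layer cap are illustrative; the point is the ontogenic logic, where each new layer sees the pattern, the parabolic feature, and the frozen outputs of the preceding layers.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(250, 2))
r2 = (X ** 2).sum(axis=1)
keep = np.abs(r2 - 1.0) > 0.2          # leave a margin around the circle
X, r2 = X[keep], r2[keep]
t = np.where(r2 < 1.0, 1, -1)          # nominal = inside the unit circle

def train_perceptron(F, t, epochs=300):
    """Train one linear unit on features F; return its +/-1 outputs."""
    Fb = np.c_[np.ones(len(F)), F]
    w = np.zeros(Fb.shape[1])
    for _ in range(epochs):
        for f, label in zip(Fb, t):
            if label * (w @ f) <= 0:
                w += label * f
    return np.sign(Fb @ w)

outputs = []
for layer in range(6):                 # grow layers until separation
    # Layer 0 sees only x; later layers also see the parabolic feature
    # x_{d+1} = sum_i x_i^2 and the outputs of all preceding layers.
    F = X if layer == 0 else np.c_[X, r2, np.array(outputs).T]
    y = train_perceptron(F, t)
    outputs.append(y)                  # earlier layers stay frozen
    if np.all(y == t):                 # training set perfectly separated
        break

print(len(outputs), np.mean(outputs[-1] == t))
```

Layer 0 alone cannot separate a circle linearly, but once the parabolic feature enters, a later layer separates the training set and becomes the network output.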
Boundary Evolution Example (layer 0)
[Figure: boundary allocated after layer 0 in the (x1, x2) plane; legend distinguishes nominal and faulty patterns, correctly and erroneously classified]
Boundary Evolution Example (layer 1)
[Figure: boundary allocated after layer 1 in the (x1, x2) plane; legend distinguishes nominal and faulty patterns, correctly and erroneously classified]
Boundary Evolution Example (layer 2)
[Figure: boundary allocated after layer 2 in the (x1, x2) plane; legend distinguishes nominal and faulty patterns, correctly and erroneously classified]
Boundary Evolution Example (output layer)
[Figure: boundary allocated by the output layer in the (x1, x2) plane; legend distinguishes nominal and faulty patterns, correctly and erroneously classified]
Matching the Inherent Boundary Order
Is Higher Order Always Better?
• No! The goal is to generalize
• Both inflexible (too low an order) and over-fitting (too high an order) boundaries generalize poorly
[Figure: classification rate (%) vs. number of layers — the rate on the training set keeps improving, while the rate on the validation set peaks and then degrades]
Finding The Trade-Off Point (Early Stopping)
• Monitor classification on the validation set
• Prune the network down to the layer that achieves the best generalization on the validation set
• Evaluate generalization on the test set
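The pruning rule above reduces to picking the depth with the best validation rate; the numbers below are illustrative, not from the experiments.

```python
# Hypothetical validation classification rate (%) recorded after each added
# layer while the network grows; the rate peaks and then degrades.
validation_rate = [78.0, 85.5, 91.0, 93.5, 92.0, 90.5, 88.0]

# Prune back to the layer with the best validation-set generalization.
best_layer = max(range(len(validation_rate)), key=validation_rate.__getitem__)
pruned_depth = best_layer + 1          # keep layers 0..best_layer

print(pruned_depth, validation_rate[best_layer])  # -> 4 93.5
```

Generalization of the pruned network is then reported on a held-out test set, never on the validation set used for the pruning decision.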
Are All Measurements Useful?
[Figure: two scatter plots of nominal and faulty patterns in the (x1, x2) plane — in the left panel, x2 is non-discriminatory; in the right panel, x1 and x2 are linearly dependent]
Curse of Dimensionality
[Figure: sparse nominal and faulty training patterns in a high-dimensional space, with new patterns falling between them]
• By increasing the dimensionality we may reach
a point where the distributions are very sparse
• Several possible boundaries exist – choice is random
• Random label assignment to new patterns
Genetic Measurement Selection
• Encode measurements in a bit string, with the k-th bit
denoting the presence (1) or the absence (0) of the k-th
measurement
[Figure: generation t evolves into generation t+1 — bit strings such as 0110111001 undergo reproduction, crossover & mutation]
Fitness function:
• NSGA-II: genetic algorithm with a multi-objective fitness function, reporting the Pareto front for error rate (gr) and number of re-tested circuits (nr)
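The encoding and variation step can be sketched as follows. Assumption: a plain single-point crossover with bit-flip mutation stands in for NSGA-II; the two-objective Pareto ranking over (gr, nr) is omitted for brevity, and the parent strings are taken from the slide's illustration.

```python
import random

random.seed(0)
N_MEAS = 10                                   # candidate measurements

def crossover(a, b):
    """Single-point crossover of two measurement-selection bit strings."""
    cut = random.randrange(1, N_MEAS)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(s, p=0.05):
    """Flip each bit independently with probability p."""
    return ''.join(b if random.random() > p else str(1 - int(b)) for b in s)

parent1, parent2 = '0110111001', '0010111011'
child1, child2 = (mutate(c) for c in crossover(parent1, parent2))

# The k-th bit denotes presence (1) or absence (0) of the k-th measurement.
selected = [k for k, bit in enumerate(child1) if bit == '1']
print(child1, selected)
```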
Two-Tier Test Strategy
• Inexpensive machine learning test: all chips go to a low-cost tester, which acquires the simple alternative measurements; the neural classifier issues a highly accurate pass/fail decision for most chips
• Expensive specification test: the few chips whose measurement pattern falls in the guard-band go to a high-cost tester; their specification test measurements are compared against the design specs for a highly accurate pass/fail decision
Reasons for Non-Zero Classification Error
[Figure: nominal and faulty patterns in the (x1, x2) plane with the test hypersurface passing between partially overlapping populations]
• Continuity of analog signals and parametric drifts
• Finite training set
Guard-bands
[Figure: the test hypersurface flanked by a nominal guard-band and a faulty guard-band, which together define the guard-banded region between the nominal and faulty patterns]
• Introduce a trichotomy in the measurement space
in order to assess the confidence of the decision
Testing Using Guard-bands
[Figure: patterns falling inside the guard-banded region are re-tested via specification test; patterns outside it are classified to the dominant class]
• Examine measurement pattern with respect to the guard-bands
• Identify circuits that need to be re-tested via specification tests
Guard-band Allocation
[Figure: guard-bands allocated at distance D on either side of the nominal and faulty populations]
• Guard-bands are allocated separately
• Identify the overlapping regions
• Clear the overlapping regions
• The ontogenic classifier allocates the guard-band
Guard-band Allocation
[Figure: guard-banded regions of different widths obtained for different values of D]
• The guard-banded region can be varied by controlling D
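The resulting trichotomy can be sketched as a simple decision rule. Assumption: the classifier's raw output magnitude is used here as a stand-in for distance to the boundary, and the function and the value of D are illustrative, not the paper's exact construction.

```python
def guard_banded_decision(score, D):
    """score > 0 means nominal side, score < 0 means faulty side."""
    if abs(score) < D:
        return 'retest'            # inside the guard-banded region
    return 'pass' if score > 0 else 'fail'

print(guard_banded_decision(0.8, 0.3))    # -> pass   (confidently nominal)
print(guard_banded_decision(-0.1, 0.3))   # -> retest (low-confidence region)
print(guard_banded_decision(-0.9, 0.3))   # -> fail   (confidently faulty)
```

Widening D trades more re-tested circuits for a lower misclassification rate, which is exactly the trade-off curve shown in the experiments.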
Experiment 1: Switched-Capacitor Filter
[Figure: switched-capacitor filter schematic — capacitors C1–C5 (with unit elements C1A–C5C) and the two switch phases between Vin and Vout]
Specifications considered:
• Ripples in stop- and pass-bands
• Gain errors
• Group delay
• Phase response
• THD
Experiment Setup:
• N = 2000 instances
• N/2 assigned to the training set, N/4 to the validation set, and N/4 to the test set
White-Noise Test Stimulus
• A Type-1 LFSR clocked by CLK produces a pseudo-random bit sequence; a LP filter shapes it into white noise that drives the DUT, and a digitizer samples the DUT response to form the DUT signature
[Figure: voltage waveforms of the pseudo-random bit sequence, the white-noise stimulus, and the DUT signature over 0–18 ms, with sampling instants t1, t2, …, t30]
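The bit-sequence stage can be sketched with a small Fibonacci-style LFSR. Assumptions: a 4-bit register with feedback taps at bits 3 and 0 (a maximal-length choice) is used for brevity, a real stimulus generator would use a much longer register, and the low-pass filtering into white noise is omitted.

```python
def lfsr_bits(seed=0b1001, taps=(3, 0), nbits=4, count=15):
    """Yield `count` output bits of a right-shifting Fibonacci LFSR."""
    state = seed
    for _ in range(count):
        yield state & 1                       # output the low bit
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1            # XOR the tapped bits
        state = (state >> 1) | (fb << (nbits - 1))

prbs = list(lfsr_bits(count=30))
print(prbs[:15])
```

A maximal-length 4-bit register cycles through all 15 non-zero states, so the output repeats with period 15 and contains 8 ones per period.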
Test Time vs. Test Accuracy Trade-Off
• Guard-bands test time = 15 ms vs. specification test time > 0.5 s
[Figure: test error (%) vs. re-tested circuits (%) for three measurement sets f1, f2, f3; representative (re-tested, error) points: (0%, 4%), (8.1%, 2%), (15.6%, 0.6%), (20.7%, 0%)]
Experiment 2: RFMD2411
• 541 devices (training set 341, validation set 100, test set 100)
• Considered 13 specs @ 850 MHz (7 different test configurations)

                        Gain   3rd Order Intercept   Noise Figure   Other S Parameters
LNA                      x              x                  x            x    x    x
Mixer                    x              x                  x            x    -    -
LNA + Mixer (Cascade)    x              x                  x            -    -    -
Test Configuration
• A multi-tone test signal xt(t) is applied to the DUT; the response is down-converted by a mixer, low-pass filtered into xs(t), and the FFT of xs(t) forms the DUT signature
[Figure: 28-tone stimulus, DUT response, and FFT of the DUT response vs. frequency (up to ~10^8 Hz), with the noise floor visible]
Test Time vs. Test Accuracy Trade-Off
• Guard-bands test time = 5 ms vs. specification test time ~1 s
[Figure: test error (%) vs. re-tested circuits (%) for measurement sets f1, f2 (gr<2), f2 (gr<4), f3; representative (re-tested, error) points: (0%, 3.92%), (11.67%, 1.96%), (27.62%, 0%)]
Analog/RF Specification Test Compaction
Non-disruptive application:
• Instead of “other, alternate measurements” use
existing inexpensive specification tests to predict
pass/fail for the chip and eliminate expensive ones
Case-Study:
• RFCMOS zero-IF down-converter for cell-phones
• Fabricated at IBM, currently in production
• Data from 944 devices (870 pass – 74 fail)
• 136 performances (65 non-RF – 71 RF)
• Non-RF measurements assumed much cheaper than RF; aim at minimizing the number of measurements
Setup and Questions Asked
Setup:
• Split the 944 devices into a training and a test set (472 devices each, chosen uniformly at random)
• Repeat 200 times and average to obtain statistically significant results
Question #1:
• How accurately can we predict pass/fail of a device
using only a subset of non-RF measurements?
Question #2:
• How does this accuracy improve by selectively adding
a few RF measurements to this subset?
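The split-and-repeat protocol can be sketched as follows; the classifier training and error counting are left as a placeholder comment, since only the sampling procedure is specified here.

```python
import random

random.seed(0)
devices = list(range(944))                   # 944 device IDs

def random_half_split(devices):
    """Split the population into two equal halves, uniformly at random."""
    shuffled = devices[:]
    random.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

for rep in range(200):                       # 200 repetitions, then average
    train, test = random_half_split(devices)
    # ... train the classifier on `train`, count mispredictions on `test`,
    #     and accumulate the error rate for averaging ...

print(len(train), len(test))                 # -> 472 472
```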
Results
Only non-RF performances:
• 15 measurements suffice to predict correctly over 99% of the devices (in our case, 4 out of 472 devices are mispredicted on average)
Adding RF performances:
• By adding 12 RF performances to these 15 non-RF performances, the error drops to 0.38% (i.e. 1 out of 472 devices mispredicted)
What’s Next? Neuromorphic BIST
[Figure: BIST architecture — an on-chip stimulus generator drives the mixed-signal/RF circuit through an input analog mux; a programmable response acquisition block feeds a trainable on-chip neural classifier, all under BIST control, producing the BIST output]
A stand-alone BIST method for mixed-signal/RF circuits:
• On-chip generation of simple test stimulus
• On-chip acquisition of simple measurements
• On-chip implementation of trainable neural classifier
Summary
Contribution:
• A non-linear neural classifier supporting a novel
paradigm for testing analog/RF circuits
– Achieves significant reduction in test cost
without sacrificing accuracy of specification test
– Enables exploration of trade-off between
test cost and test accuracy via guard-bands
Applications:
• Non-disruptive: Specification test compaction
• Disruptive: Machine learning-based test
• Futuristic: Neuromorphic BIST
More Information
Publications:
• H-G. D. Stratigopoulos, Y. Makris, "Constructive Derivation of Analog Specification Test Criteria," Proceedings of the IEEE VLSI Test Symposium (VTS), pp. 245-251, 2005
• H-G. D. Stratigopoulos, Y. Makris, "Non-Linear Decision Boundaries for Testing Analog Circuits," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (T-CAD), pp. 1360-1373, Nov. 2005
• H-G. D. Stratigopoulos, Y. Makris, "Bridging the Accuracy of Functional and Machine-Learning-Based Mixed-Signal Testing," Proceedings of the IEEE VLSI Test Symposium (VTS), pp. 406-411, 2006
• H-G. D. Stratigopoulos, P. Drineas, M. Slamani, Y. Makris, "Non-RF to RF Test Correlation Using Learning Machines: A Case Study," Proceedings of the IEEE VLSI Test Symposium (VTS), pp. 9-14, 2007
• H-G. D. Stratigopoulos, Y. Makris, "Error Moderation in Low-Cost Machine Learning-Based Analog/RF Testing," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (T-CAD), pp. 339-351, Feb. 2008
Contact:
•
E-mail: yiorgos.makris@yale.edu, haralampos.stratigopoulos@imag.fr