_________ __ ________________ 2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA...

advertisement
MIT OpenCourseWare
http://ocw.mit.edu
_________
___
2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303)
Spring 2008
For information about citing these materials or our Terms of Use, visit: ________________
http://ocw.mit.edu/terms.
Control of
Manufacturing Processes
Subject 2.830/6.780/ESD.63
Spring 2008
Lecture #5
Probability Models, Parameter
Estimation, and Sampling
February 21, 2008
Manufacturing
1
The Normal Distribution
1
p(x) =
e
σ 2π
1 ⎛ x− μ⎞
− ⎜
⎟
2⎝ σ ⎠
2
0.4
“Standard normal”
z=
0.3
x−μ
σ
0.2
μz=0
σz=1
0.1
0
-4
-3
-2
-1
0
1
2
3
4
z
Manufacturing
2
Properties of the Normal pdf
• Symmetric about mean
• Only two parameters:
μ and σ
1
p(x) =
e
σ 2π
1 ⎛ x− μ⎞
− ⎜
⎟
⎝
2 σ ⎠
2
• Mean (μ) and Variance ( σ2 ) have well known
“estimators” (average and sample variance)
Manufacturing
3
Testing the Model:
e.g. Is the Process “Normal” ?
• Is the underlying distribution really normal?
– Look at histogram
– Look at curve fit to histogram
– Look at % of data in 1, 2 and 3σ bands
• Confidence Intervals
– Look at “kurtosis”
• Measure of deviation from normal
– Probability (or qq) plots (see Mont. 3-3.7 or MATLAB
stats toolbox)
Manufacturing
4
Kurtosis: Deviation from Normal
k = 0 - normal
k > 0 - more “peaked”
k < 0 - more “flat”
For sampled data:
4
⎡
n(n + 1)
3(n − 1)2
⎛ xi − x ⎞ ⎤
k=⎢
⎟ ⎥−
⎜
∑
⎢⎣ (n − 1)(n − 2)(n − 3) ⎝ s ⎠ ⎥⎦ (n − 2)(n − 3)
Manufacturing
5
Kurtosis for Some Common Distributions
D: Laplace (k = 3)
L: logistic (k = 1.2)
N: normal (k = 0)
U: uniform (k = -1.2)
Source: Wikimedia Commons, http://commons.wikimedia.org
Manufacturing
6
Quantile-Quantile (qq) Plots
• Plot
– normalized (mean
centered and
scaled to s)
vs.
– theoretical position
of unit normal
distribution for
ordered data
• Normal distribution:
data should fall along
line
Manufacturing
Source: Wikimedia Commons, http://commons.wikimedia.org
7
Guaranteeing “Normality”
The Central Limit Theorem
– If x1, x2 ,x3 ...xN … are N independent observations of
a random variable with “moments” μx and σ2x,
– The distribution of the sum of all the samples will
tend toward normal.
Manufacturing
8
Example: Uniformly Distributed Data
70
60
x1
50
40
Sum of 100 sets of
1000 points each
30
100
y = ∑ xi
20
10
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
+
i =1
150
70
60
50
x2
40
30
20
100
10
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
+...
50
70
60
50
x100
40
30
20
0
35
40
45
50
55
60
65
10
0
0
0.1
0.2
0.3
0.4
0.5
0.6
Manufacturing
0.7
0.8
0.9
1
9
Sampling: Using Measurements (Data)
to Model the Random Process
•
•
In general p(x) is unknown
Data can suggest form of p(x)
– e.g.. uniform, normal, weibull, etc.
•
Data can be used to estimate parameters of distributions
– e.g. μ and σ for normal distribution: p(x) = N(μ ,σ2)
•
How to estimate
– Sample Statistics
•
Uncertainty in estimates
1
p(x) =
e
σ 2π
1 ⎛ x− μ⎞
− ⎜
⎟
2⎝ σ ⎠
2
– Sample Statistic pdf’s
Manufacturing
10
Sample Statistics
Average or sample mean
Sample variance
Sample standard deviation
Manufacturing
11
Sample Mean Uncertainty
• If all xi come from a distribution with μx and σ2x, and
we divide the sum by n:
1 n
x = ∑ xi
n i =1
Then:
x = c1 x1 + c2 x2 + c3 x 3 + K cn xn
1
ci =
n
μx = μx
Manufacturing
and
1 2
1
σ = σ x or σ x =
σx
n
n
2
x
12
Manufacturing as Random Processes
• All physical processes have a degree of
natural randomness
• We can model this behavior using probability
distribution functions
• We can calibrate and evaluate the quality of
this model from measurement data
Manufacturing
13
Formal Use of Statistical Models
• Discrete Variable Distributions and Uses
– Attribute Modeling
• Sampling: Key distributions arising in sampling
• Chi-square, t, and F distributions
• Estimation:
– Reasoning about the population based on a sample
• Some basic confidence intervals
• Estimate of mean with variance known
• Estimate of mean with variance not known
• Estimate of variance
• Hypothesis tests
Manufacturing
14
Discrete Distribution: Bernoulli
Bernoulli trial: an experiment with two outcomes
Probability density function (pdf):
f(x)
¾
¼
p
1-p
0
Manufacturing
1
x
15
Discrete Distribution: Binomial
Repeated random Bernoulli trials
• n is the number of trials
• p is the probability of “success” on any one trial
• x is the number of successes in n trials
Manufacturing
16
Binomial Distribution
Binomial Distribution
1.2
1
0.8
0.25
Series1
0.6
0.4
0.2
29
27
25
23
21
19
17
15
9
13
7
11
5
3
1
0
0.2
e.g. expected
number of “rejects”
0.15
0.1
0.05
29
27
25
23
21
19
17
15
13
11
9
7
5
3
1
0
Number of "Successes"
Manufacturing
17
Discrete Distribution: Poisson
Mean:
Variance:
Example applications:
# misprints on page(s) of a book
# defects on a wafer
• Poisson is a good approximation to Binomial when n is large
and p is small (< 0.1)
Manufacturing
18
Poisson Distributions
Poisson Distribution
0.2
λ=5
0.18
0.16
0.14
0.12
0.1
c
0.08
Poisson Distribution
0.06
0.1
0.04
λ=20
0.09
0.02
0.08
0
1
3
5
7
9
11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
0.07
0.06
Poisson Distribution
0.05
c
0.04
0.08
0.03
λ=30
0.07
0.06
0.02
0.01
0.05
0
1
0.04
3
5
7
9
11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
0.03
0.02
e.g. defects/device
0.01
0
1
3
5
7
9
11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41
Events per unit
Manufacturing
19
Back to Continuous Distributions
• Uniform Distribution
• Normal Distribution
– Unit (Standard) Normal Distribution
Manufacturing
20
Continuous Distribution: Uniform
1
cdf
a
b
x
a
b
x
pdf
Manufacturing
21
Standard Questions For a Known cdf or pdf
• Probability x less than or
equal to some value
• Probability x sits within
some range
Manufacturing
1
a
b
x
a
b
x
22
Continuous Distribution: Normal or Gaussian
1
0.99865
cdf
0.977
0.84
0.5
0.00135
0.0227
0.16
0
pdf
Manufacturing
23
Continuous Distribution: Unit Normal
• Normalization
Mean
Variance
pdf
cdf
Manufacturing
24
Using the Unit Normal pdf and cdf
• We often want to talk
about “percentage points”
of the distribution – portion
in the tails
1
0.9
0.5
0.1
Manufacturing
25
Use of the pdf:
Location of Data
• How likely are certain values of the random
variable?
• For a “Standard Normal” Distribution:
z=
(x − μ )
σ
μ=0
N(0,1)
σ=1
0.4
z = 1 ⇒ x = 1σ
z = 2 ⇒ x = 2σ
z = 3 ⇒ x = 3σ
0.3
0.2
0.1
0
-4
Manufacturing
-3
-2
-1
0
1
2
3
4
26
Location of Data
P(-1≤ z ≤ 1) = P(z ≤1) - P(z ≤-1) = Φ(1) - Φ(-1)
(+ 1σ)
= 0.841 - (1-0.841) = 0.682
P(-2≤ z ≤ 2) = P(z ≤2) - P(z ≤-2) = 0.977 - (1-0.977) = 0.954
(+ 2σ)
P(-3≤ z ≤ 3) = P(z ≤3) - P(z ≤-3) = 0.998 - (1-0.998) = 0.997
(+ 3σ)
Φ(z) tabulated (e.g. p. 752 of Montgomery)
Manufacturing
27
Statistics
The field of statistics is about reasoning in the
face of uncertainty, based on evidence from
observed data
• Beliefs:
– Probability distribution or probabilistic model form
– Distribution/model parameters
• Evidence:
– Finite set of observations or data drawn from a
population (experimental measurements or
observations)
• Models:
– Seek to explain data wrt a model of their probability
Manufacturing
29
Sampling to Determine Parameters of the
Parent Probability Distribution
• Assume Process Under Study has a Parent Distribution p(x)
• Take “n” Samples From the Process Output (xi)
• Look at Sample Statistics (e.g. sample mean and sample
variance)
• Relationship to Parent
• Both are Random Variables
• Both Have Their Own Probability Distributions
• Inferences about the process (the parent distribution) via
Inferences about the derived sampling distribution
Manufacturing
30
Moments of the Population vs. Sample Statistics
Underlying model or
Population Probability
Sample Statistics
• Mean
• Variance
• Standard
Deviation
• Covariance
• Correlation
Coefficient
Manufacturing
31
Sampling and Estimation
• Sampling: act of making observations from
populations
• Random sampling: when each observation is
identically and independently distributed (IID)
• Statistic: a function of sample data; a value that can
be computed from data (contains no unknowns)
– Average, median, standard deviation
– Statistics are by definition also random variables
Manufacturing
32
Population vs. Sampling Distribution
Population
(“true”)probability density function)
n = 20
Sample Mean
(statistic)
Sample Mean Distribution
(sampling distribution)
n = 10
n=2
n=1
Manufacturing
33
Sampling and Estimation, cont.
• Sampling
• Random sampling
• Statistic
• A statistic is a random variable, which itself has a
sampling (probability) distribution
– I.e., if we take multiple random samples, the value for the
statistic will be different for each set of samples, but will be
governed by the same sampling distribution
• If we know the appropriate sampling distribution, we can
reason about the underlying population based on the
observed value of a statistic
– E.g. we calculate a sample mean from a random sample; in
what range do we think the actual (population) mean sits?
Manufacturing
34
Sampling and Estimation – An Example
•
Suppose we know that the thickness of
a part is normally distributed with std.
dev. of 10:
•
We sample n = 50 random parts and
compute the mean part thickness:
•
First question: What is distribution of
the mean of T =
•
Second question: can we use
knowledge of
distribution to reason
about the actual (population) mean μ
given observed (sample) mean?
Manufacturing
35
Estimation and Confidence Intervals
• Point Estimation:
– Find best values for parameters of a distribution
– Should be
• Unbiased: expected value of estimate should be true value
• Minimum variance: should be estimator with smallest variance
• Interval Estimation:
– Give bounds that contain actual value with a given
probability
– Must know sampling distribution!
Manufacturing
36
Confidence Intervals: Variance Known
• We know σ, e.g. from historical data
• Estimate mean in some interval to (1-α)100% confidence
Remember the unit normal
percentage points
1
Apply to the sampling
distribution for the sample
mean
Manufacturing
0.9
0.5
0.1
37
Example, Cont’d
•
Second question: can we use knowledge of
distribution to
reason about the actual (population) mean μ given observed
(sample) mean?
n = 50
95% confidence interval, α = 0.05
~95% of distribution
lies within +/- 2σ of mean
Manufacturing
38
Summary
• Process as Random Variable
– Histograms to pdf’s
• Different Distributions for Different Processes
– Discrete or Binary (e.g. Defects)
– Continuous (e.g. Dimensional Variation)
• Parent Distributions and Sampling
– Estimating the Parent from Data
• Use of Distributions to establish “Confidence” on
Parameter Estimates
Manufacturing
39
Download