Supplemental Material for Chapter 5

S5-1. $s^2$ Is Not Always an Unbiased Estimator of $\sigma^2$
An important property of the sample variance is that it is an unbiased estimator of the
population variance, as demonstrated in Section S3-3 of the Supplemental Text Material.
However, this unbiased property depends on the assumption that the sample data has
been drawn from a stable process; that is, a process that is in statistical control. In
statistical quality control work we sometimes make this assumption, but if it is incorrect,
it can have serious consequences on the estimates of the process parameters we obtain.
To illustrate, suppose that in the sequence of individual observations

$$x_1, x_2, \ldots, x_t, x_{t+1}, \ldots, x_m$$

the process is in control with mean $\mu_0$ and standard deviation $\sigma$ for the first $t$
observations, but between $x_t$ and $x_{t+1}$ an assignable cause occurs that results in a sustained
shift in the process mean to a new level $\mu = \mu_0 + \delta\sigma$, and the mean remains at this new
level for the remaining sample observations $x_{t+1}, \ldots, x_m$. Under these conditions,
Woodall and Montgomery (2000-01) show that

$$E(s^2) = \sigma^2 + \frac{t(m-t)}{m(m-1)}(\delta\sigma)^2. \qquad (S5-1)$$

In fact, this result holds for any case in which the mean of $t$ of the observations is $\mu_0$ and
the mean of the remaining $m - t$ observations is $\mu_0 + \delta\sigma$, since the order of the observations is
not relevant in computing $s^2$. Note that $s^2$ is biased upwards; that is, $s^2$ tends to
overestimate $\sigma^2$. Furthermore, the extent of the bias depends on the magnitude of the
shift in the mean ($\delta\sigma$), the time period in which the shift occurs ($t$), and the
number of available observations ($m$). For example, if there are $m = 25$ observations and
the process mean shifts from $\mu_0$ to $\mu = \mu_0 + \sigma$ (that is, $\delta = 1$) between the 20th and the
21st observation ($t = 20$), then $s^2$ will overestimate $\sigma^2$ by 16.7% on average. If the shift in
the mean occurs earlier, say between the 10th and 11th observations ($t = 10$), then $s^2$ will
overestimate $\sigma^2$ by 25% on average.
The proof of Equation (S5-1) is straightforward. Since we can write

$$s^2 = \frac{1}{m-1}\left[\sum_{i=1}^{m} x_i^2 - m\bar{x}^2\right]$$

then

$$E(s^2) = E\left\{\frac{1}{m-1}\left[\sum_{i=1}^{m} x_i^2 - m\bar{x}^2\right]\right\} = \frac{1}{m-1}\left[\sum_{i=1}^{m} E(x_i^2) - mE(\bar{x}^2)\right]$$
Now

$$\frac{1}{m-1}\sum_{i=1}^{m} E(x_i^2) = \frac{1}{m-1}\left[\sum_{i=1}^{t} E(x_i^2) + \sum_{i=t+1}^{m} E(x_i^2)\right]$$
$$= \frac{1}{m-1}\left[t\left(\sigma^2 + \mu_0^2\right) + (m-t)\left(\sigma^2 + (\mu_0 + \delta\sigma)^2\right)\right]$$
$$= \frac{1}{m-1}\left[t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2 + m\sigma^2\right]$$
and, since $E(\bar{x}) = \mu_0 + (m-t)\delta\sigma/m$ and $\operatorname{Var}(\bar{x}) = \sigma^2/m$,

$$\frac{1}{m-1}\,mE(\bar{x}^2) = \frac{m}{m-1}\left[\frac{\sigma^2}{m} + \left(\mu_0 + \frac{(m-t)\delta\sigma}{m}\right)^2\right]$$
Therefore

$$E(s^2) = \frac{1}{m-1}\left[t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2 + m\sigma^2 - \sigma^2 - m\left(\mu_0 + \frac{(m-t)\delta\sigma}{m}\right)^2\right]$$
$$= \sigma^2 + \frac{1}{m-1}\left[(m-t)(\delta\sigma)^2 - \frac{(m-t)^2}{m}(\delta\sigma)^2\right]$$
$$= \sigma^2 + \frac{1}{m-1}(m-t)(\delta\sigma)^2\left[1 - \frac{m-t}{m}\right]$$
$$= \sigma^2 + \frac{t(m-t)}{m(m-1)}(\delta\sigma)^2$$
S5-2. Should We Use $d_2$ or $d_2^*$ in Estimating $\sigma$ via the Range Method?
In the textbook, we make use of the range method for estimating the process standard
deviation, particularly in constructing variables control charts (for example, see the
$\bar{x}$ and $R$ charts of Chapter 5). We use the estimator $\bar{R}/d_2$. Sometimes an alternative
estimator, $\bar{R}/d_2^*$, is encountered. In this section we discuss the nature and potential uses
of these two estimators. Much of this discussion is adapted from Woodall and
Montgomery (2000-01). The original work on using ranges to estimate the standard
deviation of a normal distribution is due to Tippett (1925). See also the paper by Duncan
(1955).
Suppose one has $m$ independent samples, each of size $n$, from one or more populations
assumed to be normally distributed with standard deviation $\sigma$. We denote the sample
ranges of the $m$ samples or subgroups as $R_1, R_2, \ldots, R_m$. Note that this type of data arises
frequently in statistical process control applications and gauge repeatability and
reproducibility (R & R) studies (refer to Chapter 7). It is well known that
$E(R_i) = d_2\sigma$ and $\operatorname{Var}(R_i) = d_3^2\sigma^2$ for $i = 1, 2, \ldots, m$, where $d_2$ and $d_3$ are constants that
depend on the sample size $n$. Values of these constants are tabled in virtually all
textbooks and training materials on statistical process control. See, for example,
Appendix Table VI for values of $d_2$ and $d_3$ for $n = 2$ to 25.
There are two estimators of the process standard deviation $\sigma$ based on the average sample
range,

$$\bar{R} = \frac{\sum_{i=1}^{m} R_i}{m}, \qquad (S5-2)$$

that are commonly encountered in practice. The estimator

$$\hat{\sigma}_1 = \bar{R}/d_2 \qquad (S5-3)$$

is widely used after the application of control charts to estimate process variability and to
assess process capability. In Chapter 3 we report the relative efficiency of the range
estimator given in Equation (S5-3) to the sample standard deviation for various sample
sizes. For example, if $n = 5$, the relative efficiency of the range estimator compared to
the sample standard deviation is 0.955. Consequently, there is little practical difference
between the two estimators. Equation (S5-3) is also frequently used to determine the
usual 3-sigma limits on the Shewhart $\bar{x}$ chart in statistical process control. The
estimator

$$\hat{\sigma}_2 = \bar{R}/d_2^* \qquad (S5-4)$$

is more often used in gauge R & R studies and in variables acceptance sampling. Here
$d_2^*$ is a constant whose value depends on both $m$ and $n$. See Chrysler, Ford, and GM
(1995), Military Standard 414 (1957), and Duncan (1986).
Patnaik (1950) showed that $\bar{R}/\sigma$ is distributed approximately as a multiple of a $\chi$ distribution. In particular, $\bar{R}/\sigma$ is distributed approximately as $d_2^*\chi/\sqrt{\nu}$, where $\nu$
represents the fractional degrees of freedom for the $\chi$ distribution. Patnaik (1950) used
the approximation

$$d_2^* \approx d_2\left(1 + \frac{1}{4\nu} + \frac{1}{32\nu^2} - \frac{5}{128\nu^3}\right). \qquad (S5-5)$$
It has been pointed out by Duncan (1986), Wheeler (1995), and Luko (1996), among
others, that $\hat{\sigma}_1$ is an unbiased estimator of $\sigma$ and that $\hat{\sigma}_2^2$ is an unbiased estimator of $\sigma^2$.
For $\hat{\sigma}_2^2$ to be an unbiased estimator of $\sigma^2$, however, David (1951) showed that no
approximation for $d_2^*$ is required. He showed that

$$d_2^* = (d_2^2 + V_n/m)^{1/2}, \qquad (S5-6)$$

where $V_n$ is the variance of the sample range for a sample of size $n$ from a normal
population with unit variance. It is important to note that $V_n = d_3^2$, so Equation (S5-6)
can easily be used to determine values of $d_2^*$ from the widely available tables of $d_2$ and $d_3$.
Thus, a table of $d_2^*$ values, such as the ones given by Duncan (1986), Wheeler (1995), and
many others, is not required so long as values of $d_2$ and $d_3$ are tabled, as they usually are
(once again, see Appendix Table VI). Also, use of the approximation

$$d_2^* \approx d_2\left(1 + \frac{1}{4\nu}\right)$$

given by Duncan (1986) and Wheeler (1995) becomes unnecessary.
The table of $d_2^*$ values given by Duncan (1986) is the one most frequently recommended. If a
table is required, the ones by Nelson (1975) and Luko (1996) provide values of $d_2^*$ that
are slightly more accurate, since their values are based on Equation (S5-6).
It has been noted that as $m$ increases, $d_2^*$ approaches $d_2$. This has frequently been argued
using Equation (S5-5) and noting that $\nu$ increases as $m$ increases. The fact that
$d_2^*$ approaches $d_2$ as $m$ increases is more easily seen, however, from Equation (S5-6), as
pointed out by Luko (1996).
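Computing $d_2^*$ from Equation (S5-6) is a one-line calculation from the tabled constants. The sketch below (illustrative code; the $d_2$, $d_3$ values are rounded table entries) also shows $d_2^*$ converging to $d_2$ as $m$ grows:

```python
import math

# d2 and d3 control chart constants for n = 2..5 (rounded table values)
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
D3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}

def d2_star(n, m):
    """Equation (S5-6) with V_n = d3^2: d2* = (d2^2 + d3^2/m)^(1/2)."""
    return math.sqrt(D2[n] ** 2 + D3[n] ** 2 / m)

for m in (1, 5, 20, 100):
    print(f"n=5, m={m:3d}: d2* = {d2_star(5, m):.4f}")
# d2* decreases toward d2 = 2.326 as the number of subgroups m increases
```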
Sometimes use of Equation (S5-4) is recommended without any explanation. See, for
example, the AIAG measurement systems capability guidelines [Chrysler, Ford, and GM
(1995)]. The choice between $\hat{\sigma}_1$ and $\hat{\sigma}_2$ has often not been explained clearly in the
literature. It is frequently stated that the use of Equation (S5-3) requires that $\bar{R}$ be
obtained from a fairly large number of individual ranges. See, for example, Bissell
(1994, p. 289). Grant and Leavenworth (1996, p. 128) state that “Strictly speaking, the
validity of the exact value of the $d_2$ factor assumes that the ranges have been averaged for
a fair number of subgroups, say, 20 or more. When only a few subgroups are available, a
better estimate of $\sigma$ is obtained using a factor that writers on statistics have designated as
$d_2^*$.” Nelson (1975) writes, “If fewer than a large number of subgroups are used,
Equation (S5-3) gives an estimate of $\sigma$ which does not have the same expected value as
the standard deviation estimator.” In fact, Equation (S5-3) produces an unbiased
estimator of $\sigma$ regardless of the number of samples $m$, whereas the pooled standard
deviation does not (refer to Section S3-5 of the Supplemental Text Material). The choice
between $\hat{\sigma}_1$ and $\hat{\sigma}_2$ depends upon whether one is interested in obtaining an unbiased
estimator of $\sigma$ or of $\sigma^2$. As $m$ increases, the estimators (S5-3) and (S5-4) become
equivalent, since each is a consistent estimator of $\sigma$.
It is interesting to note that among all estimators of the form $c\bar{R}$ $(c > 0)$, the one
minimizing the mean squared error in estimating $\sigma$ has

$$c = \frac{d_2}{(d_2^*)^2}.$$

The derivation of this result is in the proofs at the end of this section. If we let

$$\hat{\sigma}_3 = \frac{d_2}{(d_2^*)^2}\,\bar{R}$$

then it is shown in the proofs below that

$$MSE(\hat{\sigma}_3) = \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right).$$
Luko (1996) compared the mean squared error of $\hat{\sigma}_2$ in estimating $\sigma$ to that of $\hat{\sigma}_1$ and
recommended $\hat{\sigma}_2$ on the basis of uniformly lower MSE values. By definition, $\hat{\sigma}_3$ leads to
a further reduction in MSE. It is shown in the proofs at the end of this section that the
percentage reduction in MSE from using $\hat{\sigma}_3$ instead of $\hat{\sigma}_2$ is

$$50\left(\frac{d_2^* - d_2}{d_2^*}\right).$$

Values of the percentage reduction are given in Table S5-1. Notice that when both the
number of subgroups and the subgroup size are small, a moderate reduction in mean
squared error can be obtained by using $\hat{\sigma}_3$.
Table S5-1. Percentage Reduction in Mean Squared Error from Using $\hat{\sigma}_3$ Instead of $\hat{\sigma}_2$

Subgroup                          Number of Subgroups, m
Size, n       1        2        3        4        5        7       10       15       20
   2      10.1191   5.9077   4.1769   3.2314   2.6352   1.9251   1.3711   0.9267   0.6998
   3       5.7269   3.1238   2.1485   1.6374   1.3228   0.9556   0.6747   0.4528   0.3408
   4       4.0231   2.1379   1.4560   1.1040   0.8890   0.6399   0.4505   0.3017   0.2268
   5       3.1291   1.6403   1.1116   0.8407   0.6759   0.4856   0.3414   0.2284   0.1716
   6       2.5846   1.3437   0.9079   0.6856   0.5507   0.3952   0.2776   0.1856   0.1394
   7       2.2160   1.1457   0.7726   0.5828   0.4679   0.3355   0.2356   0.1574   0.1182
   8       1.9532   1.0058   0.6773   0.5106   0.4097   0.2937   0.2061   0.1377   0.1034
   9       1.7536   0.9003   0.6056   0.4563   0.3660   0.2623   0.1840   0.1229   0.0923
  10       1.5963   0.8176   0.5495   0.4138   0.3319   0.2377   0.1668   0.1114   0.0836
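The entries of Table S5-1 follow directly from the tabled constants by combining Equation (S5-6) with the percentage-reduction formula $50(d_2^* - d_2)/d_2^*$. A brief sketch (illustrative code; with rounded $d_2$, $d_3$ values the last decimal may differ slightly from the table):

```python
import math

# d2 and d3 constants for two subgroup sizes (rounded table values)
CONSTANTS = {2: (1.128, 0.853), 5: (2.326, 0.864)}

def pct_reduction(n, m):
    """Percentage reduction in MSE from using sigma3-hat instead of
    sigma2-hat: 50 * (d2* - d2) / d2*, with d2* from Equation (S5-6)."""
    d2, d3 = CONSTANTS[n]
    d2s = math.sqrt(d2 ** 2 + d3 ** 2 / m)
    return 50.0 * (d2s - d2) / d2s

print(round(pct_reduction(2, 1), 3))   # Table S5-1 lists 10.1191 for n = 2, m = 1
print(round(pct_reduction(5, 10), 3))  # Table S5-1 lists 0.3414 for n = 5, m = 10
```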
Proofs
Result 1: Let $\hat{\sigma} = c\bar{R}$; then $MSE(\hat{\sigma}) = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1]$.

Proof:

$$MSE(\hat{\sigma}) = E[(c\bar{R} - \sigma)^2] = E[c^2\bar{R}^2 - 2c\sigma\bar{R} + \sigma^2] = c^2E(\bar{R}^2) - 2c\sigma E(\bar{R}) + \sigma^2$$

Now $E(\bar{R}^2) = \operatorname{Var}(\bar{R}) + [E(\bar{R})]^2 = d_3^2\sigma^2/m + (d_2\sigma)^2$. Thus

$$MSE(\hat{\sigma}) = c^2 d_3^2\sigma^2/m + c^2 d_2^2\sigma^2 - 2c\sigma(d_2\sigma) + \sigma^2 = \sigma^2[c^2(d_3^2/m + d_2^2) - 2cd_2 + 1] = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1]$$
Result 2: The value of $c$ that minimizes the mean squared error of estimators of the form
$c\bar{R}$ in estimating $\sigma$ is

$$c = \frac{d_2}{(d_2^*)^2}.$$

Proof:

$$MSE(\hat{\sigma}) = \sigma^2[c^2(d_2^*)^2 - 2cd_2 + 1]$$
$$\frac{d\,MSE(\hat{\sigma})}{dc} = \sigma^2[2c(d_2^*)^2 - 2d_2] = 0 \;\Rightarrow\; c = \frac{d_2}{(d_2^*)^2}.$$
Result 3: The mean squared error of $\hat{\sigma}_3 = \dfrac{d_2}{(d_2^*)^2}\bar{R}$ is $\sigma^2\left(1 - \dfrac{d_2^2}{(d_2^*)^2}\right)$.

Proof:

$$MSE(\hat{\sigma}_3) = \sigma^2\left[\frac{d_2^2}{(d_2^*)^4}(d_2^*)^2 - 2\frac{d_2}{(d_2^*)^2}d_2 + 1\right] \quad \text{(from Result 1)}$$
$$= \sigma^2\left[\frac{d_2^2}{(d_2^*)^2} - \frac{2d_2^2}{(d_2^*)^2} + 1\right]$$
$$= \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right)$$

Note that $MSE(\hat{\sigma}_3) \rightarrow 0$ as $n \rightarrow \infty$ and $MSE(\hat{\sigma}_3) \rightarrow 0$ as $m \rightarrow \infty$.
Result 4: Let $\hat{\sigma}_2 = \dfrac{\bar{R}}{d_2^*}$ and $\hat{\sigma}_3 = \dfrac{d_2}{(d_2^*)^2}\bar{R}$. Then

$$\left[\frac{MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3)}{MSE(\hat{\sigma}_2)}\right] \times 100,$$

the percent reduction in mean squared error from using the minimum mean squared error
estimator $\hat{\sigma}_3$ instead of $\bar{R}/d_2^*$ [as recommended by Luko (1996)], is

$$50\left(\frac{d_2^* - d_2}{d_2^*}\right).$$

Proof:

Luko (1996) shows that $MSE(\hat{\sigma}_2) = \dfrac{2\sigma^2(d_2^* - d_2)}{d_2^*}$; therefore

$$MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3) = \frac{2\sigma^2(d_2^* - d_2)}{d_2^*} - \sigma^2\left(1 - \frac{d_2^2}{(d_2^*)^2}\right)$$
$$= \sigma^2\left[\frac{2(d_2^* - d_2)}{d_2^*} - \frac{(d_2^*)^2 - d_2^2}{(d_2^*)^2}\right]$$
$$= \sigma^2\left[\frac{2(d_2^* - d_2)}{d_2^*} - \frac{(d_2^* - d_2)(d_2^* + d_2)}{(d_2^*)^2}\right]$$
$$= \sigma^2\,\frac{d_2^* - d_2}{d_2^*}\left(2 - \frac{d_2^* + d_2}{d_2^*}\right)$$
$$= \sigma^2\,\frac{d_2^* - d_2}{d_2^*}\cdot\frac{d_2^* - d_2}{d_2^*}$$
$$= \sigma^2\,\frac{(d_2^* - d_2)^2}{(d_2^*)^2}$$

Consequently,

$$\left[\frac{MSE(\hat{\sigma}_2) - MSE(\hat{\sigma}_3)}{MSE(\hat{\sigma}_2)}\right] \times 100 = \frac{(d_2^* - d_2)^2/(d_2^*)^2}{2(d_2^* - d_2)/d_2^*} \times 100 = 50\left(\frac{d_2^* - d_2}{d_2^*}\right).$$
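The closed forms in Results 1–4 can be verified numerically for any particular $n$ and $m$. The sketch below (illustrative code) checks them at $n = 5$, $m = 5$, using $d_2 = 2.326$ and $d_3 = 0.864$ and setting $\sigma = 1$ without loss of generality:

```python
import math

d2, d3, m, sigma = 2.326, 0.864, 5, 1.0
d2s = math.sqrt(d2 ** 2 + d3 ** 2 / m)  # Equation (S5-6)

def mse(c):
    """Result 1: MSE of the estimator c * Rbar of sigma."""
    return sigma ** 2 * (c ** 2 * d2s ** 2 - 2 * c * d2 + 1)

mse2 = mse(1 / d2s)        # sigma2-hat = Rbar / d2*
mse3 = mse(d2 / d2s ** 2)  # sigma3-hat, the MSE-optimal multiple (Result 2)

# Result 3 closed form, Luko's closed form for MSE(sigma2-hat), and Result 4
assert abs(mse3 - sigma ** 2 * (1 - d2 ** 2 / d2s ** 2)) < 1e-12
assert abs(mse2 - 2 * sigma ** 2 * (d2s - d2) / d2s) < 1e-12
assert abs(100 * (mse2 - mse3) / mse2 - 50 * (d2s - d2) / d2s) < 1e-9
print("Results 1-4 identities hold at n = 5, m = 5")
```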
S5-3. Determining When the Process Has Shifted
Control charts monitor a process to determine whether an assignable cause has occurred.
Knowing when the assignable cause has occurred would be very helpful in its
identification and eventual removal. Unfortunately, the time of occurrence of the
assignable cause does not always coincide with the control chart signal. In fact, given
what is known about average run length performance of control charts, it is actually very
unlikely that the assignable cause occurs at the time of the signal. Therefore, when a
signal occurs, the control chart analyst should look earlier in the process history to
determine the assignable cause.
But where should we start? The Cusum control chart provides some guidance – simply
search backwards on the Cusum status chart to find the point in time where the Cusum
last crossed zero (refer to Chapter 8). However, the Shewhart $\bar{x}$ control chart provides no
such simple guidance.
Samuel, Pignatiello, and Calvin (1998) use some theoretical results by Hinkley (1970) on
change-point problems to suggest a procedure for determining the time of a shift in the
process mean following a signal on the Shewhart $\bar{x}$ control chart. They assume the
standard $\bar{x}$ control chart with in-control value of the process mean $\mu_0$. Suppose that the
chart signals at subgroup average $\bar{x}_T$. Now the in-control subgroup averages are $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_t$, and
the out-of-control subgroup averages are $\bar{x}_{t+1}, \bar{x}_{t+2}, \ldots, \bar{x}_T$, where obviously $t < T$. Their procedure
consists of finding the value of $t$ in the range $0 \le t < T$ that maximizes

$$C_t = (T - t)\left(\bar{\bar{x}}_{T,t} - \mu_0\right)^2$$

where

$$\bar{\bar{x}}_{T,t} = \frac{1}{T - t}\sum_{i=t+1}^{T} \bar{x}_i$$

is the reverse cumulative average; that is, the average of the $T - t$
most recent subgroup averages. The value of $t$ that maximizes $C_t$ is the estimator of the
last subgroup that was selected from the in-control process.
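The maximization over $t$ is simple to program. The sketch below (illustrative code and data, not from Samuel, Pignatiello, and Calvin) returns the estimated last in-control subgroup given the subgroup averages observed up to the signal:

```python
def change_point(xbars, mu0):
    """Estimate the last in-control subgroup: the t in 0 <= t < T maximizing
    C_t = (T - t) * (reverse cumulative average - mu0)^2."""
    T = len(xbars)
    best_t, best_C = 0, float("-inf")
    for t in range(T):
        rev_avg = sum(xbars[t:]) / (T - t)  # average of the T - t most recent
        C = (T - t) * (rev_avg - mu0) ** 2
        if C > best_C:
            best_t, best_C = t, C
    return best_t

# Ten averages near mu0 = 0 followed by four shifted averages; the chart is
# assumed to have signaled at the last subgroup.
xbars = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1,
         1.1, 0.9, 1.2, 1.0]
print(change_point(xbars, mu0=0.0))  # estimates subgroup 10 as the last in-control one
```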
You may also find it interesting and useful to read the material on change point
procedures for process monitoring in Chapter 9.
S5-4. More about Monitoring Variability with Individual Observations
As noted in the textbook, when one is monitoring a process using individual (as opposed
to subgrouped) measurements, a moving range control chart does not provide much
useful additional information about shifts in process variance beyond that which is
provided by the individuals control chart (or a Cusum or EWMA of the individual
observations). Sullivan and Woodall (1996) describe a change-point procedure that is
much more effective than the individuals (or Cusum/EWMA) and moving range charts.
Assume that the process is normally distributed. Then divide the n observations into two
partitions of n1 and n2 observations, with n1 = 2, 3, … , n – 2 observations in the first
partition and n2 = n – n1 in the second. The log-likelihood of each partition is computed, using
the maximum likelihood estimators of the mean and variance in each partition. The two
log-likelihood functions are then added; call the sum La.
Let L0 denote the log-likelihood computed without any partition. Then find the
maximum value of the likelihood ratio statistic r = –2(L0 – La). The value of n1 at which this
maximum occurs is the change point; that is, it is the estimate of the point in time at
which a change in either the process mean or the process variance (or both) has occurred.
Sullivan and Woodall show how to obtain a control chart for the likelihood ratio r. The
control limits must be obtained either by simulation or by approximation. When the
control chart signals, the quantity r can be decomposed into two components: one that is
zero if the means in the two partitions are equal, and another that is zero if the variances in
the two partitions are equal. The relative size of these two components suggests whether it is
the process mean or the process variance that has shifted.
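A minimal sketch of this partition search (illustrative code, not the authors' implementation, and omitting the control-limit calibration they describe) is:

```python
import math

def max_loglik(xs):
    """Maximized normal log-likelihood of a sample (MLEs of mean and variance).
    Assumes the sample variance is nonzero."""
    k = len(xs)
    mu = sum(xs) / k
    var = sum((x - mu) ** 2 for x in xs) / k
    return -0.5 * k * (math.log(2 * math.pi * var) + 1)

def lr_change_point(xs):
    """Split at n1 = 2..n-2, add the two partition log-likelihoods (La), and
    compare with the unpartitioned log-likelihood (L0) via r = -2(L0 - La).
    Returns the maximizing n1 and the corresponding r."""
    L0 = max_loglik(xs)
    best_n1, best_r = None, float("-inf")
    for n1 in range(2, len(xs) - 1):
        La = max_loglik(xs[:n1]) + max_loglik(xs[n1:])
        r = -2.0 * (L0 - La)
        if r > best_r:
            best_n1, best_r = n1, r
    return best_n1, best_r

# Eight observations around 0, then eight around 3
data = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.3, -0.2,
        3.1, 2.8, 3.3, 2.9, 3.2, 3.0, 2.7, 3.4]
print(lr_change_point(data))  # the split lands at n1 = 8
```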
S5-5. Detecting Drifts versus Shifts in the Process Mean
In studying the performance of control charts, most of our attention is directed towards
describing what will happen on the chart following a sustained shift in the process
parameter. This is done largely for convenience, and because such performance studies
must start somewhere, and a sustained shift is certainly a likely scenario. However, a
drifting process parameter is also a likely possibility.
Aerne, Champ, and Rigdon (1991) have studied several control charting schemes when
the process mean drifts according to a linear trend. Their study encompasses the
Shewhart control chart, the Shewhart chart with supplementary runs rules, the EWMA
control chart, and the Cusum. They design the charts so that the in-control ARL is 465.
Some of the previous studies of control charts with drifting means did not do this, and the
different charts had different values of ARL0, thereby making it difficult to draw
conclusions about chart performance. See Aerne, Champ, and Rigdon (1991) for
references and further details.
They report that, in general, Cusum and EWMA charts perform better in detecting trends
than does the Shewhart control chart. For small to moderate trends, both of these charts
are significantly better than the Shewhart chart with and without runs rules. There is not
much difference in performance between the Cusum and the EWMA.
S5-6. The Mean Square Successive Difference as an Estimator of $\sigma^2$
An alternative to the moving range estimator of the process standard deviation $\sigma$ is the
mean square successive difference, an estimator of $\sigma^2$. The mean square successive
difference is defined as

$$MSSD = \frac{1}{2(n-1)}\sum_{i=2}^{n}(x_i - x_{i-1})^2$$

It is easy to show that the MSSD is an unbiased estimator of $\sigma^2$. Let $x_1, x_2, \ldots, x_n$ be a
random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$. Without any
loss of generality, we may take the mean to be zero. Then
$$E(MSSD) = E\left[\frac{1}{2(n-1)}\sum_{i=2}^{n}(x_i - x_{i-1})^2\right]$$
$$= \frac{1}{2(n-1)}E\left[\sum_{i=2}^{n}\left(x_i^2 + x_{i-1}^2 - 2x_i x_{i-1}\right)\right]$$

Because the observations are independent with mean zero, each cross-product term has
expectation zero, and $E(x_i^2) = \sigma^2$ for every $i$. Hence

$$E(MSSD) = \frac{1}{2(n-1)}\left[(n-1)\sigma^2 + (n-1)\sigma^2\right] = \frac{2(n-1)\sigma^2}{2(n-1)} = \sigma^2$$
Therefore, the mean square successive difference is an unbiased estimator of the
population variance.
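A quick simulation confirms the unbiasedness. The sketch below (illustrative code) averages the MSSD over many normal samples with $\sigma^2 = 4$:

```python
import random

def mssd(xs):
    """Mean square successive difference: sum of (x_i - x_{i-1})^2 over
    i = 2..n, divided by 2(n - 1)."""
    n = len(xs)
    return sum((xs[i] - xs[i - 1]) ** 2 for i in range(1, n)) / (2 * (n - 1))

rng = random.Random(7)
reps, n, sigma = 20000, 10, 2.0
avg = sum(mssd([rng.gauss(10.0, sigma) for _ in range(n)]) for _ in range(reps)) / reps
print(avg)  # close to sigma^2 = 4
```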