Chapter 10A
Planning Tests to Compare Populations or Processes

William Q. Meeker and Luis A. Escobar
Iowa State University and Louisiana State University

Copyright 1998-2010 W. Q. Meeker and L. A. Escobar. Complements to the authors’ text Statistical Methods for Reliability Data, John Wiley & Sons Inc., 1998.

January 13, 2014 3h 42min

Chapter 10A
Planning Comparison of Populations or Processes
Objectives

• Describe general issues in planning tests to compare two or more processes or populations.
• Describe the planning of tests to compare two population means or quantiles, assuming that the population variances are equal.
• Describe the generalization of the procedures to compare populations with different variances.

Comparison of 2 Location Parameters

[Figure: location parameters $\mu_2$ and $\mu_1$, the difference $\Delta$ between them, and the standardized difference $d = \Delta/\sigma$.]

Comparing Populations or Processes

• Product design decisions often require choosing the best from among $k$ different populations or processes.
• Suppose that the response (e.g., failure time or strength) from population $i$ follows a log-location-scale (e.g., Weibull or lognormal) distribution
$$F(t; \mu_i, \sigma) = \Phi\left[\frac{\log(t) - \mu_i}{\sigma}\right].$$
• The parameter $\mu_i$ varies among populations; $\sigma$ is constant.
• Specific interest is in choosing the population with the largest distribution quantile value
$$(t_p)_i = \exp\left[\mu_i + \sigma \Phi^{-1}(p)\right], \quad i = 1, \ldots, k,$$
among the $k$ populations.
• With constant $\sigma$, the ordering of the $(t_p)_i$ values is the same as the ordering of the $\mu_i$ values.

Tests Comparing k Populations or Processes

• Samples of equal size $n$ are to be taken from each population.
• If interest centers on the lower tail of the distribution, there is little reason to wait for all units to fail (in a life test).
• In some situations data may be censored. For example:
  ◮ Testing all units from all populations simultaneously until a given censoring time $t_c$ results in Type I censored data.
  ◮ Testing each population (perhaps sequentially because of a limitation in test positions) until a given number of failures occurs results in Type II censored data.
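The two censoring schemes described above can be illustrated with a minimal sketch. It assumes lognormal failure times; the function names `type1_censor` and `type2_censor` and the numerical values of `mu`, `sigma`, `n`, `t_c`, and `r` are hypothetical choices made only for illustration and do not come from the slides.

```python
import numpy as np


def type1_censor(times, t_c):
    """Type I: run every unit until time t_c; survivors are right-censored at t_c."""
    observed = np.minimum(times, t_c)
    failed = times <= t_c
    return observed, failed


def type2_censor(times, r):
    """Type II: stop at the r-th failure; survivors are right-censored at that time."""
    t_r = np.sort(times)[r - 1]          # time of the r-th failure
    observed = np.minimum(times, t_r)
    failed = times <= t_r
    return observed, failed


rng = np.random.default_rng(1)
mu, sigma, n = 5.0, 0.8, 10              # hypothetical planning values
times = np.exp(mu + sigma * rng.standard_normal(n))   # lognormal failure times

obs1, failed1 = type1_censor(times, t_c=150.0)
obs2, failed2 = type2_censor(times, r=6)

# Under Type I censoring the number of failures is random;
# under Type II censoring it is fixed at r by design.
print("Type I failures: ", int(failed1.sum()), "of", n)
print("Type II failures:", int(failed2.sum()), "of", n)
```

With Type I censoring the test duration is fixed and the failure count varies from sample to sample, whereas with Type II censoring the failure count is fixed and the test duration varies.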
Tests Comparing k Populations or Processes (Continued)

• To select the population with the largest quantile, if $\sigma$ is the same in each group, choose the population with the largest $\hat{\mu}_i$.
• To plan the test, supposing that $\mu_1 > \max\{\mu_2, \ldots, \mu_k\}$, one would want an assessment of
$$\Pr(\text{CS}) = \Pr(\text{Correct Selection}) = \Pr\left(\hat{\mu}_1 > \max\{\hat{\mu}_2, \ldots, \hat{\mu}_k\}\right).$$

Comparison of 5 Location Parameters

[Figure: five location parameters $\mu_1, \ldots, \mu_5$ in general position, and the configuration $\mu_2 = \mu_3 = \mu_4 = \mu_5$ separated from $\mu_1$ by $\Delta$, with standardized difference $d = \Delta/\sigma$.]

General Formula to Compute Pr(CS) when Selecting the Population with the Largest µ

• Notation:
  $f(\hat{\mu}_1, \ldots, \hat{\mu}_k)$: joint distribution of $(\hat{\mu}_1, \ldots, \hat{\mu}_k)$
  $f(\hat{\mu}_1)$: marginal distribution of $\hat{\mu}_1$
  $f(\hat{\mu}_2, \ldots, \hat{\mu}_k \mid \hat{\mu}_1)$: conditional distribution of $(\hat{\mu}_2, \ldots, \hat{\mu}_k)$ given $\hat{\mu}_1$
• The general formula to compute Pr(CS) is
$$\Pr(\text{CS}) = \Pr\left(\hat{\mu}_2 \le \hat{\mu}_1, \ldots, \hat{\mu}_k \le \hat{\mu}_1\right) = \int_{-\infty}^{\infty} \Pr\left(\hat{\mu}_2 \le \hat{\mu}_1, \ldots, \hat{\mu}_k \le \hat{\mu}_1 \mid \hat{\mu}_1\right) f(\hat{\mu}_1)\, d\hat{\mu}_1.$$

Probability of Correct Selection when Choosing the Largest $\hat{\mu}_i$ Based on Type II Censored Data from k Populations

• With Type II censoring resulting in $r$ failures out of $n$ units from each population, and any log-location-scale distribution, there is a relatively simple expression for Pr(CS).
• For the case $\mu_1 > \mu_2 = \mu_3 = \cdots = \mu_k$, which is the least favorable situation for choosing the correct population, the probability of correct selection simplifies to
$$\Pr(\text{CS}) = \Pr\left(\hat{\mu}_1 > \max\{\hat{\mu}_2, \ldots, \hat{\mu}_k\}\right) = \Pr\left[\hat{Z}_1 > \max\{\hat{Z}_2, \ldots, \hat{Z}_k\} - d\right],$$
where $\hat{Z}_i = (\hat{\mu}_i - \mu_i)/\sigma$ and $d = (\mu_1 - \mu_2)/\sigma$.
• The joint distribution of $(\hat{Z}_1, \ldots, \hat{Z}_k)$ depends only on the definition of $\Phi$, $n$, and $r$, and can be evaluated effectively with simulation without having to specify $d$.
• Simulate with given $n$ and $r$. Plot Pr(CS) versus $d$. Repeat for different $n$ and $r$ combinations.
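The simulate-then-plot recipe above can be sketched as follows. This is only an illustration under stated assumptions: a lognormal distribution (normal on the log scale) is used, each population is fitted separately by maximum likelihood, and a generic Nelder-Mead optimizer does the fitting. The function names `mu_hat_type2`, `simulate_z`, and `pr_cs_largest`, and the small simulation size, are hypothetical choices, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def mu_hat_type2(y_obs, n):
    """ML estimate of mu from the r smallest of n normal log-times (Type II censoring)."""
    r = y_obs.size
    y_r = y_obs.max()  # r-th order statistic; the n - r survivors are censored here

    def neg_loglik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)
        failures = norm.logpdf((y_obs - mu) / sigma).sum() - r * log_sigma
        survivors = (n - r) * norm.logsf((y_r - mu) / sigma)
        return -(failures + survivors)

    start = np.array([y_obs.mean(), np.log(y_obs.std() + 1e-6)])
    fit = minimize(neg_loglik, start, method="Nelder-Mead")
    return fit.x[0]


def simulate_z(k, n, r, nsim, seed=0):
    """Simulate Z_i = (mu_hat_i - mu_i)/sigma; with mu_i = 0, sigma = 1 this is mu_hat_i."""
    rng = np.random.default_rng(seed)
    z = np.empty((nsim, k))
    for s in range(nsim):
        for i in range(k):
            y = np.sort(rng.standard_normal(n))[:r]  # first r log failure times
            z[s, i] = mu_hat_type2(y, n)
    return z


def pr_cs_largest(z, d_grid):
    """Estimate Pr[Z_1 > max(Z_2..Z_k) - d] for each d, reusing one simulated set of Z."""
    best_rival = z[:, 1:].max(axis=1)
    return np.array([(z[:, 0] > best_rival - d).mean() for d in d_grid])


# Small nsim keeps the sketch fast; a planning study would use many more replicates.
z = simulate_z(k=2, n=10, r=6, nsim=200)
d_grid = np.linspace(0.0, 2.0, 5)
print(np.round(pr_cs_largest(z, d_grid), 3))
```

Because the simulated $\hat{Z}_i$ values do not depend on $d$, one simulation per $(n, r)$ pair is enough to trace the whole Pr(CS) versus $d$ curve.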
Probability of Correct Selection when Choosing the Smallest $\hat{\mu}_i$ Based on Type II Censored Data from k Populations

• There is a similar expression for the probability of correctly selecting the smallest $\mu_i$ from a log-location-scale distribution.
• For the case $\mu_1 < \mu_2 = \mu_3 = \cdots = \mu_k$, the probability of correct selection simplifies to
$$\Pr(\text{CS}) = \Pr\left(\hat{\mu}_1 < \min\{\hat{\mu}_2, \ldots, \hat{\mu}_k\}\right) = \Pr\left[\hat{Z}_1 < \min\{\hat{Z}_2, \ldots, \hat{Z}_k\} - d\right],$$
where $\hat{Z}_i = (\hat{\mu}_i - \mu_i)/\sigma$ and $d = (\mu_1 - \mu_2)/\sigma$.
• The joint distribution of $(\hat{Z}_1, \ldots, \hat{Z}_k)$ depends only on the definition of $\Phi$, $n$, and $r$, and can be evaluated effectively with simulation without having to specify $d$.
• Simulate with given $n$ and $r$. Plot Pr(CS) versus $d$. Repeat for different $n$ and $r$ combinations.

Elicitation of a Planning Value for d

• For a log-location-scale distribution, a planning value $d^{\Box}$ for $d$ can be obtained as a function of a planning value for $\sigma$ and a specified increase in life that is of interest.
• Suppose that there is interest in detecting an increase in life of $p$ percent or more, $0 < p < 100$, and that a planning value $\sigma^{\Box}$ for $\sigma$ is available. Then:
  ◮ Equating $1 + p/100$ to the increase in life, one gets
$$1 + \frac{p}{100} = \frac{\exp\left[\mu_1 + \sigma z_{(1-\alpha)}\right]}{\exp\left[\mu_2 + \sigma z_{(1-\alpha)}\right]} = \exp(\mu_1 - \mu_2).$$
  Taking logarithms,
$$\Delta = \mu_1 - \mu_2 = \log\left(1 + \frac{p}{100}\right).$$
  ◮ Then
$$d^{\Box} = \frac{\Delta}{\sigma^{\Box}} = \frac{1}{\sigma^{\Box}} \log\left(1 + \frac{p}{100}\right).$$

Probability of Correct Selection when Choosing the Largest Weibull Characteristic Life Based on Type II Censored Data from k = 2 Populations

[Figure: Probability of Correctly Selecting the Largest Weibull Population out of 2 Populations. Pr(Correct Selection), from 0.5 to 1.0, versus Standardized Difference (d), from 0.0 to 2.0, with curves for n=40, r=24; n=20, r=12; n=10, r=6; and n=5, r=3.]

Probability of Correct Selection when Choosing the Largest Weibull Characteristic Life Based on Type II Censored Data from k = 5 Populations

[Figure: Probability of Correctly Selecting the Largest Weibull Population out of 5 Populations. Pr(Correct Selection), from 0.2 to 1.0, versus Standardized Difference (d), from 0.0 to 2.0, with curves for n=40, r=24; n=20, r=12; n=10, r=6; and n=5, r=3.]

Probability of Correct Selection when Choosing the Largest Log-Mean Based on Type I Censored Data from k Populations

• With Type I censoring, test units from all populations are tested until a prespecified censoring time $t_c$.
• The expression for Pr(CS) is the same as with Type II censoring, but the joint distribution of $(\hat{Z}_1, \ldots, \hat{Z}_k)$ depends on the definition of $\Phi$, $n$, $(\log(t_c) - \mu_1)/\sigma$, and $(\log(t_c) - \mu_2)/\sigma$.
• Pr(CS) can be evaluated with simulation, but a new simulation is needed for every different specified combination of $n$, $\mu_1$, $\mu_2$, $\sigma$, and $t_c$ (or $n$, $\mu_1/\sigma$, $\mu_2/\sigma$, and $\log(t_c)/\sigma$).
• A group may have 0 failures (if this is possible, it will often show up in the simulation).
• If the sample size $n$ in each group is large enough, the Pr(CS) versus $d$ curves for Type II censoring provide an approximation to the sample size needed for Type I censoring.

Extensions

• Results for the selection of the distribution with the smallest quantile are similar.
• Evaluations when $\sigma$ values differ among populations. Specification of population characteristics is much more complicated.
• Other censoring schemes and stopping rules.
For example, stop testing when at least $r$ units fail in each group or at time $t_c$, whichever comes first.
• Dynamic rule to allow termination of simultaneous testing when there is confidence that Pr(CS) is at least a specified value.
• Non-log-location-scale distributions (e.g., gamma).
• Bayesian approaches to allow incorporating prior information.
• Estimation of Pr(CS) after data have been collected.

Special Case: Probability of Correct Selection for Complete Samples from a Lognormal Distribution

• For complete samples from a lognormal distribution, $\hat{\mu}_i$ is the mean of the log times to failure.
• Then
$$\hat{\mu}_i \sim \text{NOR}(\mu_i, \sigma^2/n).$$
• In this case there are some simpler formulas to compute Pr(CS) and the sample size required to obtain a prespecified Pr(CS).

Special Case: Conditional Distribution of $(\hat{\mu}_2, \ldots, \hat{\mu}_k)$ for Underlying Lognormal Distributions and Complete Data

Let $\hat{\mu}_i$ denote the sample mean of the log failure times from the sample from population $i$. Because the samples are independent,
$$\Pr\left(\hat{\mu}_2 \le \hat{\mu}_1, \ldots, \hat{\mu}_k \le \hat{\mu}_1 \mid \hat{\mu}_1\right) = \prod_{i=2}^{k} \Pr\left(\hat{\mu}_i \le \hat{\mu}_1 \mid \hat{\mu}_1\right) = \prod_{i=2}^{k} \Phi_{\text{nor}}\left[\frac{\sqrt{n}\,(\hat{\mu}_1 - \mu_i)}{\sigma}\right] = \prod_{i=2}^{k} \Phi_{\text{nor}}\left(z_1 + d_i \sqrt{n}\right),$$
where $z_1 = \sqrt{n}\,(\hat{\mu}_1 - \mu_1)/\sigma$ and $d_i = (\mu_1 - \mu_i)/\sigma$.

Special Case: Probability of Correct Selection when Choosing the Largest Log-Mean with Complete Data from k Lognormal Populations

• Although we would expect all of the $\mu_i$ values to be different, a conservative (i.e., lower) value of Pr(CS) is obtained by supposing that there is a best population and that all of the $k - 1$ other populations have the same value of $\mu$ (i.e., $\mu_1 > \mu_2 = \mu_3 = \cdots = \mu_k$).
• In this case,
$$\Pr(\text{CS}) = \int_{-\infty}^{\infty} \left[\Phi_{\text{nor}}\left(z_1 + d\sqrt{n}\right)\right]^{k-1} \phi_{\text{nor}}(z_1)\, dz_1,$$
where $d = (\mu_1 - \mu_2)/\sigma$.
• In this case, Pr(CS) is easy to compute numerically.
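As an illustration of the numerical computation mentioned above, the following minimal sketch evaluates the integral for Pr(CS) with general-purpose quadrature. The function name `pr_cs_lognormal` and the example values of $d$, $n$, and $k$ are hypothetical; the use of `scipy.integrate.quad` is one convenient choice, not necessarily the authors' method.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def pr_cs_lognormal(d, n, k):
    """Pr(CS) for complete lognormal samples under mu_1 > mu_2 = ... = mu_k.

    d is the standardized difference (mu_1 - mu_2)/sigma, n the sample size
    per population, and k the number of populations.
    """
    def integrand(z1):
        return norm.cdf(z1 + d * np.sqrt(n)) ** (k - 1) * norm.pdf(z1)

    value, _abs_err = quad(integrand, -np.inf, np.inf)
    return value


# Example with hypothetical planning inputs.
for n in (5, 10, 20, 40):
    print(n, round(pr_cs_lognormal(d=0.5, n=n, k=5), 4))
```

Two checks on the formula are easy to verify by hand: with $d = 0$ every population is equally likely to be selected, so Pr(CS) $= 1/k$, and with $k = 2$ the integral reduces to $\Phi_{\text{nor}}(d\sqrt{n/2})$.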
Special Case: Sample Size to Give a Specified Probability of Correct Selection when Choosing the Largest Log-Mean with Complete Data from k Lognormal Populations

• The sample size required to obtain a probability of correct selection equal to $1 - \alpha$, when detection of a difference of $\mu_1 - \mu_2$ is important, is given by
$$n = c^2/d^2,$$
where $d = (\mu_1 - \mu_2)/\sigma$ and $c$ is the solution to the equation
$$1 - \alpha = \int_{-\infty}^{\infty} \left[\Phi_{\text{nor}}(z_1 + c)\right]^{k-1} \phi_{\text{nor}}(z_1)\, dz_1.$$
• To identify small differences one needs large samples; to identify large differences, smaller samples are sufficient.
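The sample-size formula above can be sketched numerically as follows. The sketch solves the integral equation for $c$ by root finding and then applies $n = c^2/d^2$; rounding $n$ up to the next integer is a convention added here, and the function names, the root-finding bracket, and the planning inputs (a 50 percent increase in life and a planning value of 0.5 for $\sigma$) are hypothetical assumptions for illustration.

```python
import math

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm


def pr_cs_given_c(c, k):
    """Right-hand side of the equation for c: integral of Phi(z + c)^(k-1) * phi(z)."""
    value, _abs_err = quad(lambda z: norm.cdf(z + c) ** (k - 1) * norm.pdf(z),
                           -np.inf, np.inf)
    return value


def solve_c(target, k):
    """Find c with pr_cs_given_c(c, k) = target; requires 1/k < target < 1."""
    return brentq(lambda c: pr_cs_given_c(c, k) - target, 0.0, 20.0)


def sample_size(target, k, d):
    """n = c^2 / d^2, rounded up (the rounding is an added convention)."""
    c = solve_c(target, k)
    return math.ceil(c ** 2 / d ** 2)


# Hypothetical planning inputs: detect a 50% increase in life, sigma planning value 0.5.
p, sigma_plan = 50.0, 0.5
d_plan = math.log(1.0 + p / 100.0) / sigma_plan   # d = log(1 + p/100) / sigma

for k in (2, 5):
    print(k, sample_size(target=0.95, k=k, d=d_plan))
```

For $k = 2$ the equation has the closed-form solution $c = \sqrt{2}\, z_{(1-\alpha)}$, which gives a convenient check on the root finder.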