Design of Engineering Experiments
Chapter 2 – Some Basic Statistical Concepts

• Describing sample data
  – Random samples
  – Sample mean, variance, standard deviation
  – Populations versus samples
  – Population mean, variance, standard deviation
  – Estimating parameters
• Simple comparative experiments
  – The hypothesis testing framework
  – The two-sample t-test
  – Checking assumptions, validity

Chapter 2, Design & Analysis of Experiments 8E, 2012, Montgomery

2.1 Introduction
• Consider experiments to compare two conditions
• Simple comparative experiments
• Example:
  – The strength of portland cement mortar
  – Two different formulations: modified vs. unmodified
  – Collect 10 observations for each formulation
  – Formulations = treatments (levels)

Portland Cement Formulation (page 26)

2.2 Basic Statistical Concepts
• Run = each observation in the experiment
• Error = random variable
• Graphical description of variability
  – Dot diagram: the general location or central tendency of the observations
  – Histogram: central tendency, spread, and general shape of the distribution of the data (Fig. 2-2)

Graphical View of the Data – Dot Diagram, Fig. 2.1, p. 26

If you have a large sample, a histogram may be useful

Box Plots, Fig. 2.3, pp.
28

• Probability distributions
• Mean, variance and expected values

2.3 Sampling and Sampling Distributions
• Random sampling
• Statistic: any function of the observations in a sample that does not contain unknown parameters
• Sample mean and sample variance
• Properties of the sample mean and sample variance
  – Estimator and estimate
  – Unbiased and minimum variance

  ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ,  SS = Σᵢ₌₁ⁿ (yᵢ − ȳ)²,  S² = SS/(n − 1)

Parameter, Statistic and Random Samples
• The random variables X₁, X₂, …, Xₙ are said to form a (simple) random sample of size n if the Xᵢ are independent random variables and each Xᵢ has the same probability distribution. We say that the Xᵢ are iid.
• A parameter is a number that describes the population. It is a fixed number, but in practice we do not know its value.
• A statistic is a function of the sample data that does not contain unknown parameters. It is a random variable with a distribution function. Statistics are used to make inferences about unknown population parameters.

Example – Sample Mean and Variance
• Suppose X₁, X₂, …, Xₙ is a random sample of size n from a population with mean μ and variance σ².
• The sample mean is defined as X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ.
• The sample variance is defined as S² = (1/(n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)².

• Degrees of freedom:
  – A sum of squares SS has v degrees of freedom if E(SS/v) = σ²
  – The number of independent elements in the sum of squares
• The normal and other sampling distributions:
  – Sampling distribution
  – Normal distribution: the Central Limit Theorem
  – Chi-square distribution: the distribution of SS
  – t distribution
  – F distribution

The Central Limit Theorem (CLT)
• Let X₁, X₂, … be a sequence of iid random variables with E(Xᵢ) = μ < ∞ and Var(Xᵢ) = σ² < ∞. Let Sₙ = Σᵢ₌₁ⁿ Xᵢ. Then
  Zₙ = (Sₙ − nμ)/(σ√n)
converges in distribution to Z ~ N(0, 1) as n → ∞, where Z is a standard normal random variable. Equivalently,
  Zₙ = (X̄ₙ − μ)/(σ/√n)
converges in distribution to Z ~ N(0, 1).
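The sample statistics just defined and the CLT statement can both be illustrated with a short stdlib-only sketch. This is a rough illustration, not from the text: the die-rolling setup (anticipating the example on the next slide), the seed, and the sample sizes are all arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)  # reproducible illustration

# --- Sample mean and variance, following ybar = (1/n)*sum(y_i),
#     SS = sum((y_i - ybar)^2), S^2 = SS/(n - 1) ---
y = [random.randint(1, 6) for _ in range(30)]  # 30 rolls of a fair die
n = len(y)
ybar = sum(y) / n
ss = sum((yi - ybar) ** 2 for yi in y)
s2 = ss / (n - 1)

# Cross-check the hand-rolled formulas against the standard library
assert math.isclose(ybar, statistics.mean(y))
assert math.isclose(s2, statistics.variance(y))

# --- CLT: sample means of die rolls are approximately
#     N(3.5, (35/12)/n), since a fair die has mu = 3.5, sigma^2 = 35/12 ---
reps = 2000
means = [statistics.mean(random.randint(1, 6) for _ in range(n))
         for _ in range(reps)]
print(statistics.mean(means))   # close to mu = 3.5
print(statistics.stdev(means))  # close to sigma/sqrt(n) = sqrt((35/12)/30) ≈ 0.31
```

With 2000 replicates the histogram of `means` would already look quite bell-shaped, which is exactly the die-rolling picture on the next slide.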
An Example of the CLT
• The picture shows one of the simplest illustrations: rolling a fair die.
• The more times you roll the die, the more the shape of the distribution of the sample means tends to look like a normal distribution.

Recall – The Chi-Square Distribution
• If Z ~ N(0, 1), then X = Z² has a chi-square distribution with one degree of freedom, i.e., X ~ χ²₁.
• If X₁ ~ χ²ᵥ₁, X₂ ~ χ²ᵥ₂, …, X_k ~ χ²ᵥₖ are all independent, then
  T = Σᵢ₌₁ᵏ Xᵢ ~ χ² with Σᵢ₌₁ᵏ vᵢ degrees of freedom

Recall – The t Distribution
• Suppose Z ~ N(0, 1) independent of X ~ χ²ᵥ. Then
  T = Z/√(X/v) ~ tᵥ
• Suppose X₁, X₂, …, Xₙ are iid normal random variables with mean μ and variance σ². Then
  (X̄ − μ)/(S/√n) ~ t_{n−1}

The F Distribution
• Suppose X ~ χ²ₙ independent of Y ~ χ²ₘ. Then
  (X/n)/(Y/m) ~ F_{n,m}

2.4 Inferences about the Differences in Means, Randomized Designs
• Use hypothesis testing and confidence interval procedures for comparing two treatment means.
• Assume a completely randomized experimental design is used (a random sample from a normal distribution).

The Hypothesis Testing Framework
• Statistical hypothesis testing is a useful framework for many experimental situations
• Origins of the methodology date from the early 1900s
• We will use a procedure known as the two-sample t-test

Definitions
• Statistical hypothesis
  – An assertion about the distribution of one or more random variables.
It is denoted by H₀ and H₁ (or Hₐ).
• Statistical test
  – A procedure, based on the observed values of the random variables, that leads to the acceptance or rejection of H₀
• Significance level
  – The probability of rejecting H₀ when it is true
• Critical region
  – The set of points in the sample space that leads to the rejection of H₀

2.4.1 Hypothesis Testing
• Compare the strength of two different formulations: unmodified vs. modified
• Two levels of the factor
• yᵢⱼ: the jth observation from the ith factor level, i = 1, 2 and j = 1, 2, …, nᵢ

• Model: yᵢⱼ = μᵢ + εᵢⱼ
• yᵢⱼ ~ N(μᵢ, σᵢ²)
• Statistical hypotheses:
  H₀: μ₁ = μ₂
  H₁: μ₁ ≠ μ₂
• Test statistic, critical region (rejection region)
• Type I error, Type II error and power

The Two-Sample t-Test
  H₀: μ₁ = μ₂
  H₁: μ₁ ≠ μ₂
• ȳ₁ − ȳ₂ ~ N(μ₁ − μ₂, σ₁²/n₁ + σ₂²/n₂)
• If σ₁² = σ₂² = σ² and σ² is known, then the test statistic is
  Z₀ = (ȳ₁ − ȳ₂)/(σ√(1/n₁ + 1/n₂))

Type I Error, Type II Error and Power
https://www.geogebra.org/m/e8Usa8Pp

Determining Power
• The power of a test is the probability of rejecting the null hypothesis when the alternative hypothesis is in fact true:
  Power = P(reject H₀ | Hₐ is true) = 1 − β

Example
• Let X₁, X₂, …
, Xₙ ~ N(μ, 36) (so σ² = 36)
• We want to test
  – H₀: μ = 50
  – Hₐ: μ = 55, with critical region C = {x̄ : x̄ > 53}
• Find α and 1 − β (the power of the test) for n = 16

How the Two-Sample t-Test Works
• Use the sample means to draw inferences about the population means:
  ȳ₁ − ȳ₂ = 16.76 − 17.04 = −0.28 (difference in sample means)
• Standard deviation of the difference in sample means:
  σ²ȳ = σ²/n, and σ²_{ȳ₁−ȳ₂} = σ₁²/n₁ + σ₂²/n₂, with ȳ₁ and ȳ₂ independent
• This suggests a statistic:
  Z₀ = (ȳ₁ − ȳ₂)/√(σ₁²/n₁ + σ₂²/n₂)
• If the variances were known, we could use the normal distribution as the basis of a test
• Z₀ has a N(0, 1) distribution if the two population means are equal

If we knew the two variances, how would we use Z₀ to test H₀?
• Suppose that σ₁ = σ₂ = 0.30. Then we can calculate
  Z₀ = (ȳ₁ − ȳ₂)/√(σ₁²/n₁ + σ₂²/n₂) = −0.28/√(0.30²/10 + 0.30²/10) = −0.28/0.1342 = −2.09
• How "unusual" is the value Z₀ = −2.09 if the two population means are equal? It turns out that 95% of the area under the standard normal curve (probability) falls between Z₀.₀₂₅ = 1.96 and −Z₀.₀₂₅ = −1.96. So the value Z₀ = −2.09 is pretty unusual, in that it would happen less than 5% of the time if the population means were equal.

Standard Normal Table (see appendix)
  Z₀.₀₂₅ = 1.96

• So if the variances were known, we would reject the null hypothesis at the 5% level of significance
  H₀: μ₁ = μ₂
  H₁: μ₁ ≠ μ₂
and conclude that the alternative hypothesis is true.
• This is called a fixed significance level test, because we compare the value of the test statistic to a critical value (1.96) selected in advance, before running the experiment.
• The standard normal distribution is the reference distribution for the test.
• Another very popular way to do this is the P-value approach.
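The Z₀ calculation and its two-sided P-value are easy to reproduce with the standard normal CDF built from `math.erf`. A minimal sketch, assuming the slide's values (ȳ₁ = 16.76, ȳ₂ = 17.04, σ₁ = σ₂ = 0.30 treated as known, n₁ = n₂ = 10):

```python
import math

def phi(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

y1bar, y2bar = 16.76, 17.04
sigma1 = sigma2 = 0.30   # assumed known, as on the slide
n1 = n2 = 10

# Z0 = (ybar1 - ybar2) / sqrt(sigma1^2/n1 + sigma2^2/n2)
z0 = (y1bar - y2bar) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Two-sided P-value: probability beyond |Z0| in both tails
p_value = 2 * phi(-abs(z0))

print(round(z0, 2))  # -2.09
print(p_value)       # about 0.037; the table-based value on the slide is 0.03662
```

The small gap between the computed P-value and the slide's 0.03662 comes from the table rounding Z₀ to two decimals before looking up the tail probability.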
The P-value can be thought of as the observed significance level. For the Z-test it is easy to find the P-value.

Normal Table
• Find the probability above Z₀ = −2.09 from the table: 1 − 0.98169 = 0.01831.
• The P-value is twice this probability, or 0.03662. So we would reject the null hypothesis at any level of significance greater than or equal to 0.03662. Typically 0.05 is used as the cutoff.

The t-Test
• The Z-test just described would work perfectly if we knew the two population variances.
• Since we usually don't know the true population variances, what would happen if we just plugged in the sample variances?
• The answer is that if the sample sizes were large enough (say, both n > 30 or 40), the Z-test would work just fine. It is a good large-sample test for the difference in means.
• But many times that isn't possible (as Gosset noted in 1908: what if the sample size is small?).
• It turns out that if the sample size is small we can no longer use the N(0, 1) distribution as the reference distribution for the test.

Estimation of Parameters
  ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ estimates the population mean μ
  S² = (1/(n − 1)) Σᵢ₌₁ⁿ (yᵢ − ȳ)² estimates the variance σ²

Summary Statistics (pg.
38)
  Modified Mortar ("New recipe"):       ȳ₁ = 16.76, S₁² = 0.100, S₁ = 0.316, n₁ = 10
  Unmodified Mortar ("Original recipe"): ȳ₂ = 17.04, S₂² = 0.061, S₂ = 0.248, n₂ = 10

How the Two-Sample t-Test Works
• Use S₁² and S₂² to estimate σ₁² and σ₂². The previous ratio becomes
  (ȳ₁ − ȳ₂)/√(S₁²/n₁ + S₂²/n₂)
• However, we have the case where σ₁² = σ₂² = σ². Pool the individual sample variances:
  Sp² = [(n₁ − 1)S₁² + (n₂ − 1)S₂²]/(n₁ + n₂ − 2)

How the Two-Sample t-Test Works
• The test statistic is
  t₀ = (ȳ₁ − ȳ₂)/(Sp√(1/n₁ + 1/n₂))
• Values of t₀ that are near zero are consistent with the null hypothesis
• Values of t₀ that are very different from zero are consistent with the alternative hypothesis
• t₀ is a "distance" measure: how far apart the averages are, expressed in standard deviation units
• Notice the interpretation of t₀ as a signal-to-noise ratio

The Two-Sample (Pooled) t-Test
  Sp² = [(n₁ − 1)S₁² + (n₂ − 1)S₂²]/(n₁ + n₂ − 2) = [9(0.100) + 9(0.061)]/(10 + 10 − 2) = 0.081
  Sp = 0.284
  t₀ = (ȳ₁ − ȳ₂)/(Sp√(1/n₁ + 1/n₂)) = (16.76 − 17.04)/(0.284√(1/10 + 1/10)) = −2.20
• The two sample means are a little over two standard deviations apart
• Is this a "large" difference?

The Two-Sample (Pooled) t-Test
• We need an objective basis for deciding how large the test statistic t₀ really is
• In 1908, W. S.
Gosset derived the reference distribution for t₀, called the t distribution
• Tables of the t distribution – see textbook appendix, page 614
  t₀ = −2.20

The Two-Sample (Pooled) t-Test
• A value of t₀ between −2.101 and 2.101 is consistent with equality of means
• It is possible for the means to be equal and t₀ to exceed either 2.101 or −2.101, but it would be a "rare event" … which leads to the conclusion that the means are different
• Could also use the P-value approach
  t₀ = −2.20

The Two-Sample (Pooled) t-Test
  t₀ = −2.20
• The P-value is the area (probability) in the tails of the t distribution beyond −2.20, plus the probability beyond +2.20 (it's a two-sided test)
• The P-value is the probability of obtaining test results at least as extreme as the result actually observed, under the assumption that the null hypothesis is true (tail probability under H₀)
• The P-value is the risk of wrongly rejecting the null hypothesis of equal means (it measures the rareness of the event)
• The exact P-value in our problem is P = 0.042 (found from a computer)

Approximating the P-value
• Our t-table only gives probabilities greater than positive values of t, so take the absolute value of t₀ = −2.20, i.e., |t₀| = 2.20.
• Now, with 18 degrees of freedom, find the values of t in the table that bracket this value: 2.101 < |t₀| = 2.20 < 2.552.
• The right-tail probability for t = 2.101 is 0.025, and for t = 2.552 it is 0.01. Double these probabilities because this is a two-sided test.
• Therefore the P-value must lie between these two values: 0.02 < P-value < 0.05. These are lower and upper bounds on the P-value.
• We know that the actual P-value is 0.042.
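The pooled-variance and test-statistic arithmetic above can be replicated in a few lines. A sketch using the summary statistics from the slides; stdlib only, so the exact t-distribution P-value is left to tables or software, and the critical value 2.101 for 18 degrees of freedom is taken from the slides:

```python
import math

# Summary statistics for the portland cement data (from the slides)
y1bar, s1_sq, n1 = 16.76, 0.100, 10
y2bar, s2_sq, n2 = 17.04, 0.061, 10

# Pooled variance: Sp^2 = [(n1-1)S1^2 + (n2-1)S2^2] / (n1 + n2 - 2)
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
sp = math.sqrt(sp_sq)

# Test statistic: t0 = (ybar1 - ybar2) / (Sp * sqrt(1/n1 + 1/n2))
t0 = (y1bar - y2bar) / (sp * math.sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(round(sp, 3))  # 0.284
print(t0)            # about -2.2; the slides' rounding of Sp gives -2.20

# 95% confidence interval on mu1 - mu2, using t_{0.025,18} = 2.101
t_crit = 2.101
half_width = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
lo = (y1bar - y2bar) - half_width
hi = (y1bar - y2bar) + half_width
print(round(lo, 2), round(hi, 2))  # -0.55 -0.01
```

Since the interval excludes zero, it agrees with the test's conclusion at the 5% level that the two formulation means differ.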
Computer Two-Sample t-Test Results

Checking Assumptions – The Normal Probability Plot
• Checking assumptions in the t-test:
  – Equal-variance assumption
  – Normality assumption
• Normal probability plot: y₍ⱼ₎ vs. (j − 0.5)/n
  [Figure: normal probability plots of the tension bond strength data (ML estimates), Form 1 and Form 2; Anderson–Darling goodness of fit AD* = 1.209 and 1.387]
• Estimating the mean and standard deviation from a normal probability plot:
  – Mean: the 50th percentile
  – Standard deviation: the difference between the 84th and 50th percentiles
• Transformations

Importance of the t-Test
• Provides an objective framework for simple comparative experiments
• Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one "side" of the cube versus the mean response at the opposite "side" of the cube

2.4.3 Confidence Intervals
• Hypothesis testing gives an objective statement concerning the difference in means, but it doesn't specify "how different" they are
• General form of a confidence interval: L ≤ θ ≤ U, where P(L ≤ θ ≤ U) = 1 − α
• The 100(1 − α)% confidence interval on the difference in two means:
  ȳ₁ − ȳ₂ − t_{α/2, n₁+n₂−2} Sp√(1/n₁ + 1/n₂) ≤ μ₁ − μ₂ ≤ ȳ₁ − ȳ₂ + t_{α/2, n₁+n₂−2} Sp√(1/n₁ + 1/n₂)

Sample Size Determination
• How accurate do you want your estimate to be?
  – When we assume equal sample sizes of n, a confidence interval for μ₁ − μ₂ is given by
    Ȳ₁ − Ȳ₂ ± t(1 − α/2; df) · s · √(2/n),
    where n is the sample size for each group, df = n + n − 2 = 2(n − 1), and s is the pooled standard deviation.
Since in practice we don't know what s will be prior to collecting the data, we will need a guesstimate of σ to substitute into this equation.

Power vs. Sample Size

What if the Two Variances are Different?

2.5 Inferences about the Differences in Means, Paired Comparison Designs
• Example: two different tips for a hardness testing machine
• 20 metal specimens
• Completely randomized design (10 for tip 1 and 10 for tip 2)
• Lack of homogeneity between specimens
• An alternative experimental design: 10 specimens, dividing each specimen into two parts

Example: Two Different Tips for a Hardness Testing Machine
  [Figure: testing machine and specimen]

• The statistical model:
  yᵢⱼ = μᵢ + βⱼ + εᵢⱼ,  i = 1, 2,  j = 1, 2, …, 10
  – μᵢ is the true mean hardness of the ith tip,
  – βⱼ is an effect due to the jth specimen,
  – εᵢⱼ is a random error with mean zero and variance σᵢ²
• The difference within the jth specimen: dⱼ = y₁ⱼ − y₂ⱼ
• The expected value of this difference is
  μ_d = E(dⱼ) = E(y₁ⱼ − y₂ⱼ) = μ₁ − μ₂

• Testing μ₁ = μ₂ is equivalent to testing μ_d = 0
• The test statistic for H₀: μ_d = 0 vs. H₁: μ_d ≠ 0 is
  t₀ = d̄/(S_d/√n)
• Under H₀, t₀ ~ t_{n−1} (the paired t-test)
• Paired comparison design
• Block (the metal specimens)
• Several points:
  – Only n − 1 degrees of freedom (2n observations)
  – Reduces the variance and narrows the C.I. (the noise-reduction property of blocking)

2.6 Inferences about the Variances of Normal Distributions
• Test the variances
• Normal distribution
• Hypothesis: H₀: σ² = σ₀² vs.
H₁: σ² ≠ σ₀²
• The test statistic is
  χ₀² = SS/σ₀² = (n − 1)S²/σ₀²
• Under H₀, χ₀² ~ χ²_{n−1}
• The 100(1 − α)% C.I.:
  (n − 1)S²/χ²_{α/2, n−1} ≤ σ² ≤ (n − 1)S²/χ²_{1−α/2, n−1}

• Hypothesis: H₀: σ₁² = σ₂² vs. H₁: σ₁² ≠ σ₂²
• The test statistic is F₀ = S₁²/S₂², and under H₀, F₀ = S₁²/S₂² ~ F_{n₁−1, n₂−1}
• The 100(1 − α)% C.I.:
  (S₁²/S₂²) F_{1−α/2, n₂−1, n₁−1} ≤ σ₁²/σ₂² ≤ (S₁²/S₂²) F_{α/2, n₂−1, n₁−1}
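The paired t-test of Section 2.5 can be sketched as follows. The hardness differences below are hypothetical, invented purely to show the mechanics; they are not the textbook data.

```python
import math
import statistics

# Hypothetical within-specimen differences d_j = y_1j - y_2j
# (one difference per specimen; NOT the textbook hardness data)
d = [-0.1, 0.2, 0.1, 0.0, -0.2, 0.3, 0.1, 0.2, -0.1, 0.1]
n = len(d)

dbar = statistics.mean(d)  # estimates mu_d = mu_1 - mu_2
sd = statistics.stdev(d)   # sample standard deviation of the differences

# Paired t statistic: t0 = dbar / (S_d / sqrt(n)), with n - 1 df under H0
t0 = dbar / (sd / math.sqrt(n))
df = n - 1

print(round(dbar, 2), round(t0, 2), df)
```

Note that blocking on specimens leaves only n − 1 = 9 degrees of freedom from 2n = 20 observations, the price paid for removing specimen-to-specimen variability from the error term.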