Hypothesis Testing

DISTRIBUTIONS AND HYPOTHESIS TESTING

Area Under the Normal Curve

[Figure: Normal curve for the example below, with the specification 35.0 ± 2.5 indicated. Horizontal axis: 30.4 (−3σ), 32.6 (−2σ), 34.8 (−σ), 37 (μ), 39.2 (+σ), 41.4 (+2σ), 43.6 (+3σ).]

The Normal curve is symmetric about the mean, so the area under the curve may be determined from a table that has entries for only half of the values. The Standard Normal Distribution has a mean of 0 and a standard deviation of 1, and the total area under the curve is exactly 1. To adapt the table to a generic situation, we make a change of variable, defined as:

$$ z = \frac{x - \mu}{\sigma} $$

where x is the point in the interval, μ is the distribution mean, and σ is the distribution standard deviation. With this change, z measures how many standard deviation units the x-value lies from the mean. The table lists the area Φ(z) under the normal curve from −∞ to z, so we look up the value of Φ corresponding to the computed z. [Note: Due to symmetry, Φ(−z) = 1 − Φ(z)]

Example: A process output is Normally distributed, with a mean of 37.0 and a standard deviation of 2.2. The specification calls for a value of 35.0 ± 2.5. Estimate the proportion of the process output that will meet specifications.

Solution:
1. Convert the problem to a Standard Normal Distribution and find the area under the Standard Normal Curve from −∞ to the upper specification limit.
2. Convert the problem to a Standard Normal Distribution and find the area under the Standard Normal Curve from −∞ to the lower specification limit.
3. Subtract (2.) from (1.) to get the process yield.

Step 1.) μ = 37, σ = 2.2, x = 35.0 + 2.5 = 37.5

$$ z = \frac{37.5 - 37.0}{2.2} = 0.23, \qquad \Phi(0.23) = 0.5910 $$

Step 2.) μ = 37, σ = 2.2, x = 35.0 − 2.5 = 32.5

$$ z = \frac{32.5 - 37.0}{2.2} = -2.05, \qquad \Phi(-2.05) = 1 - \Phi(2.05) = 1 - 0.9798 = 0.0202 $$

Step 3.)

$$ 0.5910 - 0.0202 = 0.5708 \approx 57\% $$

Hypothesis Testing Terms

Statistic is an estimate of a parameter, or a calculation about or from a distribution.
A distribution is a mathematical function that captures the frequency (or probability) with which a given situation will occur.

Population is a term that refers to the complete set of observations. If we have the complete set (or even a very large set of observations), then our estimates (statistics) should be very accurate. We have special identifiers for these estimates: μ is the symbol for the population mean; σ² is the symbol for the population variance (σ is for the population standard deviation); and N represents the population size, or the number of elements in the population.

Sample is a term that refers to a subset of the population, selected at random (meaning every element has an equal chance of being selected) and selected independently (meaning none of our prior selections biased our current selection). We also have special identifiers for the statistics estimated from samples: x̄ is the symbol for the sample mean; S² is the symbol for the sample variance (S is the symbol for the sample standard deviation); and n is the sample size.

Hypothesis is a guess about a situation that can be tested and can be either true or false.

The Null Hypothesis has the symbol H0 and is always the default situation, which is kept unless the evidence contradicts it beyond a reasonable doubt.

The Alternative Hypothesis is denoted by the symbol HA (H1 in some texts) and can be thought of as the opposite of the Null Hypothesis. It can also be either true or false, but it is always false when H0 is true and vice versa.

There are two types of comparisons, and three forms for the alternative hypothesis, illustrated in Figure 1 (below). The first type of HA says that there is a difference, but the direction of the difference is not known (or needed). In this case, it is a two-sided comparison, and the Null Hypothesis is rejected if the test statistic falls in either of the two tail regions, each of which represents 1/2 of the rejection criteria area (see Fig. 1(b)). Otherwise, the Null Hypothesis is kept.
The second type of HA says that there is a difference, and the direction of the difference is anticipated (or required). In this case, there is a one-sided comparison, and the Null Hypothesis is rejected if the test statistic falls in the single tail region representing the entire rejection criteria area. If a sample mean must be greater than the other mean (Form 1), the right tail region is used (Fig. 1(c)), and the value of the test statistic must exceed the criterion value. If the situation is reversed, and the sample mean must be less than the other mean (Form 2), then the left tail region is used (Fig. 1(a)), and the value of the test statistic must be less than the criterion value. Only one of these last two forms is hypothesized and tested; otherwise, the two-sided comparison must be used.

Figure 1. Hypothesis Test Rejection Criteria
(a) One-Sided Test: Statistic < Rejection Criterion (left tail)
(b) Two-Sided Test: Statistic < −½ Rejection Criterion or Statistic > +½ Rejection Criterion (both tails)
(c) One-Sided Test: Statistic > Rejection Criterion (right tail)

Type I Errors occur when a test statistic leads us to reject the Null Hypothesis when the Null Hypothesis is true in reality. The chance of making a Type I Error is denoted by the parameter α (the level of significance), which quantifies the reasonable doubt.

Type II Errors occur when a test statistic leads us to fail to reject the Null Hypothesis when the Null Hypothesis is false in reality. The probability of making a Type II Error is denoted by the parameter β.

Figure 2 depicts the errors associated with hypothesis testing. Depending on the situation, we can knowingly adjust our tests so that we control the risks of these types of errors to a reasonable extent.

H0 is the Null Hypothesis (always the default); HA is the Alternative Hypothesis.
H0: There is NO significant difference at the α level.
HA: There IS a significant difference at the α level.
Common α levels: .10, .05, .01

P-value: The smallest level of significance that would lead one to reject H0. It is possible to report the P-value for the test and let the user decide if the significance is acceptable (rather than use a set value).

                                   REALITY
TEST RESULT        H0 is True                      H0 is False
H0 is True         OK                              TYPE II ERROR (β-risk)
                                                   (Keep H0 when H0 is False)
                                                   Consumer's Risk
H0 is False        TYPE I ERROR (α-risk)           OK
                   (Reject H0 when H0 is True)
                   Producer's Risk

Figure 2. Hypothesis Testing & Error Types

Sampling Distributions

It can be difficult to associate an exact distribution with every possible type of event that we might be interested in predicting. Fortunately, most of the comparison hypotheses that we are interested in making can be fit to four well-known functions. These functions are referred to as sampling distributions: (1) the Normal Distribution; (2) the t-Distribution; (3) the Chi-Squared (χ²) Distribution; and (4) the F-Distribution. One of the factors that controls how well these sampling distributions approximate the actual distribution is the sample size, or a statistic calculated from the sample size called the degrees of freedom (symbol ν).

Comparison of Means

One of the first important types of comparison is the comparison of the location of two distributions. When we do this, we take the difference in the mean values of the two distributions and check whether the magnitude of that difference is sufficiently large relative to the amount of variation in the distributions. This is illustrated in Figure 3, below.

[Figure 3 panels: Definitely Different; Probably Different; Probably NOT Different; Definitely NOT Different]

Figure 3. Comparison of Means, Illustrated.
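As a minimal sketch of "difference in means relative to the amount of variation", the snippet below computes the non-pooled two-sample t statistic, $t_0 = (\bar{x}_1 - \bar{x}_2) / \sqrt{S_1^2/n_1 + S_2^2/n_2}$, together with the usual Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the sample values are hypothetical, chosen only for illustration; the critical value would still have to be read from a t table.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_1, sample_2):
    """Return (t0, nu) for two independent samples with non-pooled variances.

    Assumption: both samples are random, independent, and roughly Normal.
    """
    n1, n2 = len(sample_1), len(sample_2)
    # Variance of each sample mean: S^2 / n (statistics.variance divides by n - 1).
    v1 = variance(sample_1) / n1
    v2 = variance(sample_2) / n2
    # Distance between the means, expressed in units of their combined spread.
    t0 = (mean(sample_1) - mean(sample_2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0, nu


# Illustrative data only (not taken from the notes).
t0, nu = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t0 = {t0:.3f}, degrees of freedom = {nu:.1f}")
```

A large |t0| relative to the tabulated t value for ν degrees of freedom points toward the "Definitely Different" end of the scale; a value near zero points toward "Definitely NOT Different".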
Remembering that there will be a little variation in any sample taken from a distribution, the idea when comparing means is to express the distance between the two means in terms of the spread, and then see how likely we would be to observe a difference of that magnitude if there really were no difference between the two distributions (the Null Hypothesis). If the probability of a difference of that magnitude is sufficiently small, then we reject the Null Hypothesis and accept the Alternative Hypothesis instead.

Assuming that the mean values from both distributions are normally distributed, we could always use a form of the t-test to perform the comparison. However, if we know a little more about the distributions, we can be far more efficient with the data and more effective in detecting a true difference. Figure 5 shows a decision chart for two-way comparisons. The top portion of the chart shows the different ways that we could perform the test, based on our knowledge of or prior experience with the processes.

Comparison of Variances

A second important type of comparison arises when a difference in the spread of two distributions is suspected. To make this comparison, we compute the ratio of the two variances and then compare the ratio to one of two known distributions (Chi-Squared or F) to check whether the magnitude of that ratio is sufficiently unlikely for the distribution. This is illustrated in Figure 4, below.

[Figure 4 panels: Definitely Different; Probably Different; Probably NOT Different; Definitely NOT Different]

Figure 4. Comparison of Variances, Illustrated.

Again, as with our comparison of means, more experience with at least one of the processes lets us be more efficient in testing whether there really is no difference between the two distributions (the Null Hypothesis). If the probability of a ratio of that magnitude is sufficiently small, then we reject the Null Hypothesis and accept the Alternative Hypothesis instead.
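The variance-ratio idea can be sketched in a few lines of Python for the case where both variances are estimated from samples (the F-test case). This is a sketch under stated assumptions, not a full test: the function name `variance_ratio` and the data are hypothetical, and the critical value is typed in from a standard F table (α = 0.05 with 5 and 5 degrees of freedom) rather than computed.

```python
from statistics import variance


def variance_ratio(sample_1, sample_2):
    """Return (F0, df1, df2), where F0 = S1^2 / S2^2.

    Assumption: both samples come from Normally distributed populations.
    """
    s1_sq = variance(sample_1)  # sample variance, n - 1 in the denominator
    s2_sq = variance(sample_2)
    return s1_sq / s2_sq, len(sample_1) - 1, len(sample_2) - 1


# Illustrative data only (not taken from the notes).
process_a = [36.1, 38.4, 35.2, 39.0, 37.3, 34.9]
process_b = [36.8, 37.1, 37.4, 36.9, 37.0, 37.3]

f0, df1, df2 = variance_ratio(process_a, process_b)

# Assumed table lookup: upper 5% point of F with (5, 5) degrees of freedom.
F_CRITICAL = 5.05

# One-sided HA (variance 1 greater than variance 2): reject H0 when F0 exceeds
# the tabulated value.
reject_h0 = f0 > F_CRITICAL
print(f"F0 = {f0:.2f} with ({df1}, {df2}) degrees of freedom; reject H0: {reject_h0}")
```

Placing the sample expected to have the larger variance in the numerator keeps the comparison in the right tail, which is the tail most printed F tables cover.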
The bottom half of Figure 5 shows the decision criteria for two-way variance comparisons. For these last two tests, however, the assumption that the data come from normally distributed populations is very important. In fact, we should assess how normally our data are distributed prior to conducting either test.

Testing for a Normal Distribution

The idea behind the Normal Probability Plot is that if we construct a scatter plot of the sorted data values against the probability of encountering those data values, the plot will look roughly linear if the true distribution is Normal. Further, if two distributions are plotted on the same graph, the slope of each data set is determined by its standard deviation (the slope is 1/σ when zj is plotted against the data values), so parallel plots should have similar variances. Normal probability plots may be done on special paper, may be done manually, or can easily be done with Microsoft's Excel spreadsheet (an example of such a spreadsheet is available on disk). The construction and evaluation steps are outlined below:

1. Record the raw data and count the observations (obtaining n)
2. Set up a column of index values j, running from 1 to n
3. Compute Φ(zj) for each j value: Φ(zj) = (j − 0.5)/n
4. Obtain a zj value for each Φ(zj) from the Standard Normal Table (find the entry Φ(zj) in the body of the table, then read off the index value zj), or use the Excel NORMINV() function with a mean of 0 and a standard deviation of 1
5. Set up a column of observed data, sorted into increasing values
6. Construct a scatter plot of zj values versus sorted data values
7. Approximate the slope with a line sketched through the 25% and 75% data points

Evaluating a Normal Probability Plot

1. Assess the Equal-Variance and Normality assumptions:
   - Data from a Normal sample should tend to fall along the line, so if a "fat pencil" covers almost all of the points, then a normality assumption is supported
   - The slope of the line reflects the variance of the sample, so equal slopes support the equal variance assumption
2. Theoretically, a sketched line should intercept the zj = 0 axis at the mean value, so if the two plots intersect at that point, they should have the same mean
3. Practically:
   - Close is good enough for comparing means
   - Closer is better for comparing variances
4. If the slopes differ much for two samples, use a test that assumes the variances are not the same

Summary of Hypothesis Testing Steps

1. State the null hypothesis (H0) that the parameter of interest equals its hypothesized value: θ = θ0.
2. Choose an alternative hypothesis HA from one of the alternatives: θ < θ0, θ > θ0, or θ ≠ θ0.
3. Choose a significance level for the test (α).
4. Select the appropriate test statistic and establish a critical region. (If the decision is to be based on a P-value, it is not necessary to state the critical region.)
5. Compute the value of the test statistic from the sample data.
6. Decision: Reject H0 if the test statistic has a value in the critical region (or if the computed P-value is less than or equal to the desired significance level α); otherwise, do not reject H0.

Figure 5. Decision Chart for Hypothesis Testing.

In the chart, δ0 is the hypothesized difference under H0 (zero when testing for no difference).

Kind of comparison?

  Means (z or t): What is known about the variation?

    Population variance (σ²) is known (use z-test): Is the population mean known?
      Yes (use μ0):   $z_0 = \dfrac{\bar{x} - \mu_0 - \delta_0}{\sigma / \sqrt{n}}$
      No (use x̄2):    $z_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

    Population variance (σ²) is unknown (use S² & t-test): Is the population mean known?
      Yes (use μ0):   $t_0 = \dfrac{\bar{x}_1 - \mu_0 - \delta_0}{S / \sqrt{n}}$
      No: Are the variances similar?
        Yes (use pooled S²):       $t_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \delta_0}{S_p \sqrt{1/n_1 + 1/n_2}}$
        No (use non-pooled S²):    $t_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \delta_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$

    Paired comparison (use difference in means & t-test):
                      $t_0 = \dfrac{\bar{d} - \delta_0}{S_d / \sqrt{n}}$

  Variances (χ² or F): Is the population variance known?
      Yes (use σ0² & χ²-test):   $\chi_0^2 = \dfrac{(n - 1) S^2}{\sigma_0^2}$
      No (use S² & F-test):      $F_0 = \dfrac{S_1^2}{S_2^2}$

COMPARISON OF MEANS

In the entries below, ȳ denotes a sample mean, n a sample size, and ν the degrees of freedom.

Variance known (σ²), Z distribution

  Mean known (μ0), compared with ȳ
    Test statistic:       $Z_0 = \dfrac{\bar{y} - \mu_0}{\sigma / \sqrt{n}}$
    HA: $\mu \ne \mu_0$   Reject H0 if $|Z_0| > Z_{\alpha/2}$
    HA: $\mu > \mu_0$     Reject H0 if $Z_0 > Z_{\alpha}$
    HA: $\mu < \mu_0$     Reject H0 if $Z_0 < -Z_{\alpha}$
    Confidence interval:  $\bar{y} \pm Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

  Mean unknown (ȳ1 compared with ȳ2)
    Test statistic:       $Z_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
    HA: $\mu_1 \ne \mu_2$ Reject H0 if $|Z_0| > Z_{\alpha/2}$
    HA: $\mu_1 > \mu_2$   Reject H0 if $Z_0 > Z_{\alpha}$
    HA: $\mu_1 < \mu_2$   Reject H0 if $Z_0 < -Z_{\alpha}$
    Confidence interval:  $(\bar{y}_1 - \bar{y}_2) \pm Z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Variance unknown (S²), t distribution

  Mean known (μ0), compared with ȳ
    Test statistic:       $t_0 = \dfrac{\bar{y} - \mu_0}{S / \sqrt{n}}$
    HA: $\mu \ne \mu_0$   Reject H0 if $|t_0| > t_{\alpha/2,\, n-1}$
    HA: $\mu > \mu_0$     Reject H0 if $t_0 > t_{\alpha,\, n-1}$
    HA: $\mu < \mu_0$     Reject H0 if $t_0 < -t_{\alpha,\, n-1}$
    Confidence interval:  $\bar{y} \pm t_{\alpha/2,\, n-1} \dfrac{S}{\sqrt{n}}$

  Mean unknown, similar variances ($\sigma_1^2 = \sigma_2^2$)
    Test statistic:       $t_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{1/n_1 + 1/n_2}}$
    Where:                $S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$ and $\nu = n_1 + n_2 - 2$
    HA: $\mu_1 \ne \mu_2$ Reject H0 if $|t_0| > t_{\alpha/2,\, \nu}$
    HA: $\mu_1 > \mu_2$   Reject H0 if $t_0 > t_{\alpha,\, \nu}$
    HA: $\mu_1 < \mu_2$   Reject H0 if $t_0 < -t_{\alpha,\, \nu}$
    Confidence interval:  $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\, \nu}\, S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

  Mean unknown, dissimilar variances ($\sigma_1^2 \ne \sigma_2^2$)
    Test statistic:       $t_0 = \dfrac{\bar{y}_1 - \bar{y}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$
    Where:                $\nu = \dfrac{\left( S_1^2/n_1 + S_2^2/n_2 \right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$
    HA: $\mu_1 \ne \mu_2$ Reject H0 if $|t_0| > t_{\alpha/2,\, \nu}$
    HA: $\mu_1 > \mu_2$   Reject H0 if $t_0 > t_{\alpha,\, \nu}$
    HA: $\mu_1 < \mu_2$   Reject H0 if $t_0 < -t_{\alpha,\, \nu}$
    Confidence interval:  $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\, \nu} \sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$

  Paired comparison (difference in means)
    Differences:          $d_i = y_{1i} - y_{2i}$
    Sample mean of differences:  $\bar{d} = \dfrac{1}{n} \sum_{i=1}^{n} d_i$
    Test statistic:       $t_0 = \dfrac{\bar{d}}{S_d / \sqrt{n}}$
    Where:                $S_d = \sqrt{\dfrac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$
    HA: $\mu_d \ne 0$     Reject H0 if $|t_0| > t_{\alpha/2,\, n-1}$
    HA: $\mu_d > 0$       Reject H0 if $t_0 > t_{\alpha,\, n-1}$
    HA: $\mu_d < 0$       Reject H0 if $t_0 < -t_{\alpha,\, n-1}$
    Confidence interval:  $\bar{d} \pm t_{\alpha/2,\, n-1} \dfrac{S_d}{\sqrt{n}}$

COMPARISON OF VARIANCES

Variance known (σ0²), χ² distribution
    Test statistic:       $\chi_0^2 = \dfrac{(n - 1) S^2}{\sigma_0^2}$
    HA: $\sigma^2 \ne \sigma_0^2$   Reject H0 if $\chi_0^2 > \chi^2_{\alpha/2,\, n-1}$ or $\chi_0^2 < \chi^2_{1-\alpha/2,\, n-1}$
    HA: $\sigma^2 > \sigma_0^2$     Reject H0 if $\chi_0^2 > \chi^2_{\alpha,\, n-1}$
    HA: $\sigma^2 < \sigma_0^2$     Reject H0 if $\chi_0^2 < \chi^2_{1-\alpha,\, n-1}$
    Confidence interval:  $\dfrac{(n - 1) S^2}{\chi^2_{\alpha/2,\, n-1}} \le \sigma^2 \le \dfrac{(n - 1) S^2}{\chi^2_{1-\alpha/2,\, n-1}}$

Variance unknown (S1², S2²), F distribution
    Test statistic:       $F_0 = \dfrac{S_1^2}{S_2^2}$
    HA: $\sigma_1^2 \ne \sigma_2^2$ Reject H0 if $F_0 > F_{\alpha/2,\, n_1-1,\, n_2-1}$ or $F_0 < F_{1-\alpha/2,\, n_1-1,\, n_2-1}$
    HA: $\sigma_1^2 > \sigma_2^2$   Reject H0 if $F_0 > F_{\alpha,\, n_1-1,\, n_2-1}$
    Confidence interval:  $\dfrac{S_1^2}{S_2^2} F_{1-\alpha/2,\, n_2-1,\, n_1-1} \le \dfrac{\sigma_1^2}{\sigma_2^2} \le \dfrac{S_1^2}{S_2^2} F_{\alpha/2,\, n_2-1,\, n_1-1}$

NOTE: $F_{1-\alpha,\, \nu_1,\, \nu_2} = \dfrac{1}{F_{\alpha,\, \nu_2,\, \nu_1}}$

NORMALITY ASSUMPTION IS CRITICAL

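The process-yield example from the Area Under the Normal Curve section can also be checked without a table. The sketch below uses Python's standard-library `statistics.NormalDist`, so it works with unrounded z values; the variable names are hypothetical. The result should agree with the table-based 57% to within rounding, since the table lookup rounds z to two decimals.

```python
from statistics import NormalDist

# Process distribution from the example: mean 37.0, standard deviation 2.2.
process = NormalDist(mu=37.0, sigma=2.2)

# Specification 35.0 +/- 2.5.
lower_limit = 35.0 - 2.5
upper_limit = 35.0 + 2.5

# Steps 1 and 2: area from minus infinity up to each specification limit.
area_to_upper = process.cdf(upper_limit)
area_to_lower = process.cdf(lower_limit)

# Step 3: the difference is the proportion of output within specification.
process_yield = area_to_upper - area_to_lower
print(f"Estimated yield: {process_yield:.1%}")
```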