The Statistical Imagination
Chapter 9. Hypothesis Testing I: The Six Steps of Statistical Inference
© 2008 McGraw-Hill Higher Education

A Hypothesis
• A hypothesis is a prediction about the relationship between two variables that asserts that differences among the measurements of an independent variable will correspond to differences among the measurements of a dependent variable

Using a Hypothesis to Test a Theory
• The hypothesis is stated before we gather data
• The theoretical purpose of a hypothesis test is to corroborate theory by testing preconceived ideas against facts
• Theory motivates or pushes us to expect certain empirical outcomes

Statistical Inference
• Statistical inference is drawing conclusions about a population on the basis of sample statistics
• The logic of hypothesis testing involves deciding whether to accept or reject a statement on the basis of observations of data
• “Accounting for sampling error” with sampling distributions is key to the process

The Test Effect of the Hypothesis Test
• The difference between what is observed in a sample and what is hypothesized is called the test effect
• We ask: What is the probability that the test effect is simply the result of sampling error?
• The sampling distribution provides a measuring stick to answer the question

The Statistical Purpose of a Hypothesis Test
• The statistical purpose of a hypothesis test is to determine whether statistical test effects computed from a sample indicate (1) real effects in the population or (2) sampling error

Making Empirical Predictions
• To test a hypothesis we must predict two things:
1. A mathematical prediction of a parameter outcome
2. A sampling distribution, a prediction of all possible sampling outcomes factoring in sampling error
• With these predictions, we determine whether our single sample outcome differs significantly from the predicted outcome (i.e., whether the effect is real)

Basic Logical Procedure of Hypothesis Testing
a. A question is raised
b. Predictions are made on the basis of probability theory
c. An event is observed and its effects are measured
d. The probability of the test effect occurring is computed
e. A conclusion is drawn

Two Sets of Tasks When Testing a Hypothesis
1. Test preparation: deciding what test to use and organizing the data
2. Testing the hypothesis by following the six steps of statistical inference

Test Preparation
• State the research question: a goal that can be stated in terms of a hypothesis
• Draw a conceptual diagram depicting givens (population under study, sample size, variables and their levels of measurement, provided and calculated parameters and statistics)
• Select the statistical test

Step 1 of The Six Steps of Statistical Inference
• Step 1: State the null hypothesis (H0). State the alternative hypothesis (HA) and stipulate the direction of the test

The Null Hypothesis
• The null hypothesis, H0, is a hypothesis stated in such a way that we will know what statistical outcomes will occur in repeated random sampling if this hypothesis is true
• It is a “statistical” hypothesis: it directs us to the sampling distribution, which provides sampling predictions

What Does “Null” Mean?
• Null means none
• H0 predicts sampling outcomes assuming no effect or no difference
• Often we can “nullify” (negate the wording of) the research question to determine the H0

The Alternative Hypothesis (HA)
• HA is the statement we accept if H0 is rejected
• HA is often a direct statement of the research question

The Direction of a Hypothesis Test
• Test direction refers to whether we are able to predict the direction in which our observed sample statistic will fall
• Direction must be specified before we observe data

The Direction of a Hypothesis Test (cont.)
• Three possible directional statements:
1. Nondirectional (two-tailed test)
2. Positive direction (one-tailed test)
3. Negative direction (one-tailed test)

The Direction of a Hypothesis Test (cont.)
• In the HA of Step 1, we specify whether we expect the outcome in our observed sample to fall above (positive, one-tailed) or below (negative, one-tailed) the hypothesized parameter of the H0
• For a nondirectional, two-tailed test, we do not predict a direction and simply assert that the outcome is expected to differ from the hypothesized parameter

When to State a Positive, One-Tailed Test
• When the content of the research question includes terms such as greater than, more, increase, faster, heavier, and gain

When to State a Negative, One-Tailed Test
• When the content of the research question includes terms such as less than, fewer, decrease, slower, lighter, and loss

When to State a Nondirectional, Two-Tailed Test
• When the content of the research question includes no statements about direction, or simply asserts inequality

Step 2 of The Six Steps of Statistical Inference
• Step 2: Describe the sampling distribution and draw its curve
• The sampling distribution is a description of all possible sampling outcomes and a stipulation of the probability of each outcome assuming that the H0 is true
• It is built around the H0

Step 3 of The Six Steps of Statistical Inference
• Step 3: State the chosen level of significance, alpha (α), and indicate again whether the test is one-tailed or two-tailed. Specify the critical test value
• The level of significance, alpha, is the amount of sampling error we are willing to tolerate in coming to a conclusion
• Critical test values are obtained from the statistical tables in Appendix B

Step 4 of The Six Steps of Statistical Inference
• Step 4: Observe the actual sample; compute the test effects, the test statistic, and the p-value

Step 4 (cont.): The Test Effect
• The test effect is the difference between the value of the sample statistic and the parameter value predicted by the null hypothesis (H0 in Step 1)
• It is a deviation score on the sampling distribution curve

Step 4 (cont.): The Test Statistic
• The test statistic is a formula for measuring the likelihood of the observed effect
• It transforms the effect into standard error units so that the result may be compared to critical scores of the statistical tables in Appendix B

Step 4 (cont.): The p-Value
• The p-value is a measure of the unusualness of a sample outcome when the H0 is true. E.g., is it unusual to roll four 7’s in a row with honest dice?
• Calculation: p-value = probability (p) of sampling outcomes as unusual as or more unusual than the outcome observed, under the assumption that the H0 is true
• It is an area in the tail(s) of the curve in Step 2

Step 5 of The Six Steps of Statistical Inference
• Step 5: Make the rejection decision by comparing the p-value to α
• If p < α, reject the H0 and accept the HA at the 1 - α level of confidence
• If p > α, “fail to reject” the H0

Step 6 of The Six Steps: Interpretation
• Step 6: Interpret and apply the results, and provide best estimates in everyday terms
• Fit the interpretation to either a professional or public audience: use as little statistical jargon as possible
• Frame the interpretation around the H0 or the HA, whichever survived the hypothesis test

Probability Theory in Hypothesis Testing
• Computing probabilities is the essential mathematical operation in hypothesis testing
• Hypothesis testing is based on comparing two probabilities:
1. What actually occurs in our single observed sample
2. What we expect to occur in repeated sampling

A Focus on p-Values: When the p-Value is Large
• When p > α, we fail to reject the H0
• A large p-value tells us that our observed sample outcome is not much different or “far off” from the outcome predicted by the H0
• A large p-value occurs when the test effect is small, and this suggests that the effect could easily be the result of expected sampling error

A Focus on p-Values: When the p-Value is Small
• When p < α, we reject the H0
• A small p-value tells us that, assuming the H0 is true, our sample outcome is unusual or “far off” from the outcome predicted by the H0
• A small p-value occurs when the test effect is large, leading us to conclude that the test effect did not result from sampling error

Inverse Relationship Between Effect Size and p-Value
• A small test effect = a large p-value = “fail to reject” the H0
• A large test effect = a small p-value = “reject” the H0 and accept the HA

The Level of Significance (α) in Hypothesis Testing
• The level of significance (α) is the critical probability point at which we are no longer willing to say that our sampling outcome resulted from random sampling error
• α is stated in Step 3 and compared in Step 5 to the p-value. This comparison is called the rejection decision

Critical Test Scores
• The critical test score (Zα) is the statistical test score that is large enough to indicate a significant difference between the observed sample statistic and the hypothesized parameter
• The critical region is the area in the tail(s) of the probability curve that is beyond the critical test score of the stated level of significance

Critical Test Scores (cont.)
• Zobserved is a test statistic
• Zα is a critical score
• If |Zobserved| > |Zα|, then p < α; reject H0
• If |Zobserved| < |Zα|, then p > α; fail to reject H0

Critical Z-scores on the Normal Curve
• Critical Z-scores are of great importance in statistical procedures and are used very frequently
• Some widely used critical Z-scores are 1.64, 1.96, 2.33, 2.58, 3.08, and 3.30
• See if you can match these scores to the level of significance and direction of a hypothesis test

Choosing the Level of Significance
• Setting the level of significance (α) allows us to control the chances of making a wrong decision or “error”
• Short of double-checking against data for the entire population, we will never know for sure whether we made the correct rejection decision or made an error

Possible Results of a Rejection Decision
• Correct decision: Failing to reject a true H0
• Type I error: Rejecting a true H0
• Correct decision: Rejecting a false H0
• Type II error: Failing to reject a false H0

Managing and Controlling Rejection Decision Errors
• When we reject H0, we made either a correct decision or a Type I error; we could not have made a Type II error
• When we fail to reject H0, we made either a correct decision or a Type II error; we could not have made a Type I error

Controlling Type I and Type II Errors
• Type I error is easily controlled by setting the level of significance (α), because it turns out that α = p [of making a Type I error]
• β = p [of making a Type II error]; controlling beta (β) is difficult
• β is indirectly controlled when we set α because the two are inversely related; β is also minimized by using a large sample size

Four Conventional Levels of Alpha (α)
• α = .10: High likelihood of rejecting the H0. Used in exploratory research, where little is known about a topic
• α = .05: Moderate likelihood of rejecting the H0. Used in survey research
• α = .01 and α = .001: Low likelihood of rejecting the H0. Used in biological, laboratory, and medical research, especially when a Type I error is life-threatening

The Level of Confidence (LOC) for a Hypothesis Test
• The LOC is the confidence we have that we did not make a Type I error
• LOC = 1 - level of significance = 1 - α
• E.g., the .05 level of significance corresponds to a 95% LOC
• The only time we have 100% confidence in a conclusion is when every subject in a population is observed

Selecting Which Statistical Test to Use
• Ask: How many variables are we observing for this test?
• What are the levels of measurement of the variables?
• Are we dealing with one representative sample from a single population or more?
• What is the sample size?
• Are there peculiar circumstances to consider?

When to Use a Large Single-Sample Means Test
1. One variable
2. Interval/ratio level of measurement
3. One representative sample from one population
4. n > 121 cases
The sampling distribution will be the normal curve (see Chapter 7)
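
Steps 3 through 5 can be sketched in code for a large single-sample means test. This is a minimal illustration rather than the textbook's own procedure: it assumes the test statistic is Z = (sample mean - hypothesized mean) / standard error, with the standard error estimated as s / sqrt(n), and it assumes a normal sampling distribution. The function name `single_sample_z_test`, its parameters, and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def single_sample_z_test(sample_mean, hypothesized_mean, sample_sd, n,
                         alpha=0.05, direction="two-tailed"):
    """Hypothetical helper covering Steps 3-5 of a large single-sample means test.

    Assumption: the standard error is estimated as sample_sd / sqrt(n) and
    the sampling distribution is the normal curve.
    """
    if direction not in ("two-tailed", "positive", "negative"):
        raise ValueError("direction must be two-tailed, positive, or negative")

    z_curve = NormalDist()  # standard normal curve (mean 0, sd 1)

    # Step 4: test effect = sample statistic minus the parameter under H0.
    test_effect = sample_mean - hypothesized_mean
    standard_error = sample_sd / sqrt(n)
    # Step 4: test statistic = the effect expressed in standard error units.
    z_observed = test_effect / standard_error

    # Step 3 (critical score) and Step 4 (p-value as a tail area).
    if direction == "two-tailed":
        z_critical = z_curve.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - z_curve.cdf(abs(z_observed)))
    elif direction == "positive":
        z_critical = z_curve.inv_cdf(1 - alpha)
        p_value = 1 - z_curve.cdf(z_observed)
    else:  # negative, one-tailed
        z_critical = -z_curve.inv_cdf(1 - alpha)
        p_value = z_curve.cdf(z_observed)

    # Step 5: rejection decision, comparing the p-value to alpha.
    return {
        "test_effect": test_effect,
        "z_observed": z_observed,
        "z_critical": z_critical,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
    }


# Made-up example: H0 says the population mean is 50; the sample shows 52.
result = single_sample_z_test(52, 50, sample_sd=10, n=144, alpha=0.05)
print(result)
```

Returning the test effect, the test statistic, the critical score, and the p-value together keeps the link between Step 4 and Step 5 visible: the same decision follows from comparing p to α or from comparing Zobserved to Zα.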
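
The matching exercise for critical Z-scores can be checked numerically. The sketch below assumes the standard normal curve and uses a hypothetical helper, `critical_z`, to look up the score that cuts off a tail area of α (one-tailed) or α/2 in each tail (two-tailed). Printed tables are rounded, so the values listed on the slide may differ from the computed ones in the last digit.

```python
from statistics import NormalDist


def critical_z(alpha, direction="two-tailed"):
    """Hypothetical helper: positive critical Z-score for a level of significance.

    For a negative one-tailed test, the critical score is the same value
    with a minus sign.
    """
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if direction == "two-tailed" else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01, 0.001):
    one_tailed = critical_z(alpha, "one-tailed")
    two_tailed = critical_z(alpha, "two-tailed")
    print(f"alpha = {alpha}: one-tailed {one_tailed:.3f}, two-tailed {two_tailed:.3f}")
```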
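
The claim that α = p [of making a Type I error] can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: the population (mean 100, standard deviation 15), the sample size, the number of trials, and the function name `type_i_error_rate` are all made up, and the population standard deviation is treated as known. Every sample is drawn from a population in which H0 is true, so each rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, n=25, trials=2000, seed=1):
    """Hypothetical simulation: share of true-H0 samples that get rejected."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    mu, sigma = 100.0, 15.0             # made-up population in which H0 is true
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed test
    standard_error = sigma / sqrt(n)    # sigma treated as known in this sketch

    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z_observed = (fmean(sample) - mu) / standard_error
        if abs(z_observed) > z_critical:
            rejections += 1             # H0 is true, so this is a Type I error
    return rejections / trials


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha}: simulated Type I error rate = {type_i_error_rate(alpha)}")
```

With enough trials, the simulated rate should land close to the chosen α, which is the sense in which setting α controls Type I error.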