Estimation: Making Educated Guesses
• Point Estimation
• Interval Estimation
• Hypothesis Testing

Case Ia
• Does a particular sample of observations in this study come from a specified population or does it represent a different population?
  – “Known” population mean
  – “Known” population standard deviation

The 4th Grade Case
• Suppose you are the superintendent of schools and you discover that the average reading achievement of your 4th graders has fallen far below that of previous years. One explanation posed by the teachers is that the district is faced with an unusually dull group of 4th graders this year. The teachers suggest that the average verbal IQ of this year’s 4th graders is far different from the national average and that is why reading achievement is so low.
• You know that IQ-test scores don’t change much from year to year unless a school system is affected by changes in its attendance (e.g., a large migration of new families). Your school system has remained quite stable, but you decide to check out the teachers’ claim.
• You have a limited budget and, while you have extensive achievement data on the 4th graders, you have limited IQ data available. So you decide to test a sample of 400 4th graders rather than all 5000 of them.

The Logic of Hypothesis Testing
• Null hypotheses (H0)
• Alternative hypotheses (H1)
• Is this backwards and convoluted or what?

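
For the 4th grade case, the two hypotheses might be written in symbols as follows. This is only a sketch: the symbol μ (the mean verbal IQ of all 5000 4th graders in the district) is introduced here for illustration, and the national average is taken to be 100, the population mean used in the sampling-distribution slides.

```latex
\begin{align*}
H_0 &: \mu = 100    && \text{(this year's 4th graders match the national average)} \\
H_1 &: \mu \neq 100 && \text{(this year's 4th graders differ from the national average)}
\end{align*}
```

The sample of 400 students supplies the evidence; the null hypothesis is the claim that is kept unless that evidence makes it look unlikely.
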
Hypothesis Testing: General Model
• Identify the population and population parameter of interest
• Define the null hypothesis and alternative hypothesis
• Collect data on a random sample selected from the population of interest
• Compute a sample statistic that is an estimate of the parameter of interest
• Decide on a criterion for evaluating the sample evidence
• Make a decision to retain the null hypothesis or discard the null hypothesis in favor of the alternative hypothesis

Error and Risk
(Columns: the true state of reality. Rows: the decision.)
Decision | The Null Hypothesis is True | The Alternative Hypothesis is True
The Null Hypothesis is True | Correct Decision (Probability = 1 - α) | Type II Error (Risk β)
The Alternative Hypothesis is True | Type I Error (Risk α) | Correct Decision (Probability = 1 - β)

Type I Error and Level of Significance
• Type I error: the mistake of rejecting the null hypothesis (H0) when in fact it is true.
• Level of Significance:
  – Alpha (α) = .05
  – Significant at the .05 level
  – p < .05

Type II Error
• Type II error: the mistake of sticking with the null hypothesis (H0) when in fact the alternative hypothesis (H1) is true.
• Risk (β)

Decision Rules
• Decision Rule: the values of the sample statistic that keep you believing H0 and the values that lead you to reject H0

Hypothetical Frequency Distribution of 1000 Samples
[Figure: normal curve of sample means centered on the population mean of 100, with axis marks at 97.75, 98.5, 99.25, 100, 100.75, 101.5, and 102.25. Bands are labeled 68%, 95%, and 99%. Areas on each side of the mean: .3413, .1359, .0214, and .0013.]

“How Likely?”
• How likely is this sample mean to arise by sampling error?
• The “Sampling Distribution of Means” provides a model of what to expect if the null hypothesis is true
[Figure: a population of scores with IQ = 100 and the sampling distribution of means with IQ = 100, with “likely” and “unlikely” regions marked.]
• By convention, an unlikely sample mean is one that would occur, if the null hypothesis were true, no more than 5 times in 100 (.05) or 1 time in 100 (.01)

Selecting a Level of Significance: What is Unlikely?
• Goal is to determine how consistent or inconsistent the sample data are with the null hypothesis
• Usually select some small (conservative) level of significance (.05, .01, .001)
• Level chosen depends on the seriousness of the consequences of one’s decision

Hypothetical Frequency Distribution of 1000 Samples
[Figure: the same normal curve of sample means (axis marks from 97.75 to 102.25, population mean = 100, bands labeled 68%, 95%, and 99%), with both tails marked “Unlikely at .05” and, farther out, “Unlikely at .01”.]

One- and Two-Tail Test?
• One- and two-tail tests tell you which tail(s) in the sampling distribution of means should be used to determine “How likely?”
• Two-Tail Test: Willing to entertain a sample mean in either tail. H1: Population Mean ≠ 100
• One-Tail Test: Willing to specify the direction of the sample mean (above or below the population mean under the null hypothesis). H1: Population Mean > 100
[Figure: Two Tail: .025 in each tail. One Tail: .05 in one tail.]

Critical Values for Case Ia: Z-Test
Level of Significance | Directional “One-Tailed” | Non-Directional “Two-Tailed”
Alpha (or p) = .05 | 1.65 | 1.96
Alpha (or p) = .01 | 2.33 | 2.58

Sampling Distribution of Means: Standard Errors, Critical Values, and Ps
[Figure: Z distribution (normal curve), two-tailed test. Axis: -2 se, -1 se, μ, +1 se, +2 se. Critical values: ±1.96 se (p = .05, outside of 1.96 at either end) and ±2.58 se (p = .01, outside of 2.58 at either end).]
[Figure: Z distribution (normal curve), one-tailed test, upper tail. Axis: -2 se, -1 se, μ, +1 se, +2 se. Critical values: +1.65 se (p = .05) and +2.33 se (p = .01).]
[Figure: Z distribution (normal curve), one-tailed test, lower tail. Axis: -2 se, -1 se, μ, +1 se, +2 se. Critical values: -1.65 se (p = .05) and -2.33 se (p = .01).]

The Decision Regarding H0: The Lingo
• Reject H0: Take the position that the null hypothesis is probably false
  – “H0 (the null hypothesis) was rejected”
  – “A statistically significant finding was obtained”
  – “A reliable difference was observed”
  – “p is less than X”, where X is a small decimal value (p < .05, p < .01)
• Fail-to-reject H0: Take the position that there is not enough evidence to reject the null hypothesis
  – “H0 was tenable”
  – “H0 was accepted”
  – “No reliable differences were observed”
  – “No significant differences were found” (ns)
  – “p is greater than X”, where X is a small decimal value (p > .05, p > .01)

Significance Testing vs Hypothesis Testing
• Hypothesis Testing:
  – Alpha level is preset
  – Decision is “reject” or “do not reject”
  – Don’t discuss impressive p-levels
• Significance Testing:
  – No alpha levels preset
  – Data speak through p-levels
  – Strength of significance discussed
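
The general model and the Case Ia critical values can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. The function name `z_test` and its parameters are hypothetical. The population standard deviation of 15 is an assumption (the conventional IQ scale, which matches the 0.75 spacing on the figure axes for a sample of 400), and the sample mean of 98.2 is made up for the example, since the slides report no observed mean.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05, tail="two"):
    """Case Ia z-test: population mean and standard deviation are "known".

    tail: "two" (H1: mean != pop_mean), "upper" (H1: mean > pop_mean),
          or "lower" (H1: mean < pop_mean).
    """
    std_normal = NormalDist()           # standard normal: mean 0, sd 1
    se = pop_sd / sqrt(n)               # standard error of the mean
    z = (sample_mean - pop_mean) / se   # sample mean in standard-error units

    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)   # alpha split across both tails
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        reject = abs(z) > critical
    elif tail == "upper":
        critical = std_normal.inv_cdf(1 - alpha)       # all of alpha in the upper tail
        p_value = 1 - std_normal.cdf(z)
        reject = z > critical
    elif tail == "lower":
        critical = -std_normal.inv_cdf(1 - alpha)      # all of alpha in the lower tail
        p_value = std_normal.cdf(z)
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")

    return {
        "standard_error": se,
        "z": z,
        "critical_value": critical,
        "p_value": p_value,
        "reject_h0": reject,
    }


# Assumed numbers: sample mean 98.2 (hypothetical), population sd 15 (assumed),
# population mean 100 and n = 400 as in the slides.
result_05 = z_test(98.2, 100, 15, 400, alpha=0.05, tail="two")
result_01 = z_test(98.2, 100, 15, 400, alpha=0.01, tail="two")

for label, result in (("alpha = .05", result_05), ("alpha = .01", result_01)):
    decision = "reject H0" if result["reject_h0"] else "fail to reject H0"
    print(label, "->", decision)
```

With these assumed numbers the standard error is 0.75 and z = -2.4, which lies beyond ±1.96 but inside ±2.58, so H0 is rejected at the .05 level and retained at the .01 level. The same data lead to different decisions depending on the level of significance chosen beforehand, which is why the decision rule is fixed before looking at the sample.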