Significance Tests
• Hypothesis - A statement regarding a characteristic of a variable or set of variables. It refers to the population(s), not the sample.
– A majority of registered voters favor health care reform
– Average salary progressions differ between male executives whose spouses work and those whose spouses “stay at home”
• Significance Test - A means of using sample statistics (and their sampling distributions) to compare their observed values with the hypothesized value of the corresponding parameter(s)

Elements of Significance Test (I)
• Assumptions
– Data Type: Quantitative vs. Qualitative
– Population Distribution: Some methods assume a normal distribution
– Sampling Plan: Simple Random Sampling
– Sample Size: Some methods have sample size requirements for validity
• Hypotheses
– Null Hypothesis ($H_0$): A statement that parameter(s) take on specific value(s) (Often: “No effect”)
– Alternative Hypothesis ($H_a$): A statement contradicting the parameter value(s) in the null hypothesis

Elements of Significance Test (II)
• Test Statistic: A quantity computed from the sample data to test the null hypothesis. It is typically based on a sample statistic, the parameter value under $H_0$, and the standard error.
• P-value (P): The probability that we would obtain a test statistic at least as contradictory to the null hypothesis as our computed test statistic, if the null hypothesis is true.
– Small P-values mean the sample data are not consistent with the parameter value(s) under $H_0$

Elements of Significance Test (III)
• Conclusion (Optional)
– If the P-value is sufficiently small, we reject $H_0$ in favor of $H_a$. The most widely accepted threshold is 0.05; a test with a P-value at or below it is said to be significant at the .05 level.
– If the P-value is not sufficiently small, we fail to reject (but do not necessarily accept) the null hypothesis.
– The process is analogous to the American judicial system
   • $H_0$: Defendant is innocent
   • $H_a$: Defendant is guilty

Significance Test for Mean (Large-Sample)
• Assumptions: Random sample with $n \geq 30$, quantitative variable
• Null Hypothesis: $H_0: \mu = \mu_0$ (typically no effect or change from standard)
• Alternative Hypothesis: $H_a: \mu \neq \mu_0$ (2-sided alternative includes both > and <)
• Test Statistic: $z_{obs} = \frac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \frac{\bar{Y} - \mu_0}{s/\sqrt{n}}$
• P-value: $P = 2P(Z \geq |z_{obs}|)$

Example - Mercury Levels
• Population: Patients visiting a private internal medicine clinic in S.F. (High-end fish consumers)
• Variable: Mercury levels (µg/L)
• Sample: 66 Females
• Recommended maximum level (RML): 5.0 µg/L
• Null hypothesis: $H_0: \mu = 5.0$ (Mean level = RML)
• Alternative hypothesis: $H_a: \mu \neq 5.0$ (Mean level ≠ RML)
• Sample Data: $\bar{Y} = 15$, $s = 15$, $n = 66$, $\hat{\sigma}_{\bar{Y}} = \frac{15}{\sqrt{66}} = \frac{15}{8.12} = 1.85$

Example - Mercury Levels
• Test Statistic: $z_{obs} = \frac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \frac{15 - 5}{1.85} = \frac{10}{1.85} = 5.41$
• P-Value: $P = 2P(Z \geq 5.41) < 2P(Z \geq 5.00) = 2(.000000287) = .000000574 \approx 0$
• Conclusion: Very strong evidence that the population mean mercury level is above the RML
Source: Hightower and Moore (2003), “Mercury Levels in High-End Consumers of Fish”, Environ Health Perspect, 111(4):A233

Miscellaneous Comments
• Effect of sample size on P-values: For a given observed sample mean and standard deviation, the larger the sample size, the larger the test statistic and the smaller the P-value (as long as the sample mean does not equal $\mu_0$)
• Equivalence between 2-tailed tests and confidence intervals: If a $(1-\alpha)100\%$ CI for $\mu$ contains $\mu_0$, the P-value will be larger than $\alpha$
• 1-sided tests: Sometimes researchers have a specific direction in mind for the alternative hypothesis prior to collecting data.
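The large-sample test for a mean can be sketched in a few lines of Python, working from summary statistics as in the mercury example. This is a minimal illustration, not part of the original slides: the function names `normal_upper_tail`, `z_test_mean`, and `ci_mean` are hypothetical, and the normal tail area is computed with the standard library's `math.erfc` rather than read from a table.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def z_test_mean(ybar, s, n, mu0, alternative="two-sided"):
    """Large-sample z test for a mean from summary statistics.

    Returns (estimated standard error, z_obs, P-value).
    """
    se = s / math.sqrt(n)            # estimated standard error of the sample mean
    z_obs = (ybar - mu0) / se
    if alternative == "two-sided":   # Ha: mu != mu0
        p = 2 * normal_upper_tail(abs(z_obs))
    elif alternative == "greater":   # Ha: mu > mu0
        p = normal_upper_tail(z_obs)
    elif alternative == "less":      # Ha: mu < mu0
        p = normal_upper_tail(-z_obs)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return se, z_obs, p

def ci_mean(ybar, s, n, z_crit=1.96):
    """Large-sample CI for the mean; z_crit = 1.96 gives a 95% interval."""
    half_width = z_crit * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width

# Mercury example: Ybar = 15, s = 15, n = 66, H0: mu = 5.0
se, z_obs, p_value = z_test_mean(15, 15, 66, 5.0)
print(f"SE = {se:.2f}, z_obs = {z_obs:.2f}, P-value = {p_value:.2e}")

# 2-sided test vs. CI: the 95% interval excludes 5.0, matching P < .05
low, high = ci_mean(15, 15, 66)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

Because the code does not round the standard error to 1.85 before dividing, its test statistic can differ from the hand calculation in the second decimal place.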
Example - Crime Rates (1960-80)
• Sample: n=74 Chicago Neighborhoods
• Goal: Show the average delinquency rate in the population of all such neighborhoods has increased from 1960-1980
• Variable: Y = DR1980 - DR1960
• $H_0: \mu = 0$ (No change from 1960-1980)
• $H_a: \mu > 0$ (Higher in 1980, see Y above)
• Sample Data: $\bar{Y} = 41.26$, $s = 30.73$, $n = 74$, $\hat{\sigma}_{\bar{Y}} = \frac{30.73}{\sqrt{74}} = 3.57$

Example - Crime Rates (1960-80)
• Test Statistic: $z_{obs} = \frac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \frac{41.26 - 0}{3.57} = 11.6$
• P-value: (Only interested in larger positive values since 1-sided) $P = P(Z \geq z_{obs}) = P(Z \geq 11.6) \approx 0$
• Conclusion: Strong evidence that the true mean delinquency rate among all neighborhoods that this sample was taken from has increased from 1960 to 1980.
Source: Bursik and Grasmick (1993), “Economic Deprivation and Neighborhood Crime Rates, 1960-1980”, Law & Society Review, Vol. 27, pp 263-284

Significance Test for a Proportion (Large-Sample)
• Assumptions:
– Qualitative Variable
– Random sample
– Large sample: $n \geq 10/\min(\pi_0, 1-\pi_0)$
• Hypotheses:
– Null hypothesis: $H_0: \pi = \pi_0$
– Alternative hypothesis: $H_a: \pi \neq \pi_0$ (2-sided)
– $H_a^+: \pi > \pi_0$   $H_a^-: \pi < \pi_0$ (1-sided, specified prior to data)

Significance Test for a Proportion (Large-Sample)
• Test statistic: $z_{obs} = \frac{\hat{\pi} - \pi_0}{\sigma_{\hat{\pi}}} = \frac{\hat{\pi} - \pi_0}{\sqrt{\pi_0(1-\pi_0)/n}}$
• P-value:
– $H_a: \pi \neq \pi_0$   $P = 2P(Z \geq |z_{obs}|)$
– $H_a^+: \pi > \pi_0$   $P = P(Z \geq z_{obs})$
– $H_a^-: \pi < \pi_0$   $P = P(Z \leq z_{obs})$
• Conclusion: Similar to test for a mean

Decisions in Tests
• $\alpha$-level (aka significance level): Pre-specified “hurdle” for which one rejects $H_0$ if the P-value falls below it.
(Typically .05 or .01)

P-Value             ≤ .05     > .05
H0 Conclusion       Reject    Do not Reject
Ha Conclusion       Accept    Do not Accept

• Rejection Region: Values of the test statistic for which we reject the null hypothesis
• For 2-sided tests with $\alpha = .05$, we reject $H_0$ if $|z_{obs}| \geq 1.96$

Error Types
• Type I Error: Reject $H_0$ when it is true
• Type II Error: Do not reject $H_0$ when it is false

                     Test Result
True State           Reject H0        Don’t Reject H0
H0 True              Type I Error     Correct
H0 False             Correct          Type II Error

Error Types
• Probability of a Type I Error: $\alpha$-level (significance level)
• Probability of a Type II Error: $\beta$, which depends on the true value of the parameter (within the range of values under $H_a$)
• For a given sample size and variability in the data, the Type I and Type II error rates are inversely related: lowering one raises the other
• Conclusions wrt $H_0$ are the same whether a hypothesis test or a CI is conducted (for fixed $\alpha$)

Miscellaneous Issues
• Statistical vs Practical Significance: With very large sample sizes, we can often obtain very small P-values even when the sample quantity is very close to the parameter value under $H_0$. Always consider the estimate as well as the P-value.
• While hypothesis tests and confidence intervals give similar conclusions wrt $H_0$, the CI gives a range of plausible parameter values, which can be more informative than the test result alone

Small-Sample Inference for $\mu$
• t Distribution:
– Population distribution for a variable is normal
– Mean $\mu$, Standard Deviation $\sigma$
– The t statistic has a sampling distribution that is called the t distribution with (n-1) degrees of freedom: $t = \frac{\bar{Y} - \mu}{\hat{\sigma}_{\bar{Y}}} = \frac{\bar{Y} - \mu}{s/\sqrt{n}}$
• Symmetric, bell-shaped around 0 (like the standard normal, z distribution)
• Indexed by “degrees of freedom”; as they increase, the distribution approaches z
• Has heavier tails (more probability beyond the same values) than z
• Table B gives $t_A$ where $P(t > t_A) = A$ for degrees of freedom 1-29 and various A

Small-Sample 95% CI for $\mu$
• Random sample from a normal population distribution: $\bar{Y} \pm t_{.025,n-1}\,\hat{\sigma}_{\bar{Y}} = \bar{Y} \pm t_{.025,n-1}\,\frac{s}{\sqrt{n}}$
• $t_{.025,n-1}$ is the critical value leaving an upper tail area of .025 in the t distribution with n-1 degrees of freedom
• For $n \geq 30$, use $z_{.025} = 1.96$ as an approximation for $t_{.025,n-1}$

t test for a mean
• Assumptions: Random sample for a quantitative variable with a normal probability distribution
• Hypotheses:
– $H_0: \mu = \mu_0$   $H_a: \mu \neq \mu_0$ (2-sided)
• Test Statistic: $t_{obs} = \frac{\bar{Y} - \mu_0}{\hat{\sigma}_{\bar{Y}}} = \frac{\bar{Y} - \mu_0}{s/\sqrt{n}}$
• P-Value: $2P(t > |t_{obs}|)$
• Conclusions are drawn as before; 1-sided tests are handled as in the large-sample case
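The small-sample t test and 95% CI can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the slides' own method: the data values are made up, the function names are hypothetical, and the t tail area and critical value are obtained by numerically integrating the t density (Simpson's rule) and by bisection instead of a lookup in Table B.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(t > x): Simpson's rule on [0, |x|] plus symmetry about 0 (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3      # 0.5 minus P(0 < t < |x|)
    return upper if x >= 0 else 1 - upper

def t_critical(tail_area, df):
    """Value t_A with P(t > t_A) = A, for 0 < A < 0.5, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid                 # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

def t_test_mean(data, mu0):
    """2-sided t test of H0: mu = mu0, with the matching 95% CI."""
    n = len(data)
    ybar = statistics.mean(data)
    s = statistics.stdev(data)       # sample standard deviation (n-1 divisor)
    se = s / math.sqrt(n)
    t_obs = (ybar - mu0) / se
    p_value = 2 * t_upper_tail(abs(t_obs), n - 1)
    t_crit = t_critical(0.025, n - 1)
    ci = (ybar - t_crit * se, ybar + t_crit * se)
    return t_obs, p_value, ci

# Hypothetical sample of n = 10 measurements (illustrative values only), H0: mu = 5.0
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 4.7, 5.9, 6.2, 5.5]
t_obs, p_value, ci = t_test_mean(sample, 5.0)
print(f"t_obs = {t_obs:.3f}, P-value = {p_value:.4f}")
print(f"95% CI for mu: ({ci[0]:.3f}, {ci[1]:.3f})")
```

With a statistics library available, the same tail area and critical value would normally come from its t distribution functions; the hand-rolled integration here only keeps the sketch self-contained.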