UWHC Scholarly Forum, April 17, 2013
Ismor Fischer, Ph.D.
UW Dept of Statistics, UW Dept of Biostatistics and Medical Informatics
ifischer@wisc.edu

All slides posted at http://www.stat.wisc.edu/~ifischer/UWHC
• Click on image for full .pdf article
• Links in article to access datasets

“Statistical Inference”

POPULATION
Study Question: Has “Mean (i.e., average) Age at First Birth” of women in the U.S. changed since 2010 (25.4 yrs old)?
Present Day: Assume “Age at First Birth” follows a normal distribution (i.e., “bell curve”) in the population.

~ The Normal Distribution ~
[Figure: bell curve f(x), centered at the “population mean” μ, with spread given by the “population standard deviation” σ]
• symmetric about its mean
• unimodal (i.e., one peak), with left and right “tails”
• models many (but not all) naturally-occurring systems
• useful mathematical properties…

Examples:
• Body Temp (°F): mean 98.6, low variability (small σ)
• IQ score: mean 100, high variability (large σ)

[Figure: bell curve with the central 95% of the area within ≈ 2σ of μ, and 2.5% in each tail]
Approximately 95% of the population values are contained between μ − 2σ and μ + 2σ.
95% is called the confidence level. 5% is called the significance level.
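The “≈ 2σ” rule quoted on the Normal Distribution slides can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability that a value falls within 2 standard deviations of the mean.
within_2_sigma = std_normal.cdf(2) - std_normal.cdf(-2)

# Multiplier that captures exactly 95% (2.5% in each tail).
exact_multiplier = std_normal.inv_cdf(0.975)

print(f"P(within 2 sigma) = {within_2_sigma:.4f}")
print(f"exact 95% multiplier = {exact_multiplier:.3f}")
```

The first number is slightly above 95% and the second is slightly below 2, which is why the slides say “approximately” and write “≈ 2σ”.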
“Statistical Inference” via… “Hypothesis Testing”

The population mean μ cannot be found with 100% certainty, but can be estimated with high confidence (e.g., 95%).

“Null Hypothesis” H0: pop mean age μ = 25.4 (i.e., no change since 2010)

T-test
SAMPLE: n = 400 ages x1, x2, x3, x4, x5, … etc…, x400
FORMULA: sample mean age $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = 25.6$
Do the data tend to support or refute the null hypothesis?
Is the difference STATISTICALLY SIGNIFICANT, at the 5% level?

~ The Normal Distribution ~
CENTRAL LIMIT THEOREM
[Figure: bell curve of the sample means x̄, centered at μ, with the central 95% within ≈ 2σ/√n of μ and 2.5% in each tail]
• Approximately 95% of the population values are contained between μ − 2σ and μ + 2σ.
• Approximately 95% of the sample mean values are contained between μ − 2σ/√n and μ + 2σ/√n.
• Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.

95% margin of error = 2σ/√n

PROBLEM! σ is unknown the vast majority of the time!
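The claim that approximately 95% of the intervals x̄ ± 2σ/√n contain μ can be illustrated by simulation. This is only a sketch under stated assumptions: the function name `coverage`, the values μ = 25.4 and σ = 1.6, the sample size of 100, the number of trials, and the random seed are all chosen here for illustration, and the population is taken to be normal, as the slides assume.

```python
import math
import random

def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated intervals x_bar +/- 2*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    margin = 2 * sigma / math.sqrt(n)    # 95% margin of error with known sigma
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Hypothetical population values, used only for this illustration.
fraction = coverage(mu=25.4, sigma=1.6, n=100, trials=2000)
print(f"fraction of intervals containing mu: {fraction:.3f}")
```

The printed fraction should land close to 0.95. Note that the simulation needs the true σ to build each interval, which is exactly the practical problem raised on the slide.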
“Null Hypothesis” H0: pop mean age μ = 25.4 (i.e., no change since 2010)

SAMPLE: n = 400 ages x1, x2, x3, x4, x5, … etc…, x400

FORMULA
• sample mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = 25.6$
• sample variance (= modified average of the squared deviations from the mean): $s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
• sample standard deviation: $s = \sqrt{s^2} = 1.6$
• 95% margin of error: $2\sigma/\sqrt{n} \approx 2s/\sqrt{n} = 0.16$

Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.

[Number line: x̄ − 2s/√n = 25.44, x̄ = 25.6, x̄ + 2s/√n = 25.76]
BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.44 and 25.76 years, with 95% “confidence” (…akin to “probability”).

Two main ways to conduct a formal hypothesis test:

95% CONFIDENCE INTERVAL FOR μ
[Number line: μ = 25.4, 25.44, x̄ = 25.6, 25.76]
BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.44 and 25.76 years, with 95% “confidence” (…akin to “probability”).

IF H0 is true, then we would expect a random sample mean x̄ that is at least 0.2 years away from μ = 25.4 (as ours was), to occur with probability 1.24%.
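The interval endpoints and the 1.24% probability quoted above can be reproduced from the sample summary (n = 400, x̄ = 25.6, s = 1.6). The sketch below assumes the large-sample normal approximation with multiplier 2, which is what the slides' figures imply; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n = 400          # sample size
x_bar = 25.6     # sample mean age
s = 1.6          # sample standard deviation
mu_0 = 25.4      # null value of the population mean

# Standard error of the mean, and the 95% margin of error with multiplier 2.
standard_error = s / math.sqrt(n)
margin = 2 * standard_error
ci_low, ci_high = x_bar - margin, x_bar + margin

# Two-sided tail probability of a sample mean at least this far from mu_0,
# using the normal approximation (an assumption of this sketch).
z = (x_bar - mu_0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"margin of error = {margin:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Here the standard error is 1.6/√400 = 0.08, so the sample mean lies 0.2/0.08 = 2.5 standard errors from the null value.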
“P-VALUE” of our sample
Very informally, the p-value of a sample is the probability (hence a number between 0 and 1) that it “agrees” with the null hypothesis. More precisely, it is the probability, if H0 is true, of obtaining a sample mean at least as far from the null value as ours. Hence a very small p-value indicates strong evidence against the null hypothesis. The smaller the p-value, the stronger the evidence, and the more “statistically significant” the finding.

FORMAL CONCLUSIONS:
• The 95% confidence interval corresponding to our sample mean does not contain the “null value” of the population mean, μ = 25.4 years.
• The p-value of our sample, .0124, is less than the predetermined α = .05 significance level.
Based on our sample data, we may (moderately) reject the null hypothesis H0: μ = 25.4 in favor of the two-sided alternative hypothesis HA: μ ≠ 25.4, at the α = .05 significance level.

INTERPRETATION: According to the results of this study, there exists a statistically significant difference between the mean ages at first birth in 2010 (25.4 years old) and today, at the 5% significance level. Moreover, the evidence from the sample data would suggest that the population mean age today is significantly older than in 2010, rather than significantly younger.

Check? The reasonableness of the normality assumption is empirically verifiable, and in fact formally testable from the sample data. If violated (e.g., skewed) or inconclusive (e.g., small sample size), then “distribution-free” nonparametric tests can be used instead of the T-test.
Examples: Sign Test, Wilcoxon Signed Rank Test. (The Mann-Whitney Test is a different, two-sample procedure, equivalent to the Wilcoxon Rank Sum Test.)

Sample size n partially depends on the power of the test, i.e., the desired probability of correctly rejecting a false null hypothesis.

HOWEVER……

~ The Normal Distribution ~
[Figure: bell curve centered at the “population mean” μ, with the central 95% within ≈ 2σ (“population standard deviation”) of μ and 2.5% in each tail]
• Approximately 95% of the population values are contained between μ − 2σ and μ + 2σ.
• Approximately 95% of the sample mean values are contained between μ − 2σ/√n and μ + 2σ/√n.
• Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.

~ The Normal Distribution ~
[Figure: the same bell curve, with the sample standard deviation s substituted for σ]
• Approximately 95% of the population values are contained between μ − 2s and μ + 2s.
• Approximately 95% of the sample mean values are contained between μ − 2s/√n and μ + 2s/√n.
• Approximately 95% of the intervals from x̄ − 2s/√n to x̄ + 2s/√n contain μ, and approx 5% do not.

…IF n is large (n ≥ 30, traditionally).

But if n is small… the multiplier 2 (this “T-score”) increases as n decreases, from ≈ 2 to a max of 12.706 (at n = 2) for a 95% confidence level ⇒ larger margin of error ⇒ less power to reject.
• If n is small, T-score > 2.
• If n is large, T-score ≈ 2.
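How the T-score grows as n shrinks can be tabulated directly. The sketch below uses only the standard library: it integrates the Student's t density numerically (Simpson's rule) and finds the 95% multiplier by bisection. It assumes the usual n − 1 degrees of freedom for a one-sample T-test; the function names, the search bracket, and the step counts are choices made for this illustration, not part of the slides.

```python
import math

def t_central_area(t, df, steps=1000):
    """P(-t < T < t) for Student's t with df degrees of freedom (Simpson's rule)."""
    # Log of the normalizing constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                       # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 2 * (h / 3) * total          # double the area on [0, t] by symmetry

def t_score(n, level=0.95):
    """Multiplier t whose central area equals `level`, for sample size n."""
    df = n - 1                          # assumed degrees of freedom
    lo, hi = 0.0, 100.0                 # search bracket, wide enough for 95%
    for _ in range(50):                 # bisection on the central area
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (2, 3, 6, 11, 30, 400):
    print(f"n = {n:3d}  T-score = {t_score(n):.3f}")
```

The smallest possible sample, n = 2, gives the maximum of 12.706 quoted on the slide, and by n = 30 the multiplier is already close to 2.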