Chapter 2: Inference Using t-Distributions

2.3 A t-Ratio for Two-sample Inference

Scenario: Two independent samples from two normally distributed populations.

• Where does the probability model arise from?
• Why doesn’t the Schizophrenia example fit into this scenario?
• What sampling distribution are we interested in?
• Facts about the sampling distribution of $\bar{Y}_2 - \bar{Y}_1$ from statistical theory:
  1. Center: centered on the difference between the population means
  2. Shape: more nearly normal than the shape of the population distributions
  3. Spread:
     $\mathrm{SD}(\bar{Y}_2 - \bar{Y}_1) = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
• STANDARD ERROR (SE) for the Difference of Two Averages:
  – FACT: Comparing means is reasonable only if all other features of the two distributions are similar.
  – Therefore, start by assuming the two populations have equal SDs ($\sigma_1 = \sigma_2 = \sigma$).
  – Pooled SD ($s_p$):
    ∗ If the population SDs are equal, it makes sense to pool (combine) the sample SDs to get one estimate
    ∗ Use a weighted average of the sample variances (weight = d.f.)
    ∗ Pooled estimate of SD ($s_p$):
      $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$, where d.f. $= n_1 + n_2 - 2$
  – Standard Error for the Difference:
      $\mathrm{SE}(\bar{Y}_2 - \bar{Y}_1) = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where d.f. $= n_1 + n_2 - 2$
• Confidence Interval (CI) for the difference between population means:
  – Parameter of interest:
  – Estimate:
  – Standard error of the estimate:
  – d.f. of the standard error:
  – Form a t-ratio:
  – Distribution of the t-ratio:
  – 100(1 − α)% CI:
  – What factors affect the width of a confidence interval?
    1.
    2.
    3.
• Testing a hypothesis about the difference between population means:
  – Form a t-ratio supposing that the null hypothesis is true (so that we can enter a numerical value for the parameter) → We now call it a t-statistic.
  – Use the t-distribution to evaluate whether the t-statistic (that you calculate from your data) is a likely value for a t-ratio if the null hypothesis is true.
  – Calculating the t-statistic:
  – What does the t-statistic tell us?
  – The p-value for a t-test is the probability, if the null hypothesis is correct, of obtaining a t-ratio as extreme as or more extreme than the observed t-statistic. It measures the evidence against the null hypothesis.
    ∗ Where does the probability model come from that allows us to calculate a p-value?
    ∗ If a p-value is small, there are two possibilities:
      1.
      2.
    ∗ How do we know which of the above is true?
    ∗ The ______ the p-value, the ______ is the evidence that the null hypothesis is incorrect.
    ∗ A large p-value ⟹ the study is not capable of excluding the null hypothesis as a possible explanation. (CANNOT say the null hypothesis is true!) Possible wording: “the data are consistent with the hypothesis being true.”
    ∗ One-sided vs. two-sided p-values:
    ∗ The choice depends on how specifically the researcher can pinpoint the alternative to the null hypothesis.
    ∗ Most important: Always report whether the p-value is one-sided or two-sided!
    ∗ The mechanics of p-value computation using the t-distribution:

2.4 Inferences in a Two-Treatment Randomized Experiment

Scenario: Randomization used to assign units to two groups.

• Where does the probability model arise from?
• Can we still use the t-distribution?
  – p-values and confidence intervals based on the t-distribution are approximations to the correct values calculated from a randomization distribution.
    ∗ Compare the results from the Creativity and Motivation case study when using t-tools vs. the approximate randomization distribution:
• Hypothesis tests:
  – Calculations are the same as for random sampling situations, but the conclusions are phrased differently.
  – For randomized experiments, we now phrase conclusions in terms of treatment effects and causation, instead of differences in population means and association.
  – Test if $\delta = 0$ rather than if $(\mu_2 - \mu_1) = 0$.
• Confidence interval for a treatment effect:
  – Based on t-distribution approximation: Calculations are the same as for the difference between population means.
  – Based on randomization distribution:
    ∗ Use the relationship between a confidence interval and a p-value
    ∗ RULE: Any hypothesized parameter value should be included or excluded from a 100(1 − α)% confidence interval according to whether its test yields a two-sided p-value that is greater than or less than α.
    ∗ How do we implement the rule to make a 95% CI using trial-and-error?
      1. Calculate the p-value for testing δ = c (c is some hypothesized value for the treatment effect)
      2. If the two-sided p-value is ≥ 0.05, then c is included in the CI (i.e., it is considered a plausible value for δ)

2.5 Related Issues

Interpretation of p-values

• How small is small?
• It is difficult and unwise to decide on absolute cutoff points that can be applied to any situation.
• A p-value is NOT the probability of the null hypothesis being correct. The probability arises from uncertainty in the data and not uncertainty in the parameter value.
• Comments on the Rejection Region approach:
  – What is the difference between a p-value of .049 and .051 in terms of degree of evidence against the null hypothesis?
  – What about .048 and .0001?
  – Why is it important to report all p-values?
  – From Display 2.12, which p-values are convincing? moderate? suggestive? not convincing?

Confidence intervals and culmination of evidence:

• Einstein’s general relativity theory example:
• One moral of the story: Theories must withstand continual challenges from skeptical scientists!
  – Study results are typically uncertain.
  – The fact that intervals based on some data fail to include the true value does not disprove general relativity! Theories become inadequate when a theory’s predictions are consistently denied.
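
Worked sketch for Sections 2.3 and 2.4. The pooled t-statistic and the trial-and-error confidence interval described in those sections can be sketched in Python roughly as follows. This is a minimal illustration, not the course's own code. The function names and the data are made up for the example. Two assumptions go beyond the notes: the randomization statistic is the difference in group averages (group 2 minus group 1), and the treatment effect is taken to be additive, so that testing δ = c amounts to subtracting c from every group 2 response and then testing for no effect. The randomization distribution is enumerated exactly, which is only practical for small groups.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(y1, y2, delta0=0.0):
    """Return (t, d.f.) for the hypothesis that the difference (group 2 minus
    group 1) equals delta0, using the pooled SD."""
    n1, n2 = len(y1), len(y2)
    df = n1 + n2 - 2
    # Pooled SD: weighted average of the sample variances (weights = d.f.).
    sp = sqrt(((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / df)
    se = sp * sqrt(1 / n1 + 1 / n2)
    return (mean(y2) - mean(y1) - delta0) / se, df


def randomization_p_value(y1, y2, delta0=0.0):
    """Exact two-sided randomization p-value for delta = delta0.

    Assumes an additive treatment effect and uses the difference in
    averages as the statistic (both are choices made for this sketch).
    """
    n1, n2 = len(y1), len(y2)
    # Remove the hypothesized effect from group 2, then test "no effect".
    adjusted = list(y1) + [y - delta0 for y in y2]
    total = sum(adjusted)
    observed = mean(adjusted[n1:]) - mean(adjusted[:n1])
    extreme = 0
    groupings = 0
    # Enumerate every way of assigning n2 of the units to group 2.
    for idx in combinations(range(n1 + n2), n2):
        s2 = sum(adjusted[i] for i in idx)
        diff = s2 / n2 - (total - s2) / n1
        groupings += 1
        # Small tolerance so ties with the observed value count as extreme.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return extreme / groupings


def trial_and_error_interval(y1, y2, candidates, alpha=0.05):
    """Keep each candidate c whose two-sided p-value is >= alpha and
    return the smallest and largest kept values (None if none are kept)."""
    kept = [c for c in candidates if randomization_p_value(y1, y2, c) >= alpha]
    return (min(kept), max(kept)) if kept else None


# Made-up data, for illustration only (not from any case study).
control = [12.0, 15.5, 11.0, 14.0, 13.5]
treated = [17.0, 19.5, 16.0, 21.0, 18.5]

t_stat, df = pooled_t_statistic(control, treated)
p_rand = randomization_p_value(control, treated)
grid = [c / 4 for c in range(0, 41)]  # candidate effects 0.00, 0.25, ..., 10.00
print("t-statistic:", round(t_stat, 3), "on", df, "d.f.")
print("randomization p-value for delta = 0:", round(p_rand, 4))
print("approximate 95% interval:", trial_and_error_interval(control, treated, grid))
```

The interval is only as fine as the grid of candidate values, and its endpoints are approximate for that reason. With very small groups the smallest attainable p-value can exceed 0.05, in which case every candidate is kept and the interval is uninformative.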