Section 7.1 — Inference for population means, σ unknown Stat 226 – Introduction to Business Statistics I Inference for the population mean µ when the population standard deviation σ is unknown Spring 2009 Professor: Dr. Petrutza Caragea Section A Tuesdays and Thursdays 9:30-10:50 a.m. When the population standard deviation σ is unknown we have to estimate it first based on the collected data using the sample standard deviation s ! " n " 1 $ s=# (xi − x̄)2 n−1 i=1 Chapter 7, Section 7.1 Using s instead of σ adds more variability to the distribution of Introduction to Business Statistics I Section 6.1 √s , so we n need a distribution with heavier tails. Inference for population means, σ unknown Stat 226 (Spring 2009, Section A) x̄−µ The t-distribution accounts for the additional variation by having heavier tails (see graph next page) 1 / 12 Section 7.1 — Inference for population means, σ unknown Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 2 / 12 Section 7.1 — Inference for population means, σ unknown 0.4 0.3 µ (the mean) and σ (the standard deviation) 0.1 0.2 dnorm(data) 0.4 0.3 0.2 The t-distribution is characterized by a single parameter, the so-called 0.0 0.0 0.1 dnorm(data) Recall that the normal distribution is characterized by two parameters: !4 !2 0 2 4 !4 !2 4 0.4 0.3 0.1 0.0 !4 !2 0 2 4 !4 data Stat 226 (Spring 2009, Section A) “degrees of freedom” (short:df) As the degrees of freedom increase, the t-distribution approaches the standard normal distribution N(0, 1) 0.2 dnorm(data) 0.3 0.2 0.0 0.1 dnorm(data) 2 data 0.4 data 0 !2 0 2 4 Why? As the sample size increases, s estimates σ more accurately because we have more information about the population standard deviation in our sample. data Introduction to Business Statistics I Section 6.1 3 / 12 Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 4 / 12 Section 7.1 — Inference for population means, σ unknown Section 7.1 — Inference for population means, σ unknown 0.2 0.0 0.1 dt(data, 5) 0.3 Reading the t-table (Table D) !4 !2 0 2 4 data t-distribution is symmetric it is characterized by the degrees of freedom Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 5 / 12 Section 7.1 — Inference for population means, σ unknown Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 6 / 12 Section 7.1 — Inference for population means, σ unknown Finding critical values t ∗ for a t-distribution (table d) Example: 1 95% percentile of a t-distribution with df=5: 2 the 5% percentile for a t-distribution with df=15 3 Find the quantiles the bound the middle 95% t ∗ is the critical value such that the area to the right (uppertail probability) of t ∗ is equal to 0.05 (or 5%) 1 look down the “df column” (first column on left) to 5 2 at the top of the table, find the right tail (uppertail) probability of 0.05 3 the critical value t ∗ corresponds to where row and column intersect, which is t ∗ = 2.015 Note: t-table works with the area above (right) while z-table works with area below (left) Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 7 / 12 Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 8 / 12 Section 7.1 — Inference for population means, σ unknown Section 7.1 — Inference for population means, σ unknown Example: A random sample of 30 pills yielded a mean level of 20.5 mg of aspirin and a standard deviation of 1.5 mg. Find a 95% confidence interval for the mean level µ of aspirin in a pill. confidence intervals for µ when σ is unknown A (1 − α) · 100% confidence interval for µ is given by % & s x̄ ± t ∗ · √ n just change σ to s and z ∗ to t ∗ look up t ∗ corresponding to a t-distribution with df = n − 1 Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 9 / 12 Section 7.1 — Inference for population means, σ unknown Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 10 / 12 Section 7.1 — Inference for population means, σ unknown What about assumptions for the t-test? simple random sample (ensuring independence of observations) Handout (t-test examples) data following a normal distribution or sufficiently large sample size for the CLT to apply Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 11 / 12 Stat 226 (Spring 2009, Section A) Introduction to Business Statistics I Section 6.1 12 / 12