10.2. Population Mean: Small Sample Case

t-distribution ("Student" t distribution):

This distribution was introduced by W. S. Gosset, who published under the pseudonym "Student". The t-distribution is a family of similar probability distributions, with a specific t distribution depending on a parameter known as the degrees of freedom (d.f.). Let $T(n)$ denote a random variable having the t-distribution with $n$ degrees of freedom.

Example:

$P(T(1) > 3.078) = y \Leftrightarrow y = 0.1$; $P(T(1) > 6.314) = y \Leftrightarrow y = 0.05$;
$P(T(5) > x) = 0.025 \Leftrightarrow x = 2.571$; $P(T(11) > x) = 0.01 \Leftrightarrow x = 2.718$;
$P(T(\infty) > 1.96) = 0.025 = P(Z > 1.96)$.

Note: $T(\infty) \sim N(0,1)$. The t-distribution is symmetric about 0.

$t_{n,\alpha/2}$: a t value with an area of $\alpha/2$ in the upper tail of the t-distribution with degrees of freedom equal to $n$, that is,

$$P\left(T(n) > t_{n,\alpha/2}\right) = \frac{\alpha}{2}.$$

Example:

$P(T(1) > 3.078) = 0.1 = P(T(1) > t_{1,0.1}) \Leftrightarrow t_{1,0.1} = 3.078$
$P(T(5) > 2.571) = 0.025 = P(T(5) > t_{5,0.025}) \Leftrightarrow t_{5,0.025} = 2.571$

Example: For a t distribution with 16 degrees of freedom, find the area, or probability,
(a) to the left of $-1.746$;
(b) between $-1.337$ and $2.120$.

[solution:]

(a) By symmetry, $P(T(16) < -1.746) = P(T(16) > 1.746) = 0.05$.

(b)
$$\begin{aligned}
P(-1.337 < T(16) < 2.120) &= 1 - P(T(16) > 2.120) - P(T(16) < -1.337) \\
&= 1 - 0.025 - P(T(16) > 1.337) \\
&= 1 - 0.025 - 0.1 = 0.875.
\end{aligned}$$

Important Result: When the population has a normal probability distribution and $\sigma$ is unknown, then

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim T(n-1),$$

where

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \qquad S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1},$$

and $X_1, X_2, \ldots, X_n$ are random variables with associated possible values $x_1, x_2, \ldots, x_n$.

Derivation of the $(1-\alpha) \times 100\%$ confidence interval: Suppose the population has a normal probability distribution.
Since

$$\begin{aligned}
P\left(|T(n-1)| \le t_{n-1,\alpha/2}\right) &= 1 - P\left(|T(n-1)| > t_{n-1,\alpha/2}\right) \\
&= 1 - 2P\left(T(n-1) > t_{n-1,\alpha/2}\right) \quad (\because\ T(n-1) \text{ is symmetric}) \\
&= 1 - \alpha,
\end{aligned}$$

thus

$$\begin{aligned}
P\left(|T(n-1)| \le t_{n-1,\alpha/2}\right)
&= P\left(\left|\frac{\bar{X} - \mu}{S/\sqrt{n}}\right| \le t_{n-1,\alpha/2}\right)
= P\left(\left|\frac{\mu - \bar{X}}{S/\sqrt{n}}\right| \le t_{n-1,\alpha/2}\right) \\
&= P\left(-t_{n-1,\alpha/2} \le \frac{\mu - \bar{X}}{S/\sqrt{n}} \le t_{n-1,\alpha/2}\right) \\
&= P\left(-t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu - \bar{X} \le t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right) \\
&= P\left(\bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha.
\end{aligned}$$

$(1-\alpha) \times 100\%$ confidence interval based on t-distribution:

As the population has a normal distribution and $\sigma$ is unknown,

$$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} \equiv \left[\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right]$$

is a $(1-\alpha) \times 100\%$ confidence interval of $\mu$.

Example: Consider the following random sample of 4 observations: 25, 47, 32, 56. Suppose the population is normally distributed. Construct a 95% confidence interval for $\mu$.

[solution:]

$$\bar{x} = \frac{25 + 47 + 32 + 56}{4} = 40, \qquad s = \sqrt{\frac{(25-40)^2 + (47-40)^2 + (32-40)^2 + (56-40)^2}{4-1}} = 14.07.$$

In addition, $\alpha = 0.05$ and $t_{3,0.025} = 3.182$. Thus,

$$\bar{x} \pm t_{3,0.025}\frac{s}{\sqrt{n}} = 40 \pm 3.182\,\frac{14.07}{\sqrt{4}} = \left[40 - 3.182\,\frac{14.07}{\sqrt{4}},\ 40 + 3.182\,\frac{14.07}{\sqrt{4}}\right] = [17.613,\ 62.387].$$

Example: Suppose we have the following data from a normal population:

50 48 55 52 53 46 54 50

Provide a 95% confidence interval for the population mean.

[solution:]

$n = 8$, $\alpha = 0.05$, $\bar{x} = 51$, $s = 3.07$. Then, a 95% confidence interval is

$$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} = \bar{x} \pm t_{7,0.025}\frac{3.07}{\sqrt{8}} = 51 \pm 2.365\,\frac{3.07}{\sqrt{8}} = 51 \pm 2.57 = [48.43,\ 53.57].$$

Online Exercise: Exercise 10.2.1, Exercise 10.2.2
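The two worked confidence intervals in this section can be checked numerically. The following is a minimal Python sketch, not part of the original notes, that uses only the standard library. The lookup table `T_CRIT_025` and the function name `t_confidence_interval_95` are hypothetical helpers introduced for illustration; the critical values $t_{3,0.025} = 3.182$ and $t_{7,0.025} = 2.365$ are copied from the examples rather than computed.

```python
import math
import statistics

# Upper-tail critical values t_{df, 0.025}, copied from the worked examples
# (i.e. from a t table). Hypothetical lookup covering only df = 3 and df = 7.
T_CRIT_025 = {3: 3.182, 7: 2.365}


def t_confidence_interval_95(sample):
    """Return (lower, upper) of the 95% t-based interval for the mean.

    Assumes the sample comes from a normal population with unknown sigma.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation, divisor n - 1
    t_crit = T_CRIT_025[n - 1]     # degrees of freedom = n - 1
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


print(t_confidence_interval_95([25, 47, 32, 56]))
print(t_confidence_interval_95([50, 48, 55, 52, 53, 46, 54, 50]))
```

After rounding, the printed intervals should agree with [17.613, 62.387] and [48.43, 53.57]. The sketch keeps the unrounded value of $s$ throughout, which is how the first example arrives at 17.613 rather than the 17.615 obtained by using $s = 14.07$ exactly.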