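MATH2411 Applied Statistics
Tutorial Notes 5

Define the number \( \alpha := P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true}) \). \( \alpha \) is called the Significance Level of the test.
Define the number \( \beta := P(\text{Type II error}) = P(\text{not reject } H_0 \mid H_0 \text{ is false}) \). \( 1 - \beta \) is called the Power of the test.

Null Hypothesis: \( H_0 : \mu_X - \mu_Y = d_0 \)
Condition: \( \sigma_X, \sigma_Y \) known
Test Statistic:
\[ z_0 = \frac{\bar{x} - \bar{y} - d_0}{\sqrt{\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_Y^2}{m}}} \]
Alternative Hypothesis                    Rejection Criteria
\( H_1 : \mu_X - \mu_Y \neq d_0 \)        \( |z_0| > z_{\alpha/2} \)
\( H_1 : \mu_X - \mu_Y > d_0 \)           \( z_0 > z_{\alpha} \)
\( H_1 : \mu_X - \mu_Y < d_0 \)           \( z_0 < -z_{\alpha} \)

Null Hypothesis: \( H_0 : \mu_X - \mu_Y = d_0 \)
Condition: \( \sigma_X, \sigma_Y \) unknown, \( \sigma_X = \sigma_Y \)
Test Statistic:
\[ t_0 = \frac{\bar{x} - \bar{y} - d_0}{s_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}} \]
Alternative Hypothesis                    Rejection Criteria
\( H_1 : \mu_X - \mu_Y \neq d_0 \)        \( |t_0| > t_{n+m-2,\,\alpha/2} \)
\( H_1 : \mu_X - \mu_Y > d_0 \)           \( t_0 > t_{n+m-2,\,\alpha} \)
\( H_1 : \mu_X - \mu_Y < d_0 \)           \( t_0 < -t_{n+m-2,\,\alpha} \)

Null Hypothesis: \( H_0 : \mu_X - \mu_Y = d_0 \)
Condition: \( \sigma_X, \sigma_Y \) unknown, \( \sigma_X \neq \sigma_Y \)
Test Statistic:
\[ t_0 = \frac{\bar{x} - \bar{y} - d_0}{\sqrt{\dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m}}} \]
Alternative Hypothesis                    Rejection Criteria
\( H_1 : \mu_X - \mu_Y \neq d_0 \)        \( |t_0| > t_{k,\,\alpha/2} \)
\( H_1 : \mu_X - \mu_Y > d_0 \)           \( t_0 > t_{k,\,\alpha} \)
\( H_1 : \mu_X - \mu_Y < d_0 \)           \( t_0 < -t_{k,\,\alpha} \)

Null Hypothesis: \( H_0 : \dfrac{\sigma_X^2}{\sigma_Y^2} = r_0 \)
Condition: \( \mu_X, \mu_Y \) unknown
Test Statistic:
\[ f_0 = \frac{s_X^2}{r_0\, s_Y^2} \]
Alternative Hypothesis                          Rejection Criteria
\( H_1 : \sigma_X^2 > r_0\, \sigma_Y^2 \)       \( f_0 > f_{\alpha}(n-1, m-1) \)
\( H_1 : \sigma_X^2 \neq r_0\, \sigma_Y^2 \)    \( f_0 > f_{\alpha/2}(n-1, m-1) \) or \( f_0 < \dfrac{1}{f_{\alpha/2}(m-1, n-1)} \)
\( H_1 : \sigma_X^2 < r_0\, \sigma_Y^2 \)       \( f_0 < \dfrac{1}{f_{\alpha}(m-1, n-1)} \)

where
\[ S_p^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{n + m - 2} \]
is called the pooled sample variance, and
\[ k = \frac{\left( \dfrac{S_X^2}{n} + \dfrac{S_Y^2}{m} \right)^2}{\dfrac{1}{n-1} \left( \dfrac{S_X^2}{n} \right)^2 + \dfrac{1}{m-1} \left( \dfrac{S_Y^2}{m} \right)^2}. \]

Warm-up (Distribution of Sample Mean)
Check the distribution table and fill in the blanks:

      Distribution                   \( \alpha \)   Critical value
(a)   \( Z \sim N(0, 1) \)           0.025          \( z_{0.025} = \) ____
(b)   \( T \sim t_{7} \)             0.025          \( t_{7,\,0.025} = \) ____
(c)   \( X^2 \sim \chi^2_{9} \)      0.025          \( \chi^2_{9,\,0.025} = \) ____
(d)   \( F \sim F_{5,7} \)           0.05           \( f_{0.05}(5, 7) = \) ____
(e)   \( Z \sim N(0, 1) \)           0.005          \( z_{0.005} = \) ____
(f)   \( T \sim t_{22} \)            0.005          \( t_{22,\,0.005} = \) ____
(g)   \( X^2 \sim \chi^2_{23} \)     0.005          \( \chi^2_{23,\,0.005} = \) ____
(h)   \( F \sim F_{7,5} \)           0.05           \( f_{0.05}(7, 5) = \) ____

Example 1 (Test for equality of population variances and means)
The following are the burning times (in minutes) of candles of two different brands. Assume that all samples are randomly drawn and that the burning times of both brands are normally distributed.

Sample burning time
Brand X   Brand Y
63 82 81 68 57 64 56 72 63 83 59 66 75 82 73 74 59 82 65 82

(a) For \( \alpha = 0.1 \), test if the two populations have the same variance.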
(b) Based on your result in (a), for \( \alpha = 0.1 \), test if the mean burning times of the two brands are equal.

Exercise 1 (Test for difference between population means)
To test whether or not HKUST professors’ average monthly salary is $4000 higher than that of professors from other institutions, a random sample of 50 professors from HKUST is drawn, and it shows that their average monthly salary is $81,750. A random sample of 200 professors from other institutions is also drawn, and it shows that their average monthly salary is $77,500. Test the hypothesis at a 0.05 level of significance, assuming that both HKUST and non-HKUST professors’ monthly salaries follow normal distributions with the same population standard deviation of $5000.

Example 2 (2012 Spring Final Exam)
A recent article in the British journal Lancet reports that babies who were fed by mother’s milk tended to have a higher IQ than formula-fed babies. Suppose that two groups of babies are compared, one group fed by mother’s milk and the other group fed by formula milk powder. The IQ scores are listed below:

IQ Score
Mother Fed   Formula Fed
121 105 111 119 108 101 110 107 98 101 90 131 106 112 103 86 117 113 89 87

Assume IQ scores are normally distributed with population mean \( \mu_X \) and population variance \( \sigma_X^2 \) for mother-fed babies, and population mean \( \mu_Y \) and population variance \( \sigma_Y^2 \) for formula-fed babies. Assume that \( \sigma_X^2 \neq \sigma_Y^2 \).

(a) Construct a 95% confidence interval for the difference between the IQ mean scores \( \mu_X - \mu_Y \).
(b) Hence or otherwise, at a 0.05 level of significance, test \( H_0 : \mu_X = \mu_Y + 25 \) against \( H_1 : \mu_X \neq \mu_Y + 25 \).

A brief summary of course materials:

1. Error Sum of Squares,
\[ SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \]
   Treatment Sum of Squares,
\[ SS_{treat} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x}_{all})^2 \]

2. \( MSE = \dfrac{SSE}{n-k} \) and \( MS_{treat} = \dfrac{SS_{treat}}{k-1} \)

3. \( F = \dfrac{MS_{treat}}{MSE} \), and if \( F > f_{\alpha}(k-1, n-k) \), then the hypothesis \( \mu_1 = \mu_2 = \cdots = \mu_k \) is rejected at significance level \( \alpha \).

Example 3
To study if exam performance is affected by background sound, 12 student volunteers from the MATH 2411 class are randomly assigned to 3 exam rooms to complete the same standardized test in statistics, with 4 students in each exam room. Rock music is played in Room X, light music is played in Room Y, while there is no special background sound in Room Z. The test scores of the 12 students are shown in the following table.

                    Student 1   Student 2   Student 3   Student 4
Group 1, Room X     50          55          45          40
Group 2, Room Y     75          65          60          60
Group 3, Room Z     65          50          65          70

(a) Write down the sample size n = ____
(b) Write down the number of groups k = ____
(c) Write down the number of sampling points in each group. \( n_1 = \) ____, \( n_2 = \) ____, \( \cdots \), \( n_k = \) ____
(d) Calculate the sample mean score of each group, namely \( M_1 = \bar{X} \), \( M_2 = \bar{Y} \) and \( M_3 = \bar{Z} \).
(e) Calculate the Sum of Squares (SS) of each group, namely \( SS_X \), \( SS_Y \) and \( SS_Z \).
(f) Calculate SSE, the Error Sum of Squares, where \( SSE = SS_X + SS_Y + SS_Z \).
(g) Calculate the mean score of all the 12 students, call it \( M \).
(h) Calculate the Treatment Sum of Squares (\( SS_{treat} \)), where
\[ SS_{treat} = \sum_{i=1}^{k} n_i (M_i - M)^2. \]
(i) For \( \alpha = 0.05 \), fill in the following table:

Source      Degree of freedom    Sum of Squares            Mean Sum of Squares                                  F-Value
Error       \( n-k = \) ____     \( SSE = \) ____          \( MSE = \dfrac{SSE}{n-k} = \) ____
Treatment   \( k-1 = \) ____     \( SS_{treat} = \) ____   \( MS_{treat} = \dfrac{SS_{treat}}{k-1} = \) ____    \( F = \dfrac{MS_{treat}}{MSE} = \) ____

\( f_{\alpha}(k-1, n-k) = \) ____

Hence test the Null Hypothesis \( H_0 : \mu_X = \mu_Y = \mu_Z \) at significance level \( \alpha = 0.05 \).

Exercise 2
A study of young children’s mental arithmetic ability is being conducted on native English-speaking, Chinese-speaking, and Italian-speaking pupils, all 8 years old. The list below summarizes the total number of single-digit multiplication questions each of them can answer in a unit period of time.
Group 1, English Speaking    3    6    7    4
Group 2, Chinese Speaking    10   12   11   14   8   6
Group 3, Italian Speaking    8    3    2    5

Conduct an ANOVA test at significance level \( \alpha = 0.05 \) to see whether, on average, pupils from the different language groups have the same level of mental arithmetic ability.

n = ____ , k = ____ ; \( n_{eng} = \) ____ , \( n_{chi} = \) ____ , \( n_{ita} = \) ____
\( \bar{x}_{eng} = \) ____ , \( \bar{x}_{chi} = \) ____ , \( \bar{x}_{ita} = \) ____ , \( \bar{x}_{all} = \) ____
\( SSE = \) ____
\( SS_{treat} = \) ____
\( MSE = \dfrac{SSE}{n-k} = \) ____
\( MS_{treat} = \dfrac{SS_{treat}}{k-1} = \) ____
\( F = \dfrac{MS_{treat}}{MSE} = \) ____
\( f_{\alpha}(k-1, n-k) = \) ____
∴ the conclusion is: that we reject \( H_0 \) at significance level \( \alpha = 0.05 \).

Example 4
The following shows the number of subjects with credits for MATH major students in the 2012-2013 Fall Semester.

(a) Fill in the given table.

No. of subjects with credits    0    1    2    3    4    5    6    7    8    Total
Credits no. × boy freq.
Frequency, Boy                  17   22   12   6    4    16   11   7    5
Frequency, Girl                 4    8    12   15   22   17   10   5    2
Credits no. × girl freq.

(b) Conduct an ANOVA test at significance level \( \alpha = 0.01 \) to see if boys and girls have the same number of subjects with credits.

\( SSE = SS_{boy} + SS_{girl} = \) ____
∴ \( MSE = \) ____
\( SS_{treat} = \) ____
∴ \( MS_{treat} = \) ____
∴ \( F = \) ____
while \( f_{\alpha}(k-1, n-k) = \) ____

(Answers will be available at http://ihome.ust.hk/~makittylee)
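
For checking hand calculations, the one-way ANOVA quantities from the course summary (SSE, SStreat, MSE, MStreat and F) can be computed with a short script. The following is only a minimal sketch in plain Python: the function name `one_way_anova` and the dictionary keys are hypothetical choices made for illustration and are not part of the course materials. It is run here on the Room X, Y, Z scores of Example 3. The critical value \( f_{\alpha}(k-1, n-k) \) is not computed and still has to be read from the F table.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples.

    `groups` is a list of lists, one inner list per treatment group.
    The name and the returned keys are illustrative assumptions.
    """
    n = sum(len(g) for g in groups)   # total sample size
    k = len(groups)                   # number of groups

    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Error Sum of Squares: squared deviations from each group's own mean
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    # Treatment Sum of Squares: group means against the overall mean
    ss_treat = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))

    mse = sse / (n - k)
    ms_treat = ss_treat / (k - 1)

    return {
        "n": n,
        "k": k,
        "group_means": group_means,
        "grand_mean": grand_mean,
        "SSE": sse,
        "SStreat": ss_treat,
        "MSE": mse,
        "MStreat": ms_treat,
        "F": ms_treat / mse,
    }


# Scores from Example 3 (Room X, Room Y, Room Z)
rooms = [
    [50, 55, 45, 40],
    [75, 65, 60, 60],
    [65, 50, 65, 70],
]

result = one_way_anova(rooms)
for name, value in result.items():
    print(name, "=", value)
```

The null hypothesis is rejected when the printed F exceeds the tabulated \( f_{\alpha}(k-1, n-k) \). If SciPy happens to be installed, `scipy.stats.f.ppf(1 - alpha, k - 1, n - k)` should give the same critical value as the table.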