Module H2 Practical 11
Tests on population variances

Objectives: By the end of this practical you should be able to:
- conduct a chi-square test for the variance.
- interpret the p-value from the test.
- explain the implication of increasing the sample size.
- conduct an F-test for comparing two variances using Excel.

1. A market vendor in a district in Western Uganda regularly purchases bunches of bananas from a small-scale commercial farmer. In the past, he found that the variability in bunch weights was low, with a standard deviation of 3.5 kg. After a period when the farm suffered from an attack of banana streak virus, the vendor felt that the variation in bunch weights had increased. He measured 25 bunches selected at random over a period of three months and kept a record of their weights. The sample standard deviation was found to be 4.7 kg.

Is there evidence that the variance in weights has increased? Set up null and alternative hypotheses for this problem and carry out the test using Excel facilities to do “hand” calculations. You may obtain the exact p-value for a chi-square test statistic by using the Excel function CHIDIST(value, df), which returns the upper-tail probability.

SADC Course in Statistics Module H2 Practical 11 – Page 1

2. Open the file H2_Practical11.xls. The null and alternative hypotheses are given in the worksheet Variances. For fixed values of the population variance under the null hypothesis and of the sample variance, the aim is to investigate the effect of increasing the sample size on the test results. Change the sample size in cell D2 using n = 9, 15, 30, 45, 100 and 200, and record your results in the table below:

Sample size    Chi-square statistic    p-value    Decision
9
15
30
45
100
200

(a) Complete the final column of the table above. Then describe what you observe and explain it.

(b) If the null hypothesis were true, and real samples were taken, how would you expect s² to change as the sample size increases? Explain what implication this would have for your answer to Part (a).
(c) When the sample size is 9, change the value of s² until you obtain a value for which you can reject the null hypothesis. What is the smallest value of s² for which you can reject the null hypothesis at the 5% significance level when the sample size is 9?

(d) Observe the Excel formulae used to calculate the chi-square statistic and the p-value. Can you make sense of the formulae? Note them down below for future reference.

Formula for chi-square statistic ……………………………………………………
Formula for the p-value ……………………………………………………………

3. In this example you will conduct an F-test to determine whether the total annual rainfall (measured in millimetres) at a particular location, namely a privately owned large-scale farm in Monze in Southern Zambia, has been more variable in recent times (1984-2003) than in a comparable period in the past (1922-1941). The data for these two periods are available in the worksheet named ZambiaRain in the file H2_data.xls.

(a) First set up a null hypothesis and an alternative hypothesis to answer the question of interest.

(b) Next carry out the F-test. This may be done in Excel, using the instructions given below.

Click Tools, Data Analysis and then F-Test Two-Sample for Variances. The following box will appear. Enter the data ranges as shown below. A1-A21 is for the period 1922-1941 and B1-B21 is for the period 1984-2003. The tick on Labels means that the first entry in each case is a variable name.

Report your results in the following table.

               old22-41    new84-03
Mean
Variance

F-statistic =              with d.f. = ( , )
p-value =

(c) Interpret the results above and note down your conclusions from the F-test.

4.
IF YOU HAVE TIME, TRY ALSO THE FOLLOWING: As part of a health survey, cholesterol levels of men in a small rural area were measured, including those working in agriculture and those employed in non-agricultural work. The main objective was to see whether cholesterol levels differed between the two groups. This objective is addressed by conducting a t-test (covered in the next session). An assumption underlying this test is that the two groups have equal population variances. Carry out a suitable test to check this assumption. The data are in the worksheet agricoles in the file H2_data.xls. Note down your conclusions.
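
The “hand” calculation asked for in Question 1 can also be cross-checked outside Excel. The following is a minimal Python sketch and is not part of the original practical. It assumes the usual test statistic (n - 1)s²/σ₀² with n - 1 degrees of freedom and a one-sided alternative that the variance has increased. The function names chi_square_upper_tail and variance_test are hypothetical. The upper-tail probability, which plays the role of CHIDIST(value, df), uses a closed form that is valid only for even degrees of freedom (24 in Question 1).

```python
import math


def chi_square_upper_tail(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Assumption: df is a positive even integer, so the closed form
    exp(-x/2) * sum_{i < df/2} (x/2)^i / i! applies.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # the i = 0 term, (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! from the previous term
        total += term
    return math.exp(-half) * total


def variance_test(n, sample_sd, null_sd):
    """One-sided test of H0: sigma^2 = null_sd^2 against H1: sigma^2 > null_sd^2."""
    df = n - 1
    statistic = df * sample_sd ** 2 / null_sd ** 2
    p_value = chi_square_upper_tail(statistic, df)
    return statistic, df, p_value


# Values from Question 1: n = 25 bunches, s = 4.7 kg, sigma_0 = 3.5 kg.
statistic, df, p_value = variance_test(25, 4.7, 3.5)
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The printed p-value can be compared with the chosen significance level in the same way as the Excel result. For odd degrees of freedom this closed form does not apply, and a library routine for the chi-square distribution would be needed instead.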