SADC Course in Statistics
Module H6 Practical 8&9
“To the Woods” - a sampling game

1. Background

Your aim is to conduct a small survey to estimate the total number of trees in a forest and the proportion of large trees. (For this exercise, 'large' means greater than 30 cm diameter at breast height, d.b.h.) There are two alternative sampling schemes, A and B, which are described below. You should work in pairs and carry out both schemes.

The area of forest from which the sample is to be taken is conveniently rectangular. It consists of one region on each side of the river. Within each region it is possible to count the number of trees in any 50 m × 50 m strip. There are altogether 168 such strips, 96 to the West of the river and 72 to the East.

2. Method

Work in pairs to enumerate a sample of 14 strips from the forest. Some properties of the forest are then to be estimated. The main objectives are:

To estimate the total number of trees in the forest.
To estimate the proportion of large trees in the forest.

(a) Two sampling schemes are being considered.

Scheme A
Choose 14 strips at random from the whole forest. This corresponds to choosing 14 numbers at random between 1 and 168 and taking the corresponding strips.

Scheme B
Choose 8 strips at random from the Western Region (i.e. random numbers between 1 and 96) and 6 strips at random from the Eastern Region (i.e. further random numbers between 1 and 72).

To choose the random numbers, use the statistical tables provided.

(b) Design a form for the data. This should consist of columns for
(i) Strip number
(ii) Number of small trees (x)
(iii) Number of large trees (y)
(iv) Overall number of trees (z), where z = x + y
There should be enough space for the 14 observations and for summary statistics.

(c) Evaluate the results using the method given below. Sections 3 and 4 describe the calculations for the two sampling schemes if you require more help.
(d) Try the “further questions” given in the last section. This should be done jointly in groups of four (two of the original pairs together), because most of the topics benefit from a good discussion.

3. Sampling Scheme A - analysis
(for those who collected 14 strips at random from the whole forest)

(i) Evaluate the mean number of large trees per strip, $\bar{y}$, and the mean overall number of trees per strip, $\bar{z}$.
(ii) Estimate the proportion of large trees in the forest.
(iii) Estimate the total number of trees in the forest. (Note: as there are 168 strips, the estimate is just $168\bar{z}$.)
(iv) What is the standard error of your estimate in (iii)?

4. Sampling Scheme B - analysis
(for those who collected data from 8 strips in the Western Region and 6 in the Eastern Region)

(i) Evaluate the mean number of large trees per strip, $\bar{y}$, and the mean overall number of trees per strip, $\bar{z}$.
(ii) Estimate the proportion of large trees in the forest.
(iii) Estimate the total number of trees in each region. (Note: the Western Region has 96 strips, so its estimate is $96\bar{z}_W$, where $\bar{z}_W$ is the mean overall number of trees per strip in the 8 Western strips; similarly, the Eastern Region has 72 strips, so its estimate is $72\bar{z}_E$.) Hence estimate the total number of trees in the forest (this is just the sum of the estimates for the two regions).
(iv) Can you find the standard error of your estimate in (iii)?

5. Further questions

(a) When you compare your estimates with someone else's, you will almost certainly find that theirs are different but of the same order of magnitude. The exception is the standard error in (iv), where the value under sampling scheme A is almost certainly much larger than the value under scheme B. Explain why this is so.

(b) There are two alternative ways of estimating the proportion of large trees in the forest. What are they? Which is better?

(c) If you used sampling scheme B but then ignored this in the analysis (i.e. analysed the data as though you had collected it according to scheme A), you would get the same results for (i), (ii) and (iii) as with the proper analysis for scheme B. Why? Will this always happen?

(d) Explain what is wrong with the following argument about the proportion of large trees.
"Trees are classified as either 'large' or 'small' - just like coin tossing where you have 'heads' or 'tails'. Therefore the total number of 'large' trees in our sample is binomially distributed and hence the properties of the binomial distribution can be used to give confidence intervals for the proportion of large trees in the population."
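
For anyone who wants to check their hand calculations from Sections 3 and 4, the following Python sketch shows one way the totals and standard errors might be computed. It is an illustration under stated assumptions, not the prescribed method: the strip counts are invented, the function names are hypothetical, the standard errors assume strips are drawn without replacement (so a finite population correction is applied), and the proportion of large trees is taken as total large trees divided by total trees, which is only one of the possible estimators.

```python
import math
from statistics import mean, variance


def srs_total(counts, n_strips):
    """Estimated total and its standard error from a simple random sample.

    Assumes strips were drawn without replacement, so a finite population
    correction (1 - n/N) is applied; the handout does not prescribe this.
    """
    n = len(counts)
    estimate = n_strips * mean(counts)
    fpc = 1 - n / n_strips
    se = n_strips * math.sqrt(fpc * variance(counts) / n)
    return estimate, se


def stratified_total(strata):
    """Estimated total and standard error for independently sampled regions.

    `strata` is a list of (counts, n_strips_in_region) pairs.
    """
    results = [srs_total(counts, n_strips) for counts, n_strips in strata]
    estimate = sum(est for est, _ in results)
    # Regions are sampled independently, so their variances add.
    se = math.sqrt(sum(se ** 2 for _, se in results))
    return estimate, se


def proportion_large(large, overall):
    """Proportion of large trees as total large / total overall (an assumed choice)."""
    return sum(large) / sum(overall)


# Invented counts of all trees (z) per sampled strip, for illustration only.
west_z = [31, 28, 35, 30, 27, 33, 29, 32]   # 8 strips, Western Region
east_z = [12, 15, 10, 14, 11, 13]           # 6 strips, Eastern Region
# Invented counts of large trees (y) in the same strips.
west_y = [6, 5, 8, 7, 4, 6, 5, 7]
east_y = [3, 4, 2, 3, 2, 3]

# Scheme B analysis: each region is treated as its own simple random sample.
total_b, se_b = stratified_total([(west_z, 96), (east_z, 72)])

# The same 14 counts analysed as if they were one Scheme A sample.
total_a, se_a = srs_total(west_z + east_z, 168)

print(f"Scheme B analysis: total = {total_b:.0f}, s.e. = {se_b:.1f}")
print(f"Scheme A analysis: total = {total_a:.0f}, s.e. = {se_a:.1f}")
print(f"Proportion large:  {proportion_large(west_y + east_y, west_z + east_z):.3f}")
```

Replacing the invented lists with the counts from your own data form lets you compare the two analyses side by side before discussing the further questions.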