WEM05 Quantitative Techniques for Water & Environmental Management

School of Environment & Technology
Semester 1 Examinations January/February 2013

WEM05 QUANTITATIVE TECHNIQUES FOR WATER & ENVIRONMENTAL MANAGEMENT

Instructions to Candidates:
Time allowed: TWO hours
Answer ALL questions in Section A (60) and TWO from FOUR in Section B (40)
Note that the questions in Section A do not all carry equal marks.
Special requirements: Statistical Tables, Mathematical formulae (Bird & May)
Items permitted: Any approved calculator, One A4 sheet of notes
Calculators may be used provided they are battery-operated, silent and not preprogrammed.

21st January – 1st February 2013

SECTION A
Answer ALL questions in this section.
Note: Questions in this section do NOT carry equal marks

Question 1 (13 marks)
In a study into the effects of change of land use on the lakes in northern Wisconsin, measurements were recorded on the watershed area (square kilometres) and lake area (hectares), of 53 lakes. Minitab was used to analyse the results of the study and the output is given below:

[Figure: scatterplot of Lake Area against Wshed]

The regression equation is
Lake Area = - 46.7 + 176 Wshed

(i) Interpret the value 176 in the regression equation. (2)
(ii) Use the following graphs to check the validity of the linear model as a method of predicting lake area. Comment on your findings. (5)

[Figure: Residuals Versus the Fitted Values (response is Area); axes: Fitted Value, Residual]
[Figure: Normal Probability Plot of the Residuals (response is Area); axes: Residual, Normal Score]

(iii) Calculate estimates of the area of a lake with the following watersheds and comment on their reliability
(a) 1.5 square kilometres (3)
(b) 3.5 square kilometres (3)

Question 2 (5 marks)
In an investigation into environmental causes of disease, the data collected included the annual mortality rate per 100,000 males for 61 large towns in England and Wales, and whether the towns were in the north (N) or the south (S). An analysis of these data using Minitab produced the following output.

Two-sample T for Mortality/100000

Region   N    Mean   StDev   SE Mean
N        35   1634   137     23
S        26   1377   140     28

Difference = mu (N) - mu (S)
Estimate for difference: 256.8
T-Test of difference = 0 (vs >): T-Value = 7.17  P-Value = 0.000  DF = 59
Both use Pooled StDev = 138

(i) Write down H0 and H1 for this test. (2)
(ii) State when we should use a one-tail two-sample t-test and when we should use a two-tail two-sample t-test. (3)

Question 3 (12 marks)
A certain type of soil was determined to have a natural pH of 8.75. Ten samples of this soil were treated with an organic fertiliser and the resulting pH measurements were as follows:

8.8  8.7  8.3  8.4  8.1  8.2  8.6  8.5  8.3  8.7

Some analysis was carried out using Minitab and the results are shown below.

Variable   N    Mean     StDev    SE Mean
pH         10   8.4600   0.2366   0.0748

(i) Calculate the 95% confidence interval for the mean pH of the soil after treatment and state the assumption that is being made in calculating it. (5)
(ii) Give an interpretation of the confidence interval and explain what conclusion you can draw about the effect of the fertiliser. (5)
(iii) Without doing any calculations, state, giving a reason, what the effect would be on the confidence interval of using
(a) a larger sample size
(b) a 99% level of confidence (2)

Question 4 (6 marks)
Researchers in the Adirondack Mountains collect data on a random sample of streams each year. One of the variables recorded is the substrate of the stream: the type of soil and rock over which it flows. The researchers found that 69 of the 172 sampled streams had a substrate of shale.
Calculate a 95% confidence interval for the proportion of Adirondack streams with a shale substrate. Explain and interpret this interval to a non-statistician. (6)

Question 5 (8 marks)
The water louse, Asellus, can live in polluted oxygen-poor water. By pumping water over the gills, gill movements can be counted. The number of gill movements per minute of Asellus specimens in stagnant water was compared with that for specimens living in an aquarium in oxygen-rich water, to assess if they differed in the two types of water. Minitab was used to analyse the results and the output is given below:

Mann-Whitney Confidence Interval and Test

Stagnant      N = 7    Median = 49.00
Oxygen-rich   N = 10   Median = 43.5
Point estimate for ETA1 - ETA2 is 5.00
95.5% CI for ETA1-ETA2 is (0.001,10.002)
W = 85.5
Test of ETA1 = ETA2 vs ETA1 not= ETA2 is significant at 0.0318

(i) Write down H0 and H1 for this test. (2)
(ii) State, giving your reasoning, your decision about H0. State the conclusions that can be reached from the analysis. (6)

Question 6 (10 marks)
Chemical and manufacturing plants often discharge toxic waste materials into nearby rivers and streams. These toxicants have a detrimental effect on the plant and animal life inhabiting the river and the river bank.
Measurements are taken of a particular contaminant level, L, (parts per million) along a stream at varying distances x (metres) downstream from a fixed point (0 ≤ x ≤ 20). It is found that L and x are related by the equation:

L 30 1 x 3 50x 2 600x 100

(i) Find an expression for the rate of change of level of contaminant with respect to the distance x. (6)
(ii) Find the level of contaminant and the rate of change of level in the stream, with respect to the distance x=1. (4)

Question 7 (6 marks)
An accident at a chemical plant results in a spillage of chemicals into a river. The level of contamination depends on the distance x (metres) downstream. A clean-up operation commences and the rate of change of contamination (with respect to distance) is given by:

4 x r cos 5 5

for 0 ≤ x ≤ 20. The level of contaminant at position x = 0 is 5 ppm.
Find an expression relating the levels of contamination and distance x. (6)

SECTION B
Answer TWO questions from this section.
Note: each question in this section carries 20 marks

Question 8
In a study into the levels of the groundwater contaminant methyl tert-butyl ether (MTBE) in the water supplies of New Hampshire, data were collected from a sample of 223 wells. Each well was classified according to ownership (private or public) and detectable levels of MTBE (below limit or detectable) and the results are shown in the table below.

                   Well ownership
Levels of MTBE     Private   Public
Below limit        81        72
Detectable         22        48

(i) State the type of data that is to be analysed. (1)
(ii) Carry out a suitable hypothesis test to decide if the data provide evidence of an association between the well ownership and the levels of MTBE in the supply. State clearly the hypotheses and the conclusions. (9)
(iii) What are the constraints of the test used in part (ii)? (2)
(iv) Calculate a 95% confidence interval for the proportion of private wells with detectable levels of MTBE and explain its meaning. (5)
(v) The proportion of public wells with detectable levels of MTBE has a 95% confidence interval of (31.2%, 48.8%). Discuss how it relates to parts (i) and (ii). (3)

Question 9
An investigation was carried out to see whether soap kills bacteria. Two solutions were prepared, one containing soap and the other a control solution of sterile water. Each solution was placed on ten petri dishes and E. coli bacteria were added. The dishes were incubated for 24 hours and the number of bacteria colonies on each dish was counted. Minitab output from analysing the data is given below:

[Figure: Boxplot of Soap, Control; vertical axis: Data]

Two-Sample T-Test and CI: Soap, Control

Two-sample T for Soap vs Control

          N    Mean    StDev   SE Mean
Soap      10   66.00   6.06    1.9
Control   10   74.00   8.37    2.6

Difference = mu (Soap) - mu (Control)
Estimate for difference: -8.00000
95% CI for difference: (-14.86158, -1.13842)
T-Test of difference = 0 (vs not =): T-Value = -2.45  P-Value = 0.025  DF = 18
Both use Pooled StDev = 7.3030

(i) Making use of the boxplots only, compare the bacterial count for the control and the soap. (3)
(ii) State the null and alternative hypotheses that have been tested. (2)
(iii) State the assumptions made in the hypothesis test. Show how one of these assumptions can be justified from the data. (4)
(iv) Give a reasoned decision about the hypotheses and state your conclusions clearly. (4)
(v) Write down and interpret fully the confidence interval that is given in the Minitab output. (5)
(vi) Would the 99% confidence interval be wider or narrower than the interval given? Justify your answer. (2)

Question 10
The pH of soil specimens taken from each of four sites in an area is given below

Site   pH
1      7.00   7.00   7.10   7.10   7.05   7.50   7.40
2      8.30   8.45   8.00   8.00   8.20   7.85   8.05
3      7.85   8.00   8.15   8.15   8.00   7.65   7.20
4      7.05   7.70   7.30   7.75   7.80   7.70   7.65

Minitab was used to analyse pH for the four sites.

[Figure: Boxplot of Site1, Site2, Site3, Site4; vertical axis: Data]

Results for: ONEWAYANOVA.MTW

Grouping Information Using Tukey Method

        N   Mean     Grouping
Site2   7   8.1214   A
Site3   7   7.8571   A B
Site4   7   7.5643     B
Site1   7   7.1643       C

Means that do not share a letter are significantly different.

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons
Individual confidence level = 98.90%

Site1 subtracted from:
        Lower     Center    Upper
Site2    0.5702    0.9571    1.3441
Site3    0.3059    0.6929    1.0798
Site4    0.0131    0.4000    0.7869

Site2 subtracted from:
        Lower     Center    Upper
Site3   -0.6512   -0.2643    0.1226
Site4   -0.9441   -0.5571   -0.1702

Site3 subtracted from:
        Lower     Center    Upper
Site4   -0.6798   -0.2929    0.0941

(i) State the assumptions made in this analysis. (3)
(ii) Without referring to any hypothesis test, describe what the box-plots tell you about the data. (3)
(iii) Copy and complete the following ANOVA table in your answer book. (5)

Source   DF   SS       MS   F   P
Factor   3    3.5388   ?    ?   ?
Error    ?    1.6536   ?
Total    27   5.1924

(iv) State H0 and H1 for this study. Use the completed ANOVA table to draw a conclusion about H0. (5)
(v) Using the Tukey test results given above, carry out pairwise comparisons for the pH values in the four sites. (4)

Question 11
Meteorological conditions have been shown to have some effect on levels of air pollution. To investigate this relationship, the maximum daily levels of a particular oxidant (a photochemical pollutant) were measured for 30 days during one summer. In addition, the morning averages of four meteorological variables were measured: wind speed, temperature, humidity and insolation (a measure of the amount of sunlight).
Minitab has been used to analyse the data and fit a multiple regression model to the data. Use the output provided below to answer the questions.

(i) Use the matrix plot to comment on the applicability of a multiple regression model for these data. (4)

[Figure: matrix plot of Wind Speed, Temperature, Humidity, Insolation and Oxidant]

(ii) In regression model 1, all four meteorological variables have been included in the regression model. Minitab indicates that there are two unusual observations. What do the R and X beside these observations indicate? What implications might these unusual observations have for the regression model? (5)

Regression Analysis: Oxidant versus Wind Speed, Temperature, ...
The regression equation is
Oxidant = - 15.5 - 0.443 Wind Speed + 0.569 Temperature + 0.0929 Humidity + 0.0228 Insolation

Unusual Observations

Obs   Wind Spe   Oxidant   Fit      SE Fit   Residual   St Resid
11    47.0       11.000    17.586   0.671    -6.586     -2.32R
23    65.0       4.000     0.425    2.170    3.575      1.83 X

[Figure: Residuals Versus the Fitted Values (response is Oxidant); axes: Fitted Value, Standardized Residual]

(iii) The stepwise regression procedure in Minitab has been used to decide which variables to include in the multiple regression model. Write out the fitted regression model that you think best describes the data, and use the Minitab output to justify your model choice. (8)

Stepwise Regression: Oxidant versus Wind Speed, Temperature, ...

Alpha-to-Enter: 0.15  Alpha-to-Remove: 0.15
Response is Oxidant on 4 predictors, with N = 30

Step        1        2        3
Constant    45.317   -5.203   -16.607

Wind Spe    -0.633   -0.427   -0.446
T-Value     -6.30    -4.94    -5.24
P-Value     0.000    0.000    0.000

Temperat             0.52     0.60
T-Value              4.81     5.12
P-Value              0.000    0.000

Humidity                      0.098
T-Value                       1.56
P-Value                       0.131

S           3.95     2.95     2.87
R-Sq        58.63    77.73    79.64
R-Sq(adj)   57.15    76.08    77.29
C-p         25.2     3.6      3.2

(iv) As a final check, the standardized residuals for the model in part (iii) were plotted against day. What does the graph indicate? Provide a brief explanation for this pattern. (3)

[Figure: Residuals Versus Day (response is Oxidant); axes: Day, Standardized Residual]

Formulae

1. A 95% confidence interval for population mean is given by:

$\bar{x} - t\,\dfrac{s}{\sqrt{n}}$ to $\bar{x} + t\,\dfrac{s}{\sqrt{n}}$

where t is from a t-distribution.

2. For tests of association, use the test statistic:

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

where O is the observed frequency and E is the expected frequency.

3. Useful formulae for integration and differentiation:

$\dfrac{d}{dx}(\text{constant}) = 0$
$\dfrac{d}{dx}(x^n) = n x^{n-1}$
$\dfrac{d}{dx}(e^{ax}) = a e^{ax}$
$\dfrac{d}{dx}(\ln ax) = \dfrac{1}{x}$
$\dfrac{d}{dx}(\sin ax) = a \cos ax$
$\dfrac{d}{dx}(\cos ax) = -a \sin ax$

$\int t^n \, dt = \dfrac{t^{n+1}}{n+1} + c, \quad n \neq -1$
$\int \dfrac{1}{t} \, dt = \ln |t| + c$
$\int e^{at} \, dt = \dfrac{1}{a} e^{at} + c$
$\int (at + b)^n \, dt = \dfrac{(at + b)^{n+1}}{a(n+1)} + c, \quad n \neq -1$
$\int \sin(at) \, dt = -\dfrac{1}{a} \cos(at) + c$
$\int \cos(at) \, dt = \dfrac{1}{a} \sin(at) + c$
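Formulae 1 and 2 can be checked numerically with a short sketch such as the one below. It is an illustration only: the sample values and the 2 x 2 table are hypothetical and are not taken from any question on this paper, the function names are invented for the example, and the t value (2.776 for 4 degrees of freedom at the 95% level) is assumed to have been read from statistical tables. Expected frequencies are computed in the usual way for a test of association, as row total times column total divided by grand total.

```python
import math


def mean_confidence_interval(sample, t_value):
    """Formula 1: x_bar - t*s/sqrt(n) to x_bar + t*s/sqrt(n)."""
    n = len(sample)
    x_bar = sum(sample) / n
    # Sample standard deviation with divisor n - 1.
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    half_width = t_value * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


def chi_square_statistic(observed):
    """Formula 2: sum of (O - E)^2 / E over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency if rows and columns are independent.
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    return chi2


# Hypothetical data, for illustration only.
sample = [4.8, 5.1, 5.0, 4.9, 5.2]
T_95_DF4 = 2.776  # from t tables: 95% two-sided, 4 degrees of freedom
low, high = mean_confidence_interval(sample, T_95_DF4)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")

table = [[10, 20], [20, 10]]  # hypothetical observed frequencies
chi2 = chi_square_statistic(table)
print(f"Chi-square statistic: {chi2:.3f}")
```

The t value is passed in as an argument so that the sketch depends only on the standard library and mirrors the use of printed statistical tables in the examination.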