GG413 Geological Data Analysis
Homework #6: Non-Parametric Tests
Due Tue. October 7
Reading: Sections 2.3.2-2.3.4

The same requests about explaining your answers, writing out your steps, and labeling plots apply.

1) An environmental study team is investigating possible contamination from a phosphate plant in an intermontane basin of a western U.S. state. The prevailing winds in the basin blow consistently in the same direction, and there is concern that airborne particulate fluorides have been carried over pastures, where they may have contaminated grass that is being consumed by grazing cattle. Urinary fluoride concentrations (ppm) were collected from ten cows in a pasture downwind of the plant and from six cows in a similar pasture upwind of the plant:

x1 = [22.5 18.3 19.0 20.9 23.4 16.6 18.5 20.2 21.8 18.6] (downwind)
x2 = [16.4 17.2 18.4 18.7 17.2 13.3] (upwind)

From these data you are asked to determine whether the plant’s emissions are enough to alter the fluoride content in the urine of the grazing cattle. Use Matlab for this problem, and include a copy of your script as supplemental material only.

(a) State the null and alternative hypotheses. In “subplot(311)” plot a histogram of sample 1 (downwind), and in “subplot(312)” plot a histogram of sample 2 (upwind). To facilitate a more direct visual comparison, use “axes” to ensure the axes span the same range of values in the two plots.

(b) Use the Mann-Whitney test at the α = 0.05 level of significance to test the null hypothesis. Hint: “>>[xsort I]=sort(x)” will return a sorted version of vector x as well as a vector I of the original indices (in vector x) of the sorted values. If you create a vector “d” containing 1’s and 2’s to identify the original source of each value in vector “x”, then “>>dsort=d(I)” will give the source of each value in the sorted array, xsort. “>>k=(dsort==2)” will create a vector k of zeros and ones, with the ones marking where dsort equals 2.
Multiplying k element by element with a vector of ranks and then summing will give you a rank sum. See also “>>help ranksum” (or the Mann-Whitney table from the website).

(c) After doing this initial test, you are handed two measurements from two new cows in the upwind pasture. This increases the number of data points in x2 from 6 to 8. In subplot(313) plot a histogram of the new augmented sample x2. Identify two values for these new measurements that would lead you to the opposite conclusion from the one in (b).

(d) Briefly discuss what you see in your three plots in the context of your results in (b)-(c).

2) Permeability has been measured in the lab for samples taken from two different sandstone outcrops. You wish to determine whether the outcrops differ using the Kolmogorov-Smirnov test. The measurements are (in units of Darcy):

sample from outcrop A: [18.76 13.24 3.83 10.12 8.40 8.60 18.18 15.04 9.62 13.22]
sample from outcrop B: [15.90 5.80 1.31 17.50 9.22 6.20 1.92 13.46 2.61 8.01]

(a) State the null and alternative hypotheses.

(b) Perform the test by using the Matlab script “kolsmir.m” and the two-sample critical values in the table from the website. Use the 95% confidence level.

(c) Plot histograms of the two datasets on the same set of axes, as well as the plot produced by “kolsmir.m”, and briefly discuss what you see.

3) The textural properties of sandstones are usually considered to reflect environmental conditions at the time of formation. Textural maturity is defined as the degree to which a sand is both well sorted and well rounded. These two characteristics are expected to be correlated. However, roundness and degree of sorting are not quantified on a ratio scale but are assigned ordinal values such as “moderately sorted” or “well sorted”. Hence, we can rank the assessments but cannot compute numerical statistics from them. Based on the table below, is there a significant (95% level) correlation between roundness and degree of sorting?
(Sorting: P = poor, M = moderate, W = well sorted. Roundness: A = angular, SA = subangular, SR = subrounded.)
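
Illustration of the hint in problem 1(b): the sort-and-index steps can be sketched on a small made-up example. This is only a minimal sketch, assuming two hypothetical samples with no tied values. The names xa, xb, ranks, W1, and W2 and all of the numbers are chosen for illustration and are not the homework data.

```matlab
% Toy illustration of the sort/index hint. xa and xb are made-up values,
% not the homework data, and they contain no ties.
xa = [5.1 7.3 6.2];          % hypothetical sample 1
xb = [4.0 6.8];              % hypothetical sample 2

x = [xa xb];                                      % pooled data
d = [ones(1, numel(xa)) 2*ones(1, numel(xb))];    % source labels (1 or 2)

[xsort, I] = sort(x);        % ascending values and their original indices
dsort = d(I);                % source label of each sorted value
ranks = 1:numel(x);          % valid as ranks only because there are no ties

k  = (dsort == 2);                 % ones where the sorted value came from sample 2
W2 = sum(k .* ranks);              % rank sum of sample 2
W1 = sum((dsort == 1) .* ranks);   % rank sum of sample 1

% Sanity check: the two rank sums must add up to n(n+1)/2.
n = numel(x);
assert(W1 + W2 == n*(n+1)/2);

fprintf('dsort = %s\n', mat2str(dsort));
fprintf('W1 = %d, W2 = %d\n', W1, W2);
```

For these made-up values dsort is [2 1 1 2 1], so W2 = 1 + 4 = 5 and W1 = 2 + 3 + 5 = 10. When the pooled data contain tied values, the tied observations should share the average of their ranks, so the simple vector 1:n no longer applies. The function tiedrank is one way to obtain such ranks, if the Statistics Toolbox is available.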