GEOG 3810
Morphological Measurements of Fossils in Manitoba
Quantitative Research Methods in Geography: Take Home Assignment
Delaney Brooks
7694719

Part A: Introduction and Sampling Methods

The excavation of a large number of geological sites in Manitoba has shown the recurring appearance of many specimens of a specific fossil type. The morphological measurements of the excavated fossils are important for comparing Manitoba geological sites. The intention of this project, carried out for the Manitoba Government by AZZOCA Consulting, is to determine whether the morphological measurements, in centimeters, of a specific fossil type excavated from geological Site A are greater than the morphological measurements (cm) of the same fossil type excavated from geological Site B, both in Manitoba.

For this study, the target population is all fossils of the particular type. The target area is all of the geological sites in Manitoba. The study is based on observations from only two sites, which together form the sampling area. Commonly excavated fossils in Manitoba include mosasaurs, plesiosaurs, sharks, a variety of fish, turtles, birds, and squid (Canadian Fossil Discovery Centre, 2015). Although there are many geological sites in Manitoba, only two have been selected for the comparison. The geological sites selected for the study may contain any of the ancient animals listed above.

The samples collected (the sampled population) come from specific locations; therefore, a spatial sampling method will be used. The most effective sampling method will be a stratified point sample. To retrieve samples, each site will be divided into sub-areas, that is, smaller and more manageable sections. The number of sub-areas will depend on the sample size needed, and the size of the sub-areas will depend on the size of the site.
There may be a difference in sample size between Site A and Site B. Once the sample sizes have been identified, each site will be divided into as many sub-areas as samples are needed, with every sub-area in a site being of equal size. This creation of sub-areas, or subsections, ensures that the fossils used in the study are selected from all areas of the geological sites. Next, one fossil will be randomly selected from each sub-area. This results in a random sample of the excavated fossils in which all areas of the two geological sites are represented.

Sample size is important for representing the population. As a general rule, 30 is an ideal minimum sample size. However, as 30 may be more or fewer than the number of samples actually needed, the sample sizes for Site A and Site B will be calculated from the data in Table 1. The sample size must be calculated for both Site A and Site B, as a representative set of fossils is needed from each geological site to carry out the hypothesis test.

Before the sample sizes can be calculated, the margin of error (E) that the study is willing to accept must be calculated. Using Z = 1.96, the standard deviations reported in Table 2, and n = 10, the margins of error for Site A and Site B are:

Site A:
\[ E = Z\sqrt{\frac{\sigma^2}{n}} = 1.96\sqrt{\frac{(0.7729)^2}{10}} = 0.4790 \]

Site B:
\[ E = Z\sqrt{\frac{\sigma^2}{n}} = 1.96\sqrt{\frac{(0.7182)^2}{10}} = 0.4451 \]

To find the sample size for each site, we will use the margin of error. The calculations for the sample size of Site A and Site B are as follows:

Site A:
\[ n = \left(\frac{Zs}{E}\right)^2 = \left(\frac{(1.96)(0.7729)}{0.4790}\right)^2 = 10.002, \text{ rounded up to } 11 \]

Site B:
\[ n = \left(\frac{Zs}{E}\right)^2 = \left(\frac{(1.96)(0.7182)}{0.4451}\right)^2 = 10.002, \text{ rounded up to } 11 \]

The calculations show that, for an accurate representation of the population, a sample size of eleven will suffice for both Site A and Site B, for a total of twenty-two samples.
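The margin-of-error and sample-size arithmetic above can be checked with a short Python sketch. It is only an illustration of the two formulas: the function and variable names are assumptions made for this example, and E is rounded to four decimals before being reused because that is how the values appear in the calculation.

```python
import math


def margin_of_error(z: float, s: float, n: int) -> float:
    """E = Z * sqrt(s^2 / n)."""
    return z * math.sqrt(s ** 2 / n)


def required_sample_size(z: float, s: float, e: float) -> float:
    """n = (Z * s / E)^2, before rounding up to a whole sample."""
    return (z * s / e) ** 2


Z = 1.96       # critical value used in the report
PILOT_N = 10   # observations per site in the available data
STD_DEV = {"Site A": 0.7729, "Site B": 0.7182}  # standard deviations (cm)

results = {}
for site, s in STD_DEV.items():
    # Round E to four decimals, as in the written calculation.
    e = round(margin_of_error(Z, s, PILOT_N), 4)
    n_raw = required_sample_size(Z, s, e)
    results[site] = (e, n_raw, math.ceil(n_raw))
    print(f"{site}: E = {e:.4f}, n = {n_raw:.3f}, rounded up to {math.ceil(n_raw)}")
```

Because E is itself derived from n = 10, the sample-size formula returns a value essentially equal to ten; the small excess (0.002) comes from rounding E, and it is this excess that pushes the rounded-up result to eleven.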
Although the calculations show that the sample size would be closer to ten, the sample size must be rounded up to ensure an accurate representation. The minimum necessary sample sizes for both Site A and Site B are much smaller than 30. Using the calculations to find the needed sample sizes allows for a more cost-effective study, since gathering more samples results in a higher cost and a longer time commitment.

Part B: Descriptive Statistics

Table 2
The descriptive statistics of the data from Site A and Site B.

Descriptive Statistics
                     N    Minimum   Maximum   Mean     Std. Deviation
Site_A               10   1.32      3.64      2.0850   .77292
Site_B               10   1.30      3.78      1.8900   .71821
Valid N (listwise)   10

Table 2 (above) presents the descriptive statistics for the morphological measurements (cm) of the selected fossil type excavated from geological Site A and Site B. The sample size, denoted N, is the same for Site A and Site B at 10 samples. The minimum morphological measurements for the two sites are very close: 1.32 for Site A and 1.30 for Site B. The difference between the maximum measurements is larger, at 3.64 for Site A and 3.78 for Site B. The mean, or average, of the morphological measurements differs only slightly between the sites: the mean of Site A (2.0850) is slightly larger than the mean of Site B (1.8900). The difference between the two means appears to agree with the hypothesis that the morphological measurements of the specific fossil type are larger at Site A than at Site B. However, the difference may not be statistically significant, and the two population means may in fact be equal. A hypothesis test is needed to conclude whether the mean of Site A is in fact greater than the mean of Site B. The standard deviations of Site A and Site B are also fairly close numerically.
The standard deviation measures the amount of variation in a data set. Therefore the data from Site A, with a standard deviation of 0.77292, are slightly more variable than the data from Site B, with a standard deviation of 0.71821.

Figure 1
The boxplots representing the descriptive statistics from Site A and Site B.

Figure 1 (above) presents the descriptive statistics visually in the form of boxplots. The boxplot on the left represents the morphological measurement data of Site A and the boxplot on the right represents the morphological measurement data of Site B. Although Site B has a higher maximum value, Site A has the larger range once the maximum value of Site B (3.78) is set aside as an outlier. By the usual boxplot convention, an outlier is a value lying more than 1.5 times the interquartile range beyond the edge of the box. The outlier may be the result of an inaccurate measurement. In a boxplot the median of the data is shown as a line inside the box. The Site A data appear skewed to the right, meaning the majority of the data points are of a smaller length (in centimeters). Site B has a median that is roughly centred in its box, which makes the distribution appear symmetric, with a median close to the mean. It is impossible to know whether Site A or Site B is truly normally distributed by observing only a boxplot.

The interquartile range is the spread of the middle fifty percent of the data and makes up the box part of a boxplot. Site A has a much larger interquartile range than Site B. This means the middle fifty percent of the data in Site A is spread over a much larger range than in Site B. The whiskers of the boxplot represent the lower twenty-five percent and upper twenty-five percent of the data. Site A has the upper twenty-five percent of its data spread over a wider range, which also contributes to its larger range.
The upper and lower twenty-five percent of the data for Site B are spread over a smaller range of measurements, contributing to the smaller range.

Overall, from the boxplots and the descriptive statistics, only tentative conclusions can be drawn about the morphological measurements of the selected fossil type at the two geological sites in Manitoba. The boxplots and descriptive analysis show that the data from Site A are skewed to the right and have a higher mean, a larger range, and a larger interquartile range than the data from Site B. Site B also has an outlier, which may be due to a measurement or recording error. However, conclusive evidence on whether the morphological measurements at Site A are greater than those at Site B requires hypothesis testing.

Part C: Methods and Results

As the intent of this study is to determine whether the morphological measurements of a specific fossil type in Manitoba are greater at Site A than at Site B, a two-sample difference test will be used. However, before the hypothesis can be tested, the data for Site A and Site B must be checked to determine whether they meet the assumptions of a parametric test or whether a non-parametric test is required. The data for both Site A and Site B must be tested for normality. This was done by applying a Kolmogorov-Smirnov test to each dataset. A significance level of 0.05 (95% confidence) is used, meaning there is a 5% chance of rejecting the hypothesis of normality when it is in fact true. The results are shown below in Table 3 and Table 4.

Table 3
The Kolmogorov-Smirnov Test results from the fossil data of Site A.

Tests of Normality
          Kolmogorov-Smirnov          Shapiro-Wilk
          Statistic   df   Sig.       Statistic   df   Sig.
Site_A    .244        10   .094       .850        10   .058

Table 4
The Kolmogorov-Smirnov Test results from the fossil data of Site B.
Tests of Normality
          Kolmogorov-Smirnov          Shapiro-Wilk
          Statistic   df   Sig.       Statistic   df   Sig.
Site_B    .306        10   .008       .725        10   .002

The Kolmogorov-Smirnov p-value for Site A is 0.094, which is larger than the significance level (0.05), so the hypothesis that the Site A data are normally distributed cannot be rejected. Site A therefore meets the assumptions of a parametric test: the data are treated as normally distributed, are randomly sampled, and are interval/ratio data. The p-value for Site B is 0.008, which is much smaller than the significance level of 0.05, so normality is rejected for Site B. The data from Site B therefore do not meet the assumptions of a parametric test: they are randomly sampled and are interval/ratio data, but they are not normally distributed. Because Site A meets the parametric assumptions and Site B does not, a non-parametric test will be used to test the hypothesis.

The null hypothesis of the study is that the morphological measurements of the fossil type at geological Site A in Manitoba are equal to the morphological measurements of the fossil type at geological Site B in Manitoba. The alternative hypothesis is that the morphological measurements of the fossil type at geological Site A in Manitoba are greater than the morphological measurements of the fossil type at geological Site B in Manitoba. A Mann-Whitney test is performed to determine whether the null hypothesis is rejected or not rejected. The Mann-Whitney test has its own set of assumptions: the samples must have a similar shape of distribution, the data must be at least ordinal, and the two samples must be independent of one another and randomly sampled. The datasets are taken to meet these assumptions and therefore the test can be run. The significance level for the test is 0.05, and the critical value is set at 1.65 because the alternative hypothesis is one-tailed (greater than).
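The mechanics of the one-tailed Mann-Whitney test described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, the measurements are placeholder values rather than the Table 1 data, and the z statistic uses the normal approximation without a tie correction.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(sample_a, sample_b):
    """Return (U, z) for sample_a using the normal approximation."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    return u_a, (u_a - mean_u) / sd_u


# Placeholder measurements (cm) for illustration only, NOT the Table 1 data.
site_a = [4.0, 5.0, 6.0]
site_b = [1.0, 2.0, 3.0]

CRITICAL_Z = 1.65  # one-tailed critical value at the 0.05 level, as in the text
u, z = mann_whitney_z(site_a, site_b)
reject_null = z > CRITICAL_Z  # reject only if Site A ranks are clearly higher
print(f"U = {u}, z = {z:.3f}, reject H0: {reject_null}")
```

In this sketch a positive z means that Site A tends to rank higher than Site B, so the one-tailed decision compares z against +1.65. Statistical packages may report U or z under a different sign convention, and for samples as small as ten per site an exact p-value is generally preferable to the normal approximation.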
The outcome of the test is below in Table 5:

Table 5
The Mann-Whitney Test results for the difference test of Site A and Site B.

Table 5 shows that the null hypothesis is not rejected, because the p-value (0.631) is larger than the significance level (0.05). The Mann-Whitney two-sample difference test therefore provides no evidence against the null hypothesis; it does not prove that the null hypothesis is true. The Mann-Whitney test can also be calculated by hand. The calculation uses the critical value, and its conclusion agrees with the p-value: for the datasets of Site A and Site B the test statistic is -0.5291, which does not exceed the critical value of 1.65, so the null hypothesis is again not rejected.

The Mann-Whitney two-sample difference test and the Kolmogorov-Smirnov test may not accurately represent the population of excavated fossils. The sample size used in the tests is ten per site, which is lower than the calculated minimum sample size of eleven. As well, the boxplots made it appear as though the shape of the distribution differs considerably between Site A and Site B, which is at odds with the Mann-Whitney assumption of similarly shaped distributions. The study was limited by the small sample sizes for both Site A and Site B.

In conclusion, and as an answer to the Minister, AZOCCA Consulting has found no statistically significant evidence that the fossils excavated at geological Site A in Manitoba are greater in morphological measurement than the fossils excavated at geological Site B in Manitoba. This conclusion holds only for the dataset that was available and may differ if there were fewer limitations.

References

Canadian Fossil Discovery Centre. (2015). General Information. Retrieved from Canadian Fossil Discovery Centre: www.discoverfossils.com

Quantitative Research Methods in Geography. (2015). Lecture Notes. John Iacozza. Winnipeg, Manitoba. (University of Manitoba).