BIO 2802 ECOLOGY LAB
Statistics & Populations Handout

Measures of Central Tendency (mean, median, mode)

mean: The mean is the numerical average of all values in the sample. It is written as $\bar{X}$ and calculated as:

$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

As sample size increases, the observed mean (measured mean) tends toward the true mean (parametric mean) of the population.

median: When the data are arranged in order (e.g. low to high), the median is the middlemost value. In the data set 1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12 the median is 5, the 7th of the 13 values.

mode: The mode is the most frequently occurring value in the data set. In the data set above, the mode is 3.

In a perfect normal distribution the mean, median, and mode are the same value, by definition.

Measures of Spread (range, variance, standard deviation)

range: The range is the distance between the lowest and highest readings in the data set.

variance: The variance measures the extent to which values in the data set tend to deviate from the mean. It is a measure of the amount of spread among the values. It is written as $S^2$ and calculated as:

$$S^2 = \frac{(x_1 - \bar{X})^2 + (x_2 - \bar{X})^2 + \dots + (x_n - \bar{X})^2}{n - 1}$$

standard deviation: The standard deviation is the square root of the variance. Like the variance, the standard deviation measures spread among data. When the underlying distribution of a sample is assumed to be "normal", 1 standard deviation around the mean encompasses 68.26% of the data points, and 2 standard deviations around the mean encompass 95.44% of the data points, by definition. The standard deviation is written as $S$ and calculated as:

$$S = \sqrt{S^2}$$

confidence interval: A confidence interval provides an interval that, with a stated level of confidence (e.g. 95%, 90%), includes the population mean. The symbol α (alpha) represents the level of significance in statistical tests. A significance level of 5% (α = 0.05) is used in most biological research. A 95% confidence interval for a given mean is calculated using the estimated mean, the standard deviation of the mean, and a Student's t-value as:

$$95\%\ \text{c.i.} = \bar{X} \pm (t \cdot S_{\bar{X}})$$

where $S_{\bar{X}} = S / \sqrt{n}$ is the standard deviation of the mean (the standard error), and t is the critical t-value for a given number of degrees of freedom (d.f. = n - 1) and the designated level of significance (α = 0.05 in this case). The t-value is found in a table of values or generated in your statistical software (Excel does this for you if you use the correct formula). With a 95% confidence interval we can say that we are "95% confident" that the true population mean (parametric mean) is within the interval, meaning there is a 5% chance that we are wrong.

Accuracy vs. Precision

While people generally use the terms accuracy and precision interchangeably, they are quite distinct in the discussion of scientific data. Accuracy refers to the closeness of the measured value to the true value. Precision refers to the closeness of repeated measurements to each other. For example, when sampling a particular species of soil-dwelling earthworm (Hale et al. 2005), the population estimate using a mustard liquid extraction technique was consistently 40% lower than when we measured the same population by digging a pit 40 cm deep and sifting the soil. So our measures were very precise but not very accurate. If an estimate is consistently low or high, it is said to be biased. A confidence interval is a good way to express the precision of an estimate.

Populations

A population is an ecological unit composed of the individuals of a single species within a given area (e.g. human population of Duluth, bird population of a forest, fish population of a lake, bacteria population of your gut, rotifer population of a pitcher plant).

Populations have structure:
Density - number of organisms per area (e.g. terrestrial habitats) or volume (e.g. a lake)
Dispersion (spacing) - regular, random, or clumped
Movement of individuals - within a given area or between areas (migration)
Age structure - proportion of individuals in different age classes
Genetic variation - inbreeding (homozygosity), novel mutations

Types of population dispersion and their relation to statistical measures:

regular: This is even or uniform spacing among individuals. It commonly arises from direct interaction (competition) among individuals. Examples include 1) plant competition for light, water, and nutrients, which can set up regular spacing in a mature forest; and 2) territoriality among animals, including a) seabird nesting colonies (e.g. penguins), where birds place their nests just beyond their neighbor's reach, and b) black bears, which define and defend areas for feeding and breeding. When regular dispersion is sampled, typically variance < mean, thus variance/mean < 1.

random: This is spacing among individuals that varies in an unpredictable way. Individuals in an area are distributed without regard to others. Although rare in nature, examples include tree distribution in a young forest where resources are unlimited, and offspring sapling distribution around a mature tree. When random dispersion is sampled, typically variance = mean, thus variance/mean = 1.

clumped: This spacing is also called aggregation and describes a situation in which individuals are found in discrete groups. This is the most common pattern in nature and can result from 1) the tendency for some animals to form social groups (e.g. wolves, primates, ravens); 2) the tendency for resources to be clumped; 3) the tendency for progeny to remain in the vicinity of their parents (e.g. birds that remain near the nest such as bee-eaters, trees that reproduce by vegetative reproduction such as aspen); and 4) animals that aggregate for defense/offense (fish schools, large grazing mammals).
When clumped dispersion is sampled, typically variance > mean, thus variance/mean > 1.

References Cited:

Brower, J.E., J.H. Zar, and C.N. von Ende. 1998. Field and Laboratory Methods for General Ecology, 4th Edition. McGraw-Hill.
Hale, C.M., L.E. Frelich, and P.B. Reich. 2005. Exotic European earthworm invasion dynamics in northern hardwood forests of Minnesota, USA. Ecological Applications 15(3):848-860.
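
Worked example: the measures described in this handout can be sketched in a few lines of Python, using the 13-value data set from the median and mode examples. This is a minimal illustration, not a prescribed lab procedure. It rests on several assumptions that are not part of the handout: the variable names are hypothetical, the confidence interval uses the standard deviation of the mean (S divided by the square root of n), the critical t-value of 2.179 is the usual two-tailed table value for α = 0.05 with d.f. = 12, and the last step treats the values as if they were counts of individuals per quadrat so that a variance/mean ratio can be shown.

```python
import math
import statistics

# Data set from the handout's median and mode examples.
data = [1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12]
n = len(data)

# Measures of central tendency.
mean = statistics.mean(data)        # sum of all values divided by n
median = statistics.median(data)    # middlemost value of the sorted data
mode = statistics.mode(data)        # most frequently occurring value

# Measures of spread.
value_range = max(data) - min(data)   # highest reading minus lowest reading
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)         # square root of the variance

# 95% confidence interval for the mean.
# Assumption: t_crit is the two-tailed table value for alpha = 0.05, d.f. = 12.
t_crit = 2.179
std_error = std_dev / math.sqrt(n)    # standard deviation of the mean
ci_low = mean - t_crit * std_error
ci_high = mean + t_crit * std_error

# Variance/mean ratio, treating the values as hypothetical counts per quadrat.
# Ratio < 1 suggests regular, near 1 random, > 1 clumped dispersion.
ratio = variance / mean

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"range = {value_range}, variance = {variance:.2f}, S = {std_dev:.2f}")
print(f"95% c.i. = {ci_low:.2f} to {ci_high:.2f}")
print(f"variance/mean = {ratio:.2f}")
```

For these values the ratio comes out greater than 1, which would point toward clumped dispersion if the numbers really were quadrat counts. In practice a sampled ratio is never exactly 1, so deciding whether it differs meaningfully from 1 needs a statistical test that this sketch does not cover.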