BIO 2802 ECOLOGY LAB
Statistics & Populations Handout
Measures of Central Tendency (mean, median, mode)
mean: The mean is the numerical average of all values in the sample and is written as $\bar{X}$
and calculated as:
$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
As sample size increases, the observed mean (measured mean) tends toward the true mean
(parametric mean) of the population.
median: When the data are organized in progressive sequence (e.g. low to high) the median is the
middlemost value.
1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12 (median = 5, the 7th of 13 values)
mode: The mode is the most frequently occurring value in the data set.
3 (data set above)
In a perfect normal distribution the mean, median, and mode are the same value, by definition.
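The three measures above can be checked on the example data set with a minimal Python sketch. It uses only the standard-library `statistics` module; the variable names are illustrative and not part of the handout.

```python
from statistics import mean, median, mode

# Example data set from the median definition above
data = [1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12]

sample_mean = mean(data)      # sum of all values divided by n
sample_median = median(data)  # middle value of the sorted data
sample_mode = mode(data)      # most frequently occurring value

print(f"mean = {sample_mean:.2f}, median = {sample_median}, mode = {sample_mode}")
# prints: mean = 5.46, median = 5, mode = 3
```

Here the mean (5.46) is pulled above the median (5) and the mode (3) by the few large values, so this sample is not symmetric.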
Measures of Spread (range, variance, standard deviation)
range: The range is the distance between the lowest and highest readings in the data set.
variance: The variance measures the extent to which values in the data set tend to deviate from
the mean. It is a measure of the amount of spread among the values and is written as $S^2$,
and calculated as:
$$S^2 = \frac{(x_1 - \bar{X})^2 + (x_2 - \bar{X})^2 + \dots + (x_n - \bar{X})^2}{n - 1}$$
where $\bar{X}$ is the sample mean and n is the sample size.
standard deviation: The standard deviation is the square root of the variance. Like the variance,
the standard deviation also measures spread among data. When the
underlying distribution of a sample is assumed to be “normal”, 1 standard deviation
around the mean encompasses 68.26% of the data points, and 2 standard deviations
around the mean encompass 95.44% of the data points by definition. The standard
deviation is written as: S
and calculated as:
$$S = \sqrt{S^2}$$
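As a rough illustration of the measures of spread, the sketch below computes the range, the sample variance (with the n - 1 denominator), and the standard deviation for the example data set used earlier. The variance is computed once by hand and once with the standard-library `statistics.variance` function; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

data = [1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12]
n = len(data)
x_bar = mean(data)

# Range: distance between the lowest and highest readings
data_range = max(data) - min(data)

# Sample variance by hand: sum of squared deviations divided by (n - 1)
s2_manual = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# The same quantities from the standard library
s2 = variance(data)
s = stdev(data)  # square root of the sample variance

print(f"range = {data_range}, S^2 = {s2:.2f}, S = {s:.2f}")
```

The hand calculation and the library call should agree, and `s` should equal `sqrt(s2)`.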
confidence interval: A confidence interval provides an interval that, with a stated level of
confidence (e.g. 95%, 90%), includes the population mean. The symbol
α (alpha) represents the level of significance in statistical tests. A significance level
of 5% (α = 0.05) is used in most biological research. A 95% confidence interval for a
given mean is calculated using the estimated mean, the standard deviation of the
mean (the standard error, $S_{\bar{X}} = S / \sqrt{n}$) and a Student's t-value as:
$$95\%\ \text{c.i.} = \bar{X} \pm (t \cdot S_{\bar{X}})$$
where t is the critical t-value for a given number of degrees of freedom (d.f. = n - 1)
and the designated level of significance (α = 0.05 in this case), and is found in a table
of values or generated in your statistical software (Excel does this for you if you use
the correct formula).
With a 95% confidence interval we can say that we are “95% confident” that the true
population mean (parametric mean) is within the interval, meaning there is a 5%
chance that we are wrong.
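The confidence interval calculation can be sketched as follows for the example data set used earlier (n = 13, so d.f. = 12). The critical t-value of 2.179 is the two-tailed value for α = 0.05 and 12 degrees of freedom as read from a standard t-table; it is hard-coded here as an assumption and must be replaced for a different sample size or significance level.

```python
from math import sqrt
from statistics import mean, stdev

data = [1, 2, 3, 3, 3, 3, 5, 6, 7, 8, 8, 10, 12]
n = len(data)

# Assumption: two-tailed critical t for alpha = 0.05 and d.f. = n - 1 = 12,
# looked up in a t-table (not computed here)
T_CRIT = 2.179

x_bar = mean(data)
std_error = stdev(data) / sqrt(n)  # standard deviation of the mean
half_width = T_CRIT * std_error

lower = x_bar - half_width
upper = x_bar + half_width

print(f"95% c.i. = {x_bar:.2f} +/- {half_width:.2f} ({lower:.2f} to {upper:.2f})")
```

For these data the interval runs from roughly 3.43 to 7.49 around a mean of 5.46.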
Accuracy vs. Precision
While people generally use the terms accuracy and precision interchangeably, they are quite
distinct in the discussion of scientific data.
Accuracy refers to the closeness of the measured value to the true value.
Precision refers to the closeness of repeated measurements to each other.
For example, when sampling a particular species of soil-dwelling earthworm (Hale et al. 2005),
the population estimate using a mustard liquid extraction technique was consistently 40% lower
than when we measured the same population by digging a pit 40 cm deep and sifting the soil. So
our measures were very precise but not very accurate. If an estimate is consistently low or high,
it is said to be biased. A confidence interval is a good way to express the precision of an
estimate.
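The distinction can be made concrete with a small sketch. The numbers below are hypothetical, invented only to illustrate a biased but precise method; they are not data from Hale et al. (2005). Bias is taken as the mean estimate minus the true value, and precision is summarized by the standard deviation of the repeated estimates.

```python
from statistics import mean, stdev

# Hypothetical values for illustration only (not from the cited study)
true_density = 100                 # assumed true number of worms per plot
estimates = [58, 61, 60, 59, 62]   # assumed repeated estimates from one method

bias = mean(estimates) - true_density  # accuracy: distance from the true value
relative_bias = bias / true_density
spread = stdev(estimates)              # precision: agreement among repeats

print(f"bias = {bias} ({relative_bias:.0%}), standard deviation = {spread:.2f}")
# prints: bias = -40 (-40%), standard deviation = 1.58
```

The small standard deviation shows the estimates are precise, while the large negative bias shows they are not accurate.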
Populations
A population is an ecological unit composed of the individuals of a single species within a
given area (e.g. human population of Duluth, bird population of a forest, fish population
of a lake, bacteria population of your gut, rotifer population of a pitcher plant).
Populations have structure:
Density - number of organisms per area (e.g. terrestrial) or volume (e.g. lake)
Dispersion (spacing) - regular, random, or clumped
Movement of individuals – within a given area or between areas (migration)
Age structure – proportion of individuals in different age classes
Genetic variation – inbreeding (homozygosity), novel mutations
Types of population dispersion and their relation to statistical measures:
regular: This is even or uniform spacing among individuals. Commonly arises from direct
interaction (competition) among individuals. Examples include 1) plant competition for light,
water, and nutrients that can set up regular spacing in a mature forest; 2) territoriality among
animals including a) seabird nesting colonies (e.g. penguins) where birds place their nests just
beyond their neighbor’s reach, and b) black bears which define and defend areas for feeding and
breeding. When regular dispersion is sampled, typically variance<mean, thus
variance/mean < 1
random: This is spacing among individuals that varies in an unpredictable way. Individuals in an
area are distributed without regard to others. Although rare in nature, examples include tree
distribution in a young forest where resources are unlimited, and offspring sapling distribution
around a mature tree. When random dispersion is sampled, typically variance = mean, thus
variance/mean = 1
clumped: This spacing is also called aggregation and describes a situation in which individuals
are found in discrete groups. This is most common in nature and can result from 1) the tendency
for some animals to form social groups (e.g. wolves, primates, ravens); 2) the tendency for
resources to be clumped; 3) the tendency for progeny to remain in the vicinity of their parents
(e.g. birds that remain near the nest such as bee-eaters, trees that reproduce by vegetative
reproduction such as Aspen); and 4) animals that aggregate for defense/offense (fish schools,
large grazing mammals). When clumped dispersion is sampled, typically variance>mean,
thus variance/mean > 1
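The variance/mean rules above can be sketched in Python as follows. This assumes the data are counts of individuals per sampling unit (e.g. per quadrat). The counts, the function names, and the 0.1 tolerance around a ratio of 1 are all hypothetical choices made for illustration; the tolerance is not a statistical test of whether a ratio differs from 1.

```python
from statistics import mean, variance

def variance_to_mean_ratio(counts):
    """Return sample variance divided by sample mean for counts per sampling unit."""
    return variance(counts) / mean(counts)

def classify_dispersion(counts, tolerance=0.1):
    """Label dispersion from the variance/mean ratio (tolerance is an assumed cutoff)."""
    ratio = variance_to_mean_ratio(counts)
    if ratio < 1 - tolerance:
        return "regular"
    if ratio > 1 + tolerance:
        return "clumped"
    return "random"

# Hypothetical counts of individuals in eight quadrats for each pattern
regular_counts = [4, 5, 5, 4, 5, 5, 4, 5]
random_counts = [2, 3, 1, 4, 2, 5, 0, 3]
clumped_counts = [0, 0, 12, 0, 1, 0, 15, 0]

for counts in (regular_counts, random_counts, clumped_counts):
    ratio = variance_to_mean_ratio(counts)
    print(f"{counts}: variance/mean = {ratio:.2f} -> {classify_dispersion(counts)}")
```

Evenly filled quadrats give a ratio well below 1, while a few crowded quadrats among many empty ones give a ratio well above 1.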
References Cited:
Brower, J.E., J.H. Zar and C.N. von Ende. 1998. Field and Laboratory Methods for General Ecology, 4th
Edition. McGraw-Hill.
Hale, C.M., L.E. Frelich and P.B. Reich. 2005. Exotic European earthworm invasion dynamics in northern
hardwood forests of Minnesota, USA. Ecological Applications 15(3):848-860.