Sampling and Statistics

Statistical Concepts
Basic Principles
An Overview of Today’s Class
What: Inductive inference on characterizing a population
Why: How will doing this allow us to better inventory and
monitor natural resources?
Examples
Relevant Readings: Elzinga pp. 77-85, White et al.
Key points to get out of today’s lecture:
Description of a population based on sampling
Understanding the concept of variation and uncertainty
By the end of today’s lecture/readings you should understand
and be able to define the following terms:
Population parameters
Accuracy/Bias
Sample statistics
Precision
Mean
Coefficient of variation
Variance / Standard Deviation
Why sample?
Inductive inference:
“…process of generalizing to the population from the sample…”
Elzinga, p. 76
Target/Statistical Population
Sample Unit
Individual objects
(in this case, plants)
Elzinga et al. (2001:76)
We are interested in describing this population:
• its total population size
• mean density/quadrat
• variation among plots
At any point in time, these
measures are fixed and a
true value exists.
These descriptive measures
are called ?
Population Parameters
The estimates of these parameters
obtained through sampling are called ?
Sample Statistics
How did we obtain the sample statistics?
ALL sample statistics are calculated through an estimator
“An estimator is a mathematical expression that indicates
how to calculate an estimate of a parameter from the sample
data.”
White et al. (1982)
You do this all the time!
The Mean (average):
$\bar{y}$
(standard expression, but often
denoted by some other character)
What is the formal estimator you use?
$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
Which operations does this expression tell you to perform?
Is $\bar{y}$ a sample statistic or a population parameter?
$\bar{y}$ is a sample statistic that estimates the population mean
$\bar{y}$ = population mean if all N units in the population are sampled (n = N)
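The estimator of the mean can be sketched in a few lines of Python. This is only an illustration: the function name `sample_mean` and the quadrat counts are hypothetical, not values from the lab or the readings.

```python
def sample_mean(observations):
    """Estimator of the population mean: add up the n observations, divide by n."""
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one observation")
    return sum(observations) / n


# Hypothetical plant counts from five quadrats (illustrative values only).
quadrat_counts = [4, 7, 5, 9, 5]

y_bar = sample_mean(quadrat_counts)
print(y_bar)  # 6.0
```

The number printed is a sample statistic. It would equal the population parameter only if every quadrat in the population had been counted.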
Estimating the amount of variability
Why?
Recall:
There is uncertainty in inductive inference.
The field of statistics provides techniques for
making inductive inference AND for providing means of
assessing uncertainty.
Two key reasons for estimating variability:
• a key characteristic of a population
• allows for the estimation of uncertainty of a sample
Think about this conceptually, before mathematically:
Recall lab:
Each group collected
data from 4 m² plots
Did each group get
identical results?
What characteristic of
the population would
affect the level of similarity
among the groups’ samples?
Estimating the Amount of Variation within a Population
The true population standard deviation is a measure of how
similar each individual observation (e.g., number of plants
in a quadrat—the sample unit) is to the true mean
Can we develop a mathematical expression for this?
Populations with lots of variability will have a large standard
deviation, whereas those with little variation will have a low value
High or low?
What would the standard deviation
be if there were absolutely no variability, that is, if every quadrat in the population
had exactly the same number of plants?
The Computation of the Population
Variance and Standard Deviation
• the key is to measure how much the observations differ, right?
• so the mean is subtracted from each observation,
consistent with the definition
First, we calculate the population variance
$\mathrm{Var} = \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$
Does this make sense ?
For the population standard deviation, we take the square root of the variance:
$\sigma = \sqrt{\sigma^2}$
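As a rough sketch, the population variance and standard deviation can be computed directly from their definitions. The function names and the small "population" below are hypothetical and chosen only so the arithmetic is easy to follow. The code assumes every unit in the population has been measured.

```python
import math


def population_variance(values):
    """Mean squared deviation from the true mean, with N as the divisor."""
    N = len(values)
    mu = sum(values) / N  # true population mean
    return sum((x - mu) ** 2 for x in values) / N


def population_std(values):
    """Population standard deviation: square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical complete population of eight quadrat counts.
population = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(population))  # 4.0
print(population_std(population))       # 2.0

# A population with no variability at all has a standard deviation of zero.
print(population_std([3, 3, 3, 3]))     # 0.0
```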
The Computation of the Sample Variance
and Standard Deviation
The estimator of the variance, which is what produces the
sample statistic, simply replaces N with the number of units actually sampled (n)
and the true population mean with the sample mean
$s^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$
The estimator of the standard deviation is simply the square root of the
estimated variance.
Because dividing by n tends to underestimate the variance in small samples, n-1 is usually used
rather than n as the divisor in both the variance and the standard deviation
Estimating the Sample Standard Deviation
$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
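To see what the choice of divisor does, here is a minimal sketch that computes the sample variance with either n or n-1. The function names, the `correction` parameter, and the sample values are hypothetical. Python's standard `statistics.variance` and `statistics.stdev` use the n-1 divisor, so they serve as a check.

```python
import math
import statistics


def sample_variance(sample, correction=1):
    """Sum of squared deviations from the sample mean, divided by n - correction.

    correction=1 gives the usual n-1 divisor; correction=0 divides by n.
    """
    n = len(sample)
    if n - correction <= 0:
        raise ValueError("sample is too small for this divisor")
    x_bar = sum(sample) / n
    sum_of_squares = sum((x - x_bar) ** 2 for x in sample)
    return sum_of_squares / (n - correction)


def sample_std(sample, correction=1):
    """Sample standard deviation: square root of the estimated variance."""
    return math.sqrt(sample_variance(sample, correction))


# Hypothetical plant counts from five quadrats (illustrative values only).
sample = [4, 7, 5, 9, 5]

print(sample_variance(sample, correction=0))  # 3.2 (divisor n)
print(sample_variance(sample))                # 4.0 (divisor n-1)
print(sample_std(sample))                     # 2.0
print(statistics.stdev(sample))               # same n-1 result from the standard library
```

With only five observations the two divisors give noticeably different answers. The gap shrinks as n grows.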
Worksheet: compare the sample variation of mass of
deer mice to mass of bison; which is more variable?
Coefficient of Variation:
A measure of relative precision
“The coefficient of variation is useful because,
as a measure of variability, it does not depend upon
the magnitude and units of measurements of the data.”
Elzinga et al: 142
Usually expressed as a percent:
$CV = \frac{s}{\bar{X}} \times 100$
Using the coefficient of variation, which is more variable,
the mass of deer mice or the mass of bison?
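The comparison can be sketched in Python. The masses below are made-up numbers for illustration only. They are not the worksheet data, and the function name `coefficient_of_variation` is hypothetical. The point is only that the standard deviation and the CV can rank two data sets differently.

```python
import statistics


def coefficient_of_variation(sample):
    """CV as a percent: sample standard deviation (n-1 divisor) over the sample mean."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


# Made-up masses, NOT the worksheet data.
deer_mouse_mass_g = [18, 20, 22, 19, 21]     # grams
bison_mass_kg = [600, 650, 700, 620, 680]    # kilograms

for label, masses in [("deer mice", deer_mouse_mass_g), ("bison", bison_mass_kg)]:
    s = statistics.stdev(masses)
    cv = coefficient_of_variation(masses)
    print(f"{label}: s = {s:.2f}, CV = {cv:.1f}%")
```

In this invented example the bison have the larger standard deviation in raw units, but the deer mice have the larger CV, because the CV scales the variation by the mean and removes the units.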
Estimating the Reliability of a Sample Mean
Standard error:
the standard deviation of independent sample means
Measures precision from a sample
(e.g., density of plants from a collection of quadrats)
Quantifies the certainty with which the mean computed
from a random sample estimates the true population mean
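The idea of "the standard deviation of independent sample means" can be checked on a toy example. The sketch below lists every possible sample of size 2 drawn with replacement from a tiny hypothetical population and compares the standard deviation of those sample means with $\sigma / \sqrt{n}$. The estimator $s / \sqrt{n}$ in `standard_error` is the usual textbook formula; it is not written out on these slides, so treat it as an assumption here, along with all names and values.

```python
import itertools
import math
import statistics

# Tiny hypothetical population of quadrat counts.
population = [2, 4, 6, 8]
n = 2  # sample size

# Every possible ordered sample of size n, drawn with replacement (independent draws).
all_samples = itertools.product(population, repeat=n)
sample_means = [sum(sample) / n for sample in all_samples]

# Standard deviation of the sample means, taken over all possible samples.
sd_of_means = statistics.pstdev(sample_means)

# Population standard deviation divided by sqrt(n).
sigma = statistics.pstdev(population)
print(sd_of_means, sigma / math.sqrt(n))  # the two values agree


def standard_error(sample):
    """Estimate the standard error of the mean from a single sample: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# In practice only one sample is available, so the standard error is estimated from it.
print(standard_error([4, 7, 5, 9, 5]))
```

Larger samples shrink the standard error, which is why more quadrats give a more precise estimate of the mean.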
Key points to get out of today’s lecture:
Description of a population based on sampling
Understanding the concept of variation and uncertainty
Ability to define (and understand) the following terms:
Population parameters
Accuracy/Bias
Sample statistics
Precision
Mean
Coefficient of variation
Variance / Standard Deviation
Friday’s class: from sampling variability to confidence intervals