Sampling unit

STAT 680
BIOSTATISTICS
Sampling Methods
Estimation
1. Census – complete enumeration
• Measure every individual / attribute of interest.
• Accurate description of the population.
• Drawbacks:
– Only viable with small populations (e.g., a small number of trees)
– Only cost-effective for high-value features
2. Sampling – subset of the population
• Accurate description of the population
• Drawbacks:
– Requires planning and (most likely) pre-sampling
– Results are accompanied by a confidence interval
Sampling
• Sampling is the most widely used technique for obtaining information on
parameters of interest
• Examples:
– Amounts: average individual weight or height
– Classification proportions: proportion of individuals which are female, or
proportion diseased
– Totals: total harvestable timber, or total biomass
– Associations: relationship of soil nutrition level to plant biomass, or age of tree to
lumber yield
• Challenges
1. To select units that represent the population of interest
(unbiasedness)
2. To use these measurements to estimate the population
parameter of interest
3. To determine the statistical quality of these estimates
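The three challenges above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of tree weights is made up, the helper name `srs_estimates` is hypothetical, and the interval uses a normal approximation (z = 1.96) with a finite population correction.

```python
import math
import random
import statistics

def srs_estimates(population, n, seed=0):
    """Draw a simple random sample (without replacement) and estimate
    the population mean and total, each with an approximate 95% interval."""
    rng = random.Random(seed)
    N = len(population)
    sample = rng.sample(population, n)        # challenge 1: select the units

    mean = statistics.mean(sample)            # challenge 2: estimate the parameter
    s2 = statistics.variance(sample)          # sample variance (n - 1 divisor)
    fpc = 1 - n / N                           # finite population correction
    se_mean = math.sqrt(fpc * s2 / n)         # challenge 3: quality of the estimate

    z = 1.96                                  # assumption: normal approximation
    lo, hi = mean - z * se_mean, mean + z * se_mean
    return {
        "mean": mean,
        "mean_ci": (lo, hi),
        "total": N * mean,
        "total_ci": (N * lo, N * hi),
    }

# Hypothetical population: 1,000 tree weights (made-up numbers)
rng = random.Random(42)
weights = [rng.gauss(50, 10) for _ in range(1000)]

est = srs_estimates(weights, n=100)           # a sample: estimate plus interval
full = srs_estimates(weights, n=1000)         # a census: interval has zero width
```

When n equals N the correction factor is zero, so the interval collapses onto the estimate, which matches the census case described earlier.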
Defining the population
• Target population: a clearly defined population
from which the sample will be drawn
• Inferential population: a clearly defined
population to which our results will be applied
• Sampled population: the collection of all possible
observation units that might have been chosen in a
sample
Defining the sample
• Observation unit: an entity on which a
measurement is taken (also called an element)
• Sample: a subset of the population
• Sampling unit: an individual item that can be selected for a sample; it may be
– An observation unit (a person or a household)
– A set of observation units (people from a household)
Ideal vs. real sampling situation
[Figure: two diagrams, one for the ideal and one for the real situation, each relating the target population, inferential population, sampled population, and sample]
Example
• We are interested in knowing whether Tech football fans are
satisfied with on-campus parking
– Target population: all current season ticket holders
– Sampling frame: list of the season ticket holders as of 06/2010
– Inferential population: all season ticket holders 2009-2010
– Observation unit: a household that has season tickets
– Sample: those who respond and complete the survey
– Sampled population: those season ticket holders who were
on the list as of June 1 and chose to respond
Example - graphics
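[Figure: diagram of the populations in the season-ticket example, labeled as follows]
– Target population: 2010 season ticket holders
– Inferential population: ticket holders as of 06/2010
– Sampled population: people who responded and intend to attend 2010 games
– Sample: randomly selected valid respondents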
Sample selection
• Random selection
– A probability-based selection protocol where each sampling
unit has a known positive probability of being selected
– Probability of selection need not be equal for each unit, as
long as probabilities are known for each unit
• Systematic selection
– First sampling unit is selected randomly, subsequent units are not
– Each sampling unit in the population has the same probability of
being selected
– The probabilities of different sets of units being included in the
sample are not all equal
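The two properties of systematic selection listed above can be checked by enumeration. The sketch below assumes a hypothetical population of 12 units numbered 0 to 11 and a 1-in-4 design; `systematic_samples` is an illustrative helper, not a library function.

```python
import math

def systematic_samples(N, k):
    """All samples a 1-in-k systematic design can produce from units 0..N-1.
    Each random start (0..k-1) is equally likely, so each sample has probability 1/k."""
    return [tuple(range(start, N, k)) for start in range(k)]

N, k = 12, 4
samples = systematic_samples(N, k)

# Inclusion probability of each unit: share of the possible samples that contain it
inclusion = [sum(u in s for s in samples) / k for u in range(N)]

n = N // k                          # sample size
all_subsets = math.comb(N, n)       # subsets of size n that exist in principle
```

Under these assumptions every unit has inclusion probability 1/4, yet only 4 of the 220 possible three-unit subsets can ever be drawn, so sets of units do not all have the same probability of selection.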
Methods of selecting sampling units
• Simple random selection (SRS)
• Systematic random selection (SyRS)
• Stratified random selection (StRS)
[Figure: three panels illustrating SRS, SyRS, and StRS]
When to use SRS
• Sampling frame explicitly lists sampling units
• Sampling units are identified by a location (an (x,y)
pair or a moment in time)
• Assumptions
– Every possible combination of sampling units has an
equal and independent chance of being selected.
– The selection of a particular unit to be sampled is not
influenced by the other units that have been selected
or will be selected.
– Samples are either chosen with replacement or
without replacement.
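As a minimal sketch of the two situations above, the Python standard library can draw a simple random sample from an explicit frame (with or without replacement) or generate random locations. The frame contents and the 100 x 100 plot size are assumptions made for illustration.

```python
import random

rng = random.Random(1)                        # fixed seed so the draw is reproducible

# Case 1: the sampling frame explicitly lists the units (hypothetical labels)
frame = [f"unit_{i}" for i in range(1, 21)]
without_repl = rng.sample(frame, k=5)         # no unit can appear twice
with_repl = rng.choices(frame, k=5)           # a unit may be drawn more than once

# Case 2: units are identified by location, here (x, y) points
# in an assumed 100 x 100 study area
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(5)]
```

`random.sample` gives every subset of five units the same chance of selection, which is the defining property of SRS without replacement.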
When to use SyRS
• Sampling units are easy to locate.
• Sampling follows a pattern.
• The initial sampling unit is randomly selected. All
other sample units are spaced at uniform intervals
throughout the area sampled.
• Assumption
– There is no pattern (e.g., periodicity) in the population that aligns with the sampling interval
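To see why this assumption matters, consider a made-up population in which every 7th unit is unusually high (for example, a weekly peak). The sketch below compares a sampling interval that matches the period with one that does not; the values and the helper `systematic_sample` are hypothetical.

```python
import random
import statistics

def systematic_sample(values, k, rng):
    """Random start in 0..k-1, then every k-th unit."""
    start = rng.randrange(k)
    return values[start::k]

# Hypothetical periodic population: every 7th unit is 100, all others are 10
population = [100 if i % 7 == 0 else 10 for i in range(700)]
true_mean = statistics.mean(population)

rng = random.Random(0)
# Interval equal to the period: each sample sees only peaks or only non-peaks
aligned = [statistics.mean(systematic_sample(population, 7, rng)) for _ in range(5)]
# Interval that does not share a factor with the period: peaks are sampled in proportion
offset = [statistics.mean(systematic_sample(population, 10, rng)) for _ in range(5)]
```

With an interval of 7, every sample mean is either 10 or 100 and never close to the true mean; with an interval of 10, every sample mean matches the true mean in this constructed case.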
When to use StRS
• Stratified random sampling should be used
when:
1. The distribution of items is skewed
2. The variability is too large and separate entities (strata)
can be identified
• It allows a more representative sample to be drawn
– (i.e., if there are more individuals of a certain type
in the population, the sample has more of that type,
and if there are fewer of another type, there are
fewer of the latter type in the sample)
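The idea that the sample mirrors the make-up of the population corresponds to proportional allocation, which can be sketched as follows. The population of 80 pines and 20 oaks and the helper `stratified_sample` are assumptions for illustration; other allocation rules exist and are not shown here.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, n, seed=0):
    """Proportional-allocation stratified random sample (a sketch).
    Rounding can make the total differ slightly from n for awkward proportions."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)       # group units by stratum label

    sample = []
    for members in strata.values():
        n_h = round(n * len(members) / len(units))   # stratum's share of the sample
        sample.extend(rng.sample(members, n_h))      # SRS within the stratum
    return sample

# Hypothetical population: 80 pines and 20 oaks
trees = [("pine", i) for i in range(80)] + [("oak", i) for i in range(20)]
sample = stratified_sample(trees, stratum_of=lambda t: t[0], n=10)
counts = Counter(species for species, _ in sample)
```

Here the sample of 10 contains 8 pines and 2 oaks, the same 80/20 split as the population, whereas a simple random sample of 10 could by chance contain no oaks at all.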