Uploaded by Priyansh Kela

Sampling distributions

advertisement
Sampling and Sampling Distributions
Chap 7-1
Learning Objectives
In this chapter, you will learn:
 To distinguish between different survey
sampling methods
 The concept of the sampling distribution
 To compute probabilities related to the
sample mean and the sample proportion
 The importance of the Central Limit
Theorem
Chap 7-2
Why Sample?
 Selecting a sample is less time-consuming
than selecting every item in the population
(census).
 Selecting a sample is less costly than
selecting every item in the population.
 An analysis of a sample is less cumbersome
and more practical than an analysis of the
entire population.
Chap 7-3
Types of Samples
Samples
Non-Probability
Samples
Judgment
Quota
Chunk
Convenience
Probability Samples
Simple
Random
Stratified
Systematic
Cluster
Chap 7-4
Types of Samples
 In a nonprobability sample, items included
are chosen without regard to their probability
of occurrence.
 In convenience sampling, items are selected
based only on the fact that they are easy,
inexpensive, or convenient to sample.
 In a judgment sample, you get the opinions of
pre-selected experts in the subject matter.
Chap 7-5
Types of Samples
 In a probability sample, items in the sample
are chosen on the basis of known probabilities.
Probability Samples
Simple
Random
Systematic
Stratified
Cluster
Chap 7-6
Simple Random Sampling
 Every individual or item from the frame has an
equal chance of being selected
 Selection may be with replacement (selected
individual is returned to frame for possible
reselection) or without replacement (selected
individual isn’t returned to the frame).
 Samples obtained from table of random numbers
or computer random number generators.
Chap 7-7
Systematic Sampling
 Decide on sample size: n
 Divide frame of N individuals into groups of k
individuals: k=N/n
 Randomly select one individual from the 1st group
 Select every kth individual thereafter
For example, suppose you were sampling n = 9
individuals from a population of N = 72. So, the
population would be divided into k = 72/9 = 8 groups.
Randomly select a member from group 1, say
individual 3. Then, select every 8th individual
thereafter (i.e. 3, 11, 19, 27, 35, 43, 51, 59, 67)
Chap 7-8
Stratified Sampling
 Divide population into two or more subgroups
(called strata) according to some common
characteristic.
 A simple random sample is selected from each
subgroup, with sample sizes proportional to strata
sizes.
 Samples from subgroups are combined into one.
 This is a common technique when sampling
population of voters, stratifying across racial or
socio-economic lines.
Chap 7-9
Cluster Sampling
 Population is divided into several “clusters,” each
representative of the population.
 A simple random sample of clusters is selected.
 All items in the selected clusters can be used, or items
can be chosen from a cluster using another probability
sampling technique.
 A common application of cluster sampling involves
election exit polls, where certain election districts are
selected and sampled.
Chap 7-10
Comparing Sampling Methods
 Simple random sample and Systematic sample
 Simple to use
 May not be a good representation of the population’s
underlying characteristics
 Stratified sample
 Ensures representation of individuals across the entire
population
 Cluster sample
 More cost effective
 Less efficient (need larger sample to acquire the same
level of precision)
Chap 7-11
Evaluating Survey Worthiness
 What is the purpose of the survey?
 Were data collected using a non-probability




sample or a probability sample?
Coverage error – appropriate frame?
Nonresponse error – follow up
Measurement error – good questions elicit good
responses
Sampling error – always exists
Chap 7-12
Types of Survey Errors
 Coverage error or selection bias
 Exists if some groups are excluded from the frame
and have no chance of being selected
 Non response error or bias
 People who do not respond may be different from
those who do respond
 Sampling error
 Chance (luck of the draw) variation from sample to
sample.
 Measurement error
 Due to weaknesses in question design, respondent
error, and interviewer’s impact on the respondent
Chap 7-13
Sampling Distributions
 A sampling distribution is a distribution of all of the
possible values of a statistic for a given size sample
selected from a population.
 For example, suppose you sample 50 students from
your college regarding their mean GPA. If you
obtained many different samples of 50, you will
compute a different mean for each sample. We are
interested in the distribution of all potential mean GPA
we might calculate for any given sample of 50
students.
Chap 7-14
Sampling Distributions
Sample Mean Example
 Suppose your population (simplified) was
four people at your institution.
 Population size N=4
 Random variable, X, is age of individuals
 Values of X: 18, 20, 22, 24 (years)
Chap 7-15
Sampling Distributions
Sample Mean Example
Summary Measures for the Population Distribution:
X

μ
P(x)
i
N
18  20  22  24

 21
4
σ
 (X  μ)
i
N
2
 2.236
.3
.2
.1
0
18
20
22
A
B
C
24
x
D
Uniform Distribution
Chap 7-16
Sampling Distributions
Sample Mean Example
Now consider all possible samples of size n=2
1st
Obs.
16 Sample
Means
2nd Observation
18
20
22
24
18
18,18 18,20 18,22 18,24
20
20,18 29,20 20,22 20,24
22
22,18 22,20 22,22 22,24
24
24,18 24,20 24,22 24,24
16 possible samples
(sampling with
replacement)
1st
Obs.
2nd Observation
18
20
22
24
18
18
19
20
21
20
19
20
21
22
22
20
21
22
23
24
21
22
23
24
Chap 7-17
Sampling Distributions
Sample Mean Example
Sampling Distribution of All Sample Means
16 Sample
Means
1st
Obs
P(X)
2nd Observation
Sample Means
Distribution
.3
18
20
22
24
18
18
19
20
21
20
19
20
21
22
22
20
21
22
23
24
21
22
23
24
.2
.1
0
18 19 20 21 22 23 24
_
X
(no longer uniform)
Chap 7-18
Sampling Distributions
Sample Mean Example
Summary Measures of this Sampling Distribution:
μX
X


N
σX 

i
18  19  21    24

 21
16
2
(
X

μ
)
i

X
N
(18 - 21)2  (19 - 21)2    (24 - 21)2
 1.58
16
Chap 7-19
Sampling Distributions
Sample Mean Example
Population
N=4
μ  21
Sample Means Distribution
n=2
σ  2.236
μ X  21
_
P(X)
P(X)
.3
.3
.2
.2
.1
.1
0
0
18
A
20
B
22
C
24
D
X
σ X  1.58
18 19 20 21 22 23 24
_
X
Chap 7-20
Sampling Distributions
Standard Error
 Different samples of the same size from the same population
will yield different sample means.
 A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:
σ
σX 
n
 Note that the standard error of the mean decreases as the
sample size increases.
Chap 7-21
Sampling Distributions
Standard Error: Normal Pop.
 If a population is normal with mean μ and standard
deviation σ, the sampling distribution of the mean is also
normally distributed with
μX  μ
and
σ
σX 
n
(This assumes that sampling is with replacement or sampling
is without replacement from an infinite population)
Chap 7-22
Sampling Distributions
Z Value: Normal Pop.
 Z-value for the sampling distribution of the sample mean:
Z
where:
(X  μ X )
σX
(X  μ)

σ
n
X = sample mean
μ = population mean
σ = population standard deviation
n = sample size
Chap 7-23
Sampling Distributions
Properties: Normal Pop.
μx  μ
(i.e. x is
unbiased )
Normal Population
Distribution
μ
x
Normal Sampling
Distribution
(has the same mean)
μx
x
Chap 7-24
Sampling Distributions
Properties: Normal Pop.
 For sampling with replacement:


As n increases,
σ x decreases
Larger sample
size
Smaller sample
size
μ
x
Chap 7-25
Sampling Distributions
Non-Normal Population
 The Central Limit Theorem states that as the sample
size (that is, the number of values in each sample)
gets large enough, the sampling distribution of the
mean is approximately normally distributed. This is
true regardless of the shape of the distribution of the
individual values in the population.
 Measures of the sampling distribution:
μx  μ
σ
σx 
n
Chap 7-26
Sampling Distributions
Non-Normal Population
Population Distribution
μ
x
Sampling Distribution
(becomes normal as n increases)
Larger
sample
size
Smaller sample size
μx
x
Chap 7-27
Sampling Distributions
Non-Normal Population
 For most distributions, n > 30 will give a
sampling distribution that is nearly normal
 For fairly symmetric distributions, n > 15 will
give a sampling distribution that is nearly
normal
 For normal population distributions, the
sampling distribution of the mean is always
normally distributed
Chap 7-28
Sampling Distributions
Example
 Suppose a population has mean μ = 8 and standard
deviation σ = 3. Suppose a random sample of size n = 36
is selected.
 What is the probability that the sample mean is between
7.75 and 8.25?
 Even if the population is not normally distributed, the
central limit theorem can be used (n > 30).
 So, the distribution of the sample mean is approximately
normal with
μx  8
σx 
σ
3

 0.5
n
36
Chap 7-29
PROBLEMS
Chap 7-30
Chap 7-31
Chap 7-32
2
Chap 7-33
Chap 7-34
3
 How is the std. error of the mean affected by
increasing the sample size from 25 to 100
boxes?
Chap 7-35

Chap 7-36
4
 If you select a sample of 10 boxes, what is
the probability that the sample mean is below
365 grams?
Chap 7-37

Chap 7-38
Sampling Distributions
Example
First, compute Z values for both 7.75 and 8.25.
7.75 - 8
 0 . 5
3
36
8.25 - 8
Z
 0 .5
3
36
Z
Now, use the cumulative normal table to compute
the correct probability.
P(7.75  μ X  8.25)  P(-0.5  Z  0.5)  0.3830
Chap 7-39
Sampling Distributions
Example
Population
Distribution
= 2(.5000-.3085)
= 2(.1915)
μ8
= 0.3830
X
Sample
Sampling
Distribution
7.75
Standardized Normal
Distribution
μX  8
8.25
x
-0.5
μz  0
0.5
Z
Chap 7-40
Sampling Distributions
The Proportion
 The proportion of the population having some
characteristic is denoted π.
 Sample proportion ( p ) provides an estimate of π:
p
X
number of items in the sample having the characteri stic of interest

n
sample size
 0≤ p≤1
 p has a binomial distribution
(assuming sampling with replacement from a finite population or without
replacement from an infinite population)
Chap 7-41
Sampling Distributions
The Proportion
 Standard error for the proportion:
σp 
 (1   )
n
 Z value for the proportion:
p 
Z

σp
p 
 (1   )
n
Chap 7-42
Sampling Distributions
The Proportion: Example
 If the true proportion of voters who support
Proposition A is π = .4, what is the probability that
a sample of size 200 yields a sample proportion
between .40 and .45?
 In other words, if π = .4 and n = 200, what is
P(.40 ≤ p ≤ .45) ?
Chap 7-43
Sampling Distributions
The Proportion: Example
Find σ p:
Convert to
standardized
normal:
σp 
 (1   )
n
.4(1  .4)

 .03464
200
.45  .40 
 .40  .40
P(.40  p  .45)  P
Z

.03464 
 .03464
 P(0  Z  1.44)
Chap 7-44
Sampling Distributions
The Proportion: Example
Use cumulative normal table:
P(0 ≤ Z ≤ 1.44) = P(Z ≤ 1.44) – 0.5 = .4251
Standardized
Normal Distribution
Sampling Distribution
.4251
Standardize
.40
.45
p
0
1.44
Z
Chap 7-45
Chapter Summary
In this chapter, we have
 Described different types of samples
 Examined survey worthiness and types of survey
errors
 Introduced sampling distributions
 Described the sampling distribution of the mean
 For normal populations
 Using the Central Limit Theorem
 Described the sampling distribution of a proportion
 Calculated probabilities using sampling distributions
Chap 7-46
Download