Section 4.4 Sampling Distributions

Section 4.4
Sampling Distribution Models
and the Central Limit Theorem
Transition from Data Analysis and
Probability to Statistics
Probability:

From population to
sample (deduction)
Statistics:
 From sample to the
population (induction)
Sampling Distributions
Population parameter: a numerical
descriptive measure of a population.
(for example: µ, p (a population proportion);
the numerical value of a population
parameter is usually not known)
Example: µ = mean height of all NCSU students
p = proportion of Raleigh residents who favor
stricter gun control laws
Sample statistic: a numerical descriptive
measure calculated from sample data.
(e.g., x̄, s, p̂ (sample proportion))

Parameters; Statistics

In real life parameters of populations are
unknown and unknowable.
– For example, the mean height of US adult
(18+) men is unknown and unknowable


Rather than investigating the whole
population, we take a sample, calculate a
statistic related to the parameter of
interest, and make an inference.
The sampling distribution of the statistic
is the tool that tells us how close the value
of the statistic is to the unknown value of
the parameter.
DEF: Sampling Distribution

The sampling distribution of a sample
statistic calculated from a sample of n
measurements is the probability
distribution of values taken by the
statistic in all possible samples of size n
taken from the same population.
Based on all possible
samples of size n.


In some cases the sampling distribution can
be determined exactly.
In other cases it must be approximated by
using a computer to draw some of the
possible samples of size n and drawing a
histogram.
Pop size = 5, n = 2; # of possible samples: 5^2 = 25
Pop size = 6, n = 8; # of possible samples: 6^8 = 1,679,616
Pop size = 500,000, n = 10; # of possible samples: 500,000^10
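The counts above have the form N^n, which corresponds to ordered samples drawn with replacement. A minimal Python sketch of that counting follows; the helper name `num_samples` and the population labels are illustrative assumptions, not part of the slides.

```python
from itertools import product

def num_samples(pop_size: int, n: int) -> int:
    """Number of ordered samples of size n drawn with replacement."""
    return pop_size ** n

# Small case: list every sample explicitly and confirm the count.
population = [1, 2, 3, 4, 5]          # hypothetical labels for 5 population units
all_samples = list(product(population, repeat=2))

print(len(all_samples))               # 25
print(num_samples(6, 8))              # 1679616
print(num_samples(500_000, 10))       # far too many samples to enumerate
```

The last count is why, for realistic population sizes, the sampling distribution is approximated from a subset of the possible samples rather than enumerated.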
Sampling distribution of p̂, the
sample proportion; an example

If a coin is fair the probability of a head on
any toss of the coin is p = 0.5.
Imagine tossing this fair coin 5 times and
calculating the proportion p̂ of the 5 tosses
that result in heads (note that p̂ = x/5, where x
is the number of heads in 5 tosses).
Objective: determine the sampling
distribution of p̂, the proportion of heads in 5
tosses of a fair coin.
Sampling distribution of p̂ (cont.)
Step 1: The possible values of p̂ are 0/5 = 0,
1/5 = .2, 2/5 = .4, 3/5 = .6, 4/5 = .8, 5/5 = 1

Binomial probabilities p(x) for n = 5, p = 0.5:

x      0        1        2        3        4        5
p(x)   0.03125  0.15625  0.3125   0.3125   0.15625  0.03125

p̂      0        .2       .4       .6       .8       1
P(p̂)   .03125   .15625   .3125    .3125    .15625   .03125

The above table is the probability distribution of
p̂, the proportion of heads in 5 tosses of a fair
coin.
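The probability table for p̂ can be reproduced directly from the binomial formula. The sketch below is one way to do it in Python with exact fractions; the function name `phat_distribution` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def phat_distribution(n: int, p: Fraction) -> dict:
    """Exact sampling distribution of p-hat = x/n when x ~ Binomial(n, p)."""
    q = 1 - p
    return {Fraction(x, n): comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

dist = phat_distribution(5, Fraction(1, 2))
for phat, prob in dist.items():
    print(float(phat), float(prob))   # one row of the table per line
```

Using `Fraction` keeps every probability exact, so the six values sum to exactly 1.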
Sampling distribution of p̂ (cont.)

p̂      0        .2       .4       .6       .8       1
P(p̂)   .03125   .15625   .3125    .3125    .15625   .03125

E(p̂) = 0(.03125) + 0.2(.15625) + 0.4(.3125)
+ 0.6(.3125) + 0.8(.15625) + 1(.03125) = 0.5 = p
(the prob of heads)

$$\mathrm{Var}(\hat{p}) = (0-.5)^2(.03125) + (.2-.5)^2(.15625) + (.4-.5)^2(.3125) + (.6-.5)^2(.3125) + (.8-.5)^2(.15625) + (1-.5)^2(.03125) = .05$$

So SD(p̂) = sqrt(.05) = .2236
NOTE THAT $\mathrm{SD}(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(.5)(.5)}{5}} = \sqrt{.05} = .2236$
Expected Value and Standard
Deviation of the Sampling
Distribution of p̂

E(p̂) = p
$\mathrm{SD}(\hat{p}) = \sqrt{\dfrac{p(1-p)}{n}}$
where p is the “success” probability in the
sampled population and n is the sample
size
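As a check on these two formulas, the sketch below recomputes the mean and standard deviation of p̂ from its full distribution for the coin example (n = 5, p = 0.5) and compares the result with sqrt(p(1-p)/n). It is an illustration only; the variable names are assumptions.

```python
from math import comb, sqrt

n, p = 5, 0.5
q = 1 - p

# P(p-hat = x/n) for x = 0..n, from the binomial formula
dist = {x / n: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

mean = sum(v * pr for v, pr in dist.items())
var = sum((v - mean) ** 2 * pr for v, pr in dist.items())

print(mean)                          # expected value of p-hat
print(sqrt(var), sqrt(p * q / n))    # the two SD values should agree
```

Both standard deviations come out to about .2236, matching the hand computation for the coin example.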
Shape of Sampling Distribution of
p̂

The sampling distribution of p̂ is
approximately normal when the sample
size n is large enough. n large enough
means np ≥ 10 and n(1-p) ≥ 10
Shape of Sampling Distribution
of p̂
[Figure: population distribution with p = .65 (bar chart: P(0) = 0.35, P(1) = 0.65), and the sampling distribution of p̂ for samples of size n]
Example
8% of American Caucasian male
population is color blind.
 Use computer to simulate random
samples of size n = 1000

[Figure: histogram of phat's from simulated samples (2000 independent samples, each of size n = 1000 men); horizontal axis: phat; vertical axis: # of samples]
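A simulation of this kind can be sketched with the Python standard library. The seed and the names below are arbitrary assumptions; the values p = 0.08, n = 1000, and 2000 samples come from the example.

```python
import random
from statistics import mean, stdev

rng = random.Random(2024)      # arbitrary fixed seed so the run is repeatable
p, n, num_samples = 0.08, 1000, 2000

def sample_proportion() -> float:
    """Proportion of color-blind men in one simulated sample of n men."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

phats = [sample_proportion() for _ in range(num_samples)]

print(round(mean(phats), 4))   # should land close to p = 0.08
print(round(stdev(phats), 4))  # should land close to sqrt(p*q/n), about 0.0086
```

A histogram of `phats` would play the role of the figure: roughly bell-shaped and centered near 0.08.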
The sampling distribution model for a
sample proportion p̂
Provided that the sampled values are independent and the
sample size n is large enough, the sampling distribution of
p̂ is modeled by a normal distribution with E(p̂) = p and
standard deviation $\mathrm{SD}(\hat{p}) = \sqrt{pq/n}$, that is

$$\hat{p} \sim N\!\left(p,\ \sqrt{\frac{pq}{n}}\right)$$

where q = 1 - p and where n large enough means np ≥ 10
and nq ≥ 10.
The Central Limit Theorem will be a formal statement of
this fact.
Example: binge drinking by
college students




Study by Harvard School of Public Health:
44% of college students binge drink.
244 college students surveyed; 36% admitted
to binge drinking in the past week
Assume the value 0.44 given in the study is
the proportion p of college students that
binge drink; that is 0.44 is the population
proportion p
Compute the probability that in a sample of
244 students, 36% or less have engaged in
binge drinking.
Example: binge drinking by
college students (cont.)


Let p̂ be the proportion in a sample of 244
that engage in binge drinking.
We want to compute P(p̂ ≤ .36).
$E(\hat{p}) = p = .44;\quad \mathrm{SD}(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(.44)(.56)}{244}} = .032$
Since np = 244*.44 = 107.36 and nq =
244*.56 = 136.64 are both greater than 10,
we can model the sampling distribution of p̂
with a normal distribution, so …
Example: binge drinking by
college students (cont.)
$\hat{p} \sim N(.44,\ .032)$
So $P(\hat{p} \le .36) = P\!\left(\dfrac{\hat{p} - .44}{.032} \le \dfrac{.36 - .44}{.032}\right) = P(z \le -2.5) = .0062$
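The same calculation can be sketched with `statistics.NormalDist` from the Python standard library. The sketch computes the probability twice: once with the rounded SD of .032 used on the slide, and once with the unrounded SD, which gives a slightly different value.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.44, 244
q = 1 - p
assert n * p >= 10 and n * q >= 10     # large-enough-n condition from the slides

sd = sqrt(p * q / n)                   # about 0.0318 before rounding
prob_rounded = NormalDist(mu=p, sigma=0.032).cdf(0.36)   # slide's rounded SD
prob_unrounded = NormalDist(mu=p, sigma=sd).cdf(0.36)

print(round(prob_rounded, 4))          # 0.0062
print(round(prob_unrounded, 4))        # slightly smaller, since sd < 0.032
```

Either way the probability is well under 1%, so a sample proportion of 36% or less would be unusual if p really were 0.44.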
Example: snapchat
by college students




Recent scientifically valid survey: 77% of college
students use snapchat.
1136 college students surveyed; 75% reported that
they use snapchat.
Assume the value 0.77 given in the survey is the
proportion p of college students that use snapchat;
that is 0.77 is the population proportion p
Compute the probability that in a sample of 1136
students, 75% or less use snapchat.
Example: snapchat by college
students (cont.)


Let p̂ be the proportion in a sample of 1136
that use snapchat.
We want to compute P(p̂ ≤ .75).
$E(\hat{p}) = p = .77;\quad \mathrm{SD}(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(.77)(.23)}{1136}} = .0125$
Since np = 1136*.77 = 874.72 and nq =
1136*.23 = 261.28 are both greater than 10,
we can model the sampling distribution of p̂
with a normal distribution, so …
Example: snapchat by college
students (cont.)
$\hat{p} \sim N(.77,\ .0125)$
So $P(\hat{p} \le .75) = P\!\left(\dfrac{\hat{p} - .77}{.0125} \le \dfrac{.75 - .77}{.0125}\right) = P(z \le -1.6) = .0548$
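One way to see how well the normal model works here is to compare it with the exact binomial probability that at most 852 of the 1136 students (75%) use snapchat. The comparison is not part of the slides; it is a sketch that sums the binomial probabilities in log form to avoid numerical overflow.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

n, p = 1136, 0.77
q = 1 - p
cutoff = 852                     # 75% of 1136 students

def log_binom_pmf(k: int) -> float:
    """log P(X = k) for X ~ Binomial(n, p), computed in logs to avoid overflow."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(p) + (n - k) * log(q)

exact = sum(exp(log_binom_pmf(k)) for k in range(cutoff + 1))
approx = NormalDist(mu=p, sigma=sqrt(p * q / n)).cdf(0.75)

print(round(exact, 4), round(approx, 4))
```

The two values should be close but not identical; the gap reflects the fact that the normal curve is a continuous approximation to a discrete distribution.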
Another Population Parameter of
Frequent Interest: the Population
Mean µ
To estimate the unknown value of
µ, the sample mean x̄ is often used.
We need to examine the Sampling
Distribution of the Sample Mean x̄
(the probability distribution of all
possible values of x̄ based on a
sample of size n).
Example
Professor Stickler has a large statistics class
of over 300 students. He asked them the
ages of their cars and obtained the following
probability distribution:

x      2     3     4     5     6     7     8
p(x)   1/14  1/14  2/14  2/14  2/14  3/14  3/14

An SRS of size n = 2 is to be drawn from the population.
Find the sampling distribution of the
sample mean x̄ for samples of size n = 2.

Solution

7 possible ages (ages 2 through 8)

Total of 7^2 = 49 possible samples of size 2

All 49 possible samples with the
corresponding sample mean are on p. 47
in the coursepack and on the next slide.
All 49 possible samples of size n = 2
Population: ages of cars and their distribution

x      2     3     4     5     6     7     8
p(x)   1/14  1/14  2/14  2/14  2/14  3/14  3/14

Sample  2,2    2,3    2,4    2,5    2,6    2,7    2,8
x̄       2      2.5    3      3.5    4      4.5    5
Prob    1/196  1/196  2/196  2/196  2/196  3/196  3/196

Sample  3,2    3,3    3,4    3,5    3,6    3,7    3,8
x̄       2.5    3      3.5    4      4.5    5      5.5
Prob    1/196  1/196  2/196  2/196  2/196  3/196  3/196

Sample  4,2    4,3    4,4    4,5    4,6    4,7    4,8
x̄       3      3.5    4      4.5    5      5.5    6
Prob    2/196  2/196  4/196  4/196  4/196  6/196  6/196

Sample  5,2    5,3    5,4    5,5    5,6    5,7    5,8
x̄       3.5    4      4.5    5      5.5    6      6.5
Prob    2/196  2/196  4/196  4/196  4/196  6/196  6/196

Sample  6,2    6,3    6,4    6,5    6,6    6,7    6,8
x̄       4      4.5    5      5.5    6      6.5    7
Prob    2/196  2/196  4/196  4/196  4/196  6/196  6/196

Sample  7,2    7,3    7,4    7,5    7,6    7,7    7,8
x̄       4.5    5      5.5    6      6.5    7      7.5
Prob    3/196  3/196  6/196  6/196  6/196  9/196  9/196

Sample  8,2    8,3    8,4    8,5    8,6    8,7    8,8
x̄       5      5.5    6      6.5    7      7.5    8
Prob    3/196  3/196  6/196  6/196  6/196  9/196  9/196
Probability Distribution of the Sample
Mean Age of 2 Cars
Samples with the same value of x̄ are combined by adding their probabilities:

x̄      2      2.5    3      3.5    4       4.5     5       5.5     6       6.5     7       7.5     8
p(x̄)   1/196  2/196  5/196  8/196  12/196  18/196  24/196  26/196  28/196  24/196  21/196  18/196  9/196
Solution (cont.)

Probability distribution of x̄:

x̄      2      2.5    3      3.5    4       4.5     5       5.5     6       6.5     7       7.5     8
p(x̄)   1/196  2/196  5/196  8/196  12/196  18/196  24/196  26/196  28/196  24/196  21/196  18/196  9/196

This is the sampling distribution of x̄ because it
specifies the probability associated with each
possible value of x̄.
From the sampling distribution above
P(4 ≤ x̄ ≤ 6) = p(4) + p(4.5) + p(5) + p(5.5) + p(6)
= 12/196 + 18/196 + 24/196 + 26/196 + 28/196 = 108/196
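The enumeration of all 49 samples can be sketched in Python with exact fractions. The sketch treats the two draws as independent, as the probabilities in the 49-sample table do; the variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population distribution of car ages from the example.
pop = {2: Fraction(1, 14), 3: Fraction(1, 14), 4: Fraction(2, 14),
       5: Fraction(2, 14), 6: Fraction(2, 14), 7: Fraction(3, 14),
       8: Fraction(3, 14)}

xbar_dist = defaultdict(Fraction)
for (a, pa), (b, pb) in product(pop.items(), repeat=2):   # all 49 ordered samples
    xbar_dist[Fraction(a + b, 2)] += pa * pb               # independent draws

for xbar in sorted(xbar_dist):
    print(float(xbar), xbar_dist[xbar] * 196, "/196")

prob = sum(pr for xbar, pr in xbar_dist.items() if 4 <= xbar <= 6)
print(prob)    # 27/49, which equals 108/196
```

The 13 probabilities sum to 1, and the entry for x̄ = 8 is 9/196, since only the sample (8, 8) gives that mean and it has probability (3/14)(3/14).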
Expected Value and Standard
Deviation of the Sampling
Distribution of x̄
Example (cont.)
Population probability dist.

x      2     3     4     5     6     7     8
p(x)   1/14  1/14  2/14  2/14  2/14  3/14  3/14

E(X) = 2(1/14) + 3(1/14) + 4(2/14) + … + 8(3/14) = 5.714
Population mean: E(X) = µ = 5.714

Sampling dist. of x̄

x̄      2      2.5    3      3.5    4       4.5     5       5.5     6       6.5     7       7.5     8
p(x̄)   1/196  2/196  5/196  8/196  12/196  18/196  24/196  26/196  28/196  24/196  21/196  18/196  9/196

E(X̄) = 2(1/196) + 2.5(2/196) + 3(5/196) + 3.5(8/196) + 4(12/196) + 4.5(18/196) + 5(24/196)
+ 5.5(26/196) + 6(28/196) + 6.5(24/196) + 7(21/196) + 7.5(18/196) + 8(9/196) = 5.714
Mean of sampling distribution of x̄: E(X̄) = 5.714
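Example (cont.)
Population from which sample is selected:
$\mu = E(X) = 2(\tfrac{1}{14}) + 3(\tfrac{1}{14}) + 4(\tfrac{2}{14}) + \dots + 8(\tfrac{3}{14}) = 5.714$
$\sigma^2 = \mathrm{Var}(X) = 3.4898$
$\sigma = \mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{3.4898} = 1.8681$

Sampling dist. of X̄:
$E(\bar{X}) = 2(\tfrac{1}{196}) + 2.5(\tfrac{2}{196}) + \dots + 8(\tfrac{9}{196}) = 5.714$
$\mathrm{Var}(\bar{X}) = 1.7449 = \dfrac{3.4898}{2} = \dfrac{\mathrm{Var}(X)}{2}$
$\mathrm{SD}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sqrt{\dfrac{\mathrm{Var}(X)}{2}} = \dfrac{\mathrm{SD}(X)}{\sqrt{2}} = \dfrac{1.8681}{\sqrt{2}} = 1.3209$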
IMPORTANT
Numerical Summaries of the Sampling Distribution of X̄ are
Related to the Numerical Summaries of the Population X from
Which the Sample is Selected
$E(\bar{X}) = E(X)$ (the mean of the sampling distribution of X̄ is always
equal to the mean of the population from which the sample is selected)

$\mathrm{Var}(\bar{X}) = \dfrac{\mathrm{Var}(X)}{n}$
$\mathrm{SD}(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sqrt{\dfrac{\mathrm{Var}(X)}{n}} = \dfrac{\mathrm{SD}(X)}{\sqrt{n}}$
the standard deviation of the sampling distribution of X̄ is always
equal to the standard deviation of the population from which the sample
is selected, divided by the square root of the sample size n
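These relationships can be checked numerically on the car-age population. The sketch below builds the sampling distribution of the mean of n = 2 independent draws by enumeration and compares its mean and variance with the population values; the helper `mean_var` is an illustrative name.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

pop = {2: Fraction(1, 14), 3: Fraction(1, 14), 4: Fraction(2, 14),
       5: Fraction(2, 14), 6: Fraction(2, 14), 7: Fraction(3, 14),
       8: Fraction(3, 14)}
n = 2

def mean_var(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    m = sum(v * p for v, p in dist.items())
    return m, sum((v - m) ** 2 * p for v, p in dist.items())

# Sampling distribution of the mean of n independent draws, by enumeration.
xbar = {}
for draws in product(pop.items(), repeat=n):
    value = Fraction(sum(v for v, _ in draws), n)
    prob = Fraction(1)
    for _, p in draws:
        prob *= p
    xbar[value] = xbar.get(value, 0) + prob

mu, var = mean_var(pop)
mu_bar, var_bar = mean_var(xbar)
print(mu == mu_bar, var_bar == var / n)              # both comparisons are exact
print(round(sqrt(var), 4), round(sqrt(var_bar), 4))  # SD(X) and SD(X-bar)
```

Because the arithmetic is done with fractions, the equalities E(X̄) = E(X) and Var(X̄) = Var(X)/n hold exactly here rather than only up to rounding.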
Sampling Distribution of the
Sample Mean X̄: Example

An example
– A fair 6-sided die is thrown; let X represent
the number of dots showing on the upper
face.
– The probability distribution of X is

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

Population mean µ:
$\mu = E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots = 3.5$
Population variance σ²:
$\sigma^2 = V(X) = (1-3.5)^2(1/6) + (2-3.5)^2(1/6) + \dots = 2.92$
Suppose we want to estimate µ from the
mean x̄ of a sample of size n = 2.
What is the sampling distribution of x̄ in
this situation?

Sample        Mean    Sample        Mean    Sample        Mean
 1    1,1     1       13    3,1     2       25    5,1     3
 2    1,2     1.5     14    3,2     2.5     26    5,2     3.5
 3    1,3     2       15    3,3     3       27    5,3     4
 4    1,4     2.5     16    3,4     3.5     28    5,4     4.5
 5    1,5     3       17    3,5     4       29    5,5     5
 6    1,6     3.5     18    3,6     4.5     30    5,6     5.5
 7    2,1     1.5     19    4,1     2.5     31    6,1     3.5
 8    2,2     2       20    4,2     3       32    6,2     4
 9    2,3     2.5     21    4,3     3.5     33    6,3     4.5
10    2,4     3       22    4,4     4       34    6,4     5
11    2,5     3.5     23    4,5     4.5     35    6,5     5.5
12    2,6     4       24    4,6     5       36    6,6     6
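Note: $E(\bar{X}) = E(X)$ and $\mathrm{Var}(\bar{X}) = \dfrac{\mathrm{Var}(X)}{2}$
$E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5$
$V(\bar{x}) = (1.0-3.5)^2(1/36) + (1.5-3.5)^2(2/36) + \dots = 1.46$
[Figure: sampling distribution of x̄ for n = 2; horizontal axis: x̄ from 1 to 6.0 in steps of 0.5; vertical axis: probability from 1/36 to 6/36]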
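The die example with n = 2 is small enough to enumerate in a few lines. The following sketch lists the 36 equally likely samples, tabulates x̄, and recomputes the mean and variance; names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
samples = list(product(faces, repeat=2))          # 36 equally likely samples
counts = Counter(Fraction(a + b, 2) for a, b in samples)
dist = {xbar: Fraction(c, len(samples)) for xbar, c in counts.items()}

mean = sum(x * p for x, p in dist.items())
var = sum((x - mean) ** 2 * p for x, p in dist.items())
print(float(mean), round(float(var), 2))   # 3.5 1.46
```

The exact variance is 35/24, which is half of the population variance 35/12 (about 2.92).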
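n = 5: $E(\bar{X}) = 3.5$, $\mathrm{Var}(\bar{X}) = .5833\ \left(= \dfrac{\mathrm{Var}(X)}{5}\right)$
n = 10: $E(\bar{X}) = 3.5$, $\mathrm{Var}(\bar{X}) = .2917\ \left(= \dfrac{\mathrm{Var}(X)}{10}\right)$
n = 25: $E(\bar{X}) = 3.5$, $\mathrm{Var}(\bar{X}) = .1167\ \left(= \dfrac{\mathrm{Var}(X)}{25}\right)$
[Figure: sampling distributions of x̄ for n = 5, n = 10, and n = 25, each drawn on the scale 1 to 6]
Notice that Var(X̄) is smaller
than Var(X). The larger the sample
size the smaller is Var(X̄). Therefore,
x̄ tends to fall closer to µ, as the
sample size increases.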
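For n = 5, 10, and 25 there are too many samples to list, but the exact distribution of the total of n rolls can be built one roll at a time (a repeated convolution). The sketch below uses that approach, which is an illustrative choice rather than the method used on the slides, to confirm the three variances.

```python
from fractions import Fraction

def sum_distribution(n: int) -> dict:
    """Exact distribution of the total of n fair-die rolls, by repeated convolution."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0) + prob / 6
        dist = nxt
    return dist

def xbar_mean_var(n: int):
    """Mean and variance of the sample mean of n rolls."""
    dist = {Fraction(t, n): p for t, p in sum_distribution(n).items()}
    m = sum(x * p for x, p in dist.items())
    return m, sum((x - m) ** 2 * p for x, p in dist.items())

for n in (5, 10, 25):
    m, v = xbar_mean_var(n)
    print(n, float(m), round(float(v), 4))
```

The printed variances are 0.5833, 0.2917, and 0.1167, each equal to 35/12 divided by n, while the mean stays at 3.5.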
The variance of the sample mean is smaller
than the variance of the population.
[Figure: population with values 1, 2, 3; samples of two observations have means 1.5, 2, and 2.5]
Let us take samples
of two observations.
Compare the variability of the population
to the variability of the sample
mean.
Also,
Expected value of the population = (1 + 2 + 3)/3 = 2
Expected value of the sample mean = (1.5 + 2 + 2.5)/3 = 2
[Figure: unbiased vs. biased estimators relative to µ (panels labeled Unbiased, Biased, Biased; related terms: confidence, precision). Caption: "The central tendency is down the center". Source: BUS 350, Topic 6.1, Handout 6.1, Pages 1-2]
Consequences
1. E(x̄) = µ. This is why we use x̄ to estimate an
unknown population mean µ. The sampling
dist. of x̄ is "centered" at the parameter we are
trying to estimate.
2. $\mathrm{SD}(\bar{x}) = \dfrac{\mathrm{SD}(x)}{\sqrt{n}}$; the standard deviation of x̄ is
smaller than SD(x), the stand. dev. of the population from which the sample is taken. The
values of x̄ will cluster tightly around µ
when n is large.
A Billion Dollar Mistake





“Conventional” wisdom: smaller schools better
than larger schools
Late 90’s, Gates Foundation, Annenberg
Foundation, Carnegie Foundation
Among the 50 top-scoring Pennsylvania
elementary schools 6 (12%) were from the
smallest 3% of the schools
But …, they didn’t notice …
Among the 50 lowest-scoring Pennsylvania
elementary schools 9 (18%) were from the
smallest 3% of the schools
A Billion Dollar
Mistake (cont.)
Smaller schools have (by definition)
smaller n’s.
When n is small, $\mathrm{SD}(\bar{x}) = \dfrac{\mathrm{SD}(x)}{\sqrt{n}}$ is larger.
That is, the sampling distributions of
small school mean scores have larger
SD’s.
http://www.forbes.com/2008/11/18/gates-foundation-schools-opedcx_dr_1119ravitch.html
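The effect can be illustrated with a small simulation in which every student, in every school, is drawn from the same score distribution, so school size is the only thing that differs. All numbers below (score scale, school sizes, number of schools, seed) are hypothetical assumptions chosen for the sketch, not data from the Pennsylvania example.

```python
import random
from statistics import mean, pstdev

rng = random.Random(7)                       # arbitrary fixed seed
TRUE_MEAN, STUDENT_SD = 500, 100             # hypothetical score scale

def school_mean(n_students: int) -> float:
    """Mean score of one school whose students all come from the same population."""
    return mean(rng.gauss(TRUE_MEAN, STUDENT_SD) for _ in range(n_students))

# Hypothetical mix: 200 small schools (25 students) and 200 large ones (400 students).
schools = [("small", school_mean(25)) for _ in range(200)] + \
          [("large", school_mean(400)) for _ in range(200)]

ranked = sorted(schools, key=lambda s: s[1])
bottom50 = sum(1 for size, _ in ranked[:50] if size == "small")
top50 = sum(1 for size, _ in ranked[-50:] if size == "small")

sd_small = pstdev(m for size, m in schools if size == "small")
sd_large = pstdev(m for size, m in schools if size == "large")
print(top50, bottom50, round(sd_small, 1), round(sd_large, 1))
```

Under these assumptions small schools should dominate both the top 50 and the bottom 50, even though no school is actually better or worse than any other; the only cause is the larger SD of their mean scores.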

We Know More!

We know 2 parameters of the sampling
distribution of x̄:
E(x̄) = μ
$\mathrm{SD}(\bar{x}) = \dfrac{\mathrm{SD}(x)}{\sqrt{n}}$
The Central Limit Theorem tells
us about the shape of the distribution of x̄
when the sample size n is sufficiently large.