Lecture Notes Number 8 - Department of Statistics and Probability

UNIT 3
• YOUR FINAL
EXAMINATION
STUDY MATERIAL
STARTS FROM HERE
Copyright ©2011 Brooks/Cole, Cengage Learning
UNIT 3
Understanding
Sampling
Distributions:
Statistics as
Random Variables
EXPECTATIONS
• SAMPLING DISTRIBUTION FOR ONE SAMPLE
PROPORTION;
• SAMPLING DISTRIBUTION FOR THE
DIFFERENCE BETWEEN TWO SAMPLE
PROPORTIONS;
• SAMPLING DISTRIBUTION FOR ONE SAMPLE
MEAN;
• SAMPLING DISTRIBUTION FOR THE
DIFFERENCE BETWEEN TWO SAMPLE MEANS;
Parameters, Statistics,
and Statistical Inference
A statistic is a numerical value computed from a
sample. Its value may differ for different samples.
e.g. sample mean x̄, sample standard deviation s,
and sample proportion p̂.
A parameter is a numerical value associated with
a population. Considered fixed and unchanging.
e.g. population mean μ, population standard
deviation σ, and population proportion p.
Statistical Inference
Statistical Inference: making conclusions about
population parameters on basis of sample statistics.
Two most common procedures:
Confidence intervals: an interval of values that the
researcher is fairly sure will cover the true,
unknown value of the population parameter.
Hypothesis tests: uses sample data to attempt to
reject a hypothesis about the population.
Independent Samples
• Two samples are called independent
samples when the measurements in one
sample are not related to the
measurements in the other sample.
Sampling Distribution For One Sample
Proportion
PROBLEM FORMULATION: SUPPOSE THAT p IS AN
UNKNOWN PROPORTION OF ELEMENTS OF A
CERTAIN TYPE S IN A POPULATION.
EXAMPLES
• PROPORTION OF LEFT - HANDED PEOPLE;
• PROPORTION OF HIGH SCHOOL STUDENTS
WHO ARE FAILING A READING TEST;
• PROPORTION OF VOTERS WHO WILL VOTE
FOR MR. X.
More Examples
Estimating the proportion falling into a
category of a categorical variable.
Example research questions:
What proportion of American adults believe there is
extraterrestrial life? In what proportion of British
marriages is the wife taller than her husband?
Population parameter: p = proportion in the population
falling into that category.
Sample estimate: p̂ = proportion in the sample falling
into that category.
Estimation of p
• TO ESTIMATE p, WE SELECT A SIMPLE
RANDOM SAMPLE (SRS), OF SIZE SAY, n = 1000,
AND COMPUTE THE SAMPLE PROPORTION.
• SUPPOSE THE NUMBER OF THE TYPE WE ARE
INTERESTED IN, IN THIS SAMPLE OF n = 1000
IS x = 437. THEN THE SAMPLE PROPORTION
IS COMPUTED USING THE FORMULA
$\hat{p} = \dfrac{x}{n} = \dfrac{437}{1000} = 0.437$
WHAT IS THE ERROR OF
ESTIMATION?
• THAT IS, WHAT IS $\hat{p} - p$?
• WHAT MODEL CAN HELP US FIND THE
BEST ESTIMATE OF THE TRUE PROPORTION p?
• LET’S START THE ANALYSIS BY FIRST
ANSWERING THE SECOND QUESTION.
THE APPROACH
• SUPPOSE THAT WE TAKE A SECOND SAMPLE
OF SIZE 1000 AND COMPUTE P(HAT); THE NEW
ESTIMATE WILL MOST LIKELY BE DIFFERENT
FROM 0.437. NOW, TAKE A THIRD SAMPLE, A
FOURTH SAMPLE, AND SO ON UNTIL THE TWO
THOUSANDTH (2000TH) SAMPLE, EACH OF
SIZE 1000. WE WILL OBTAIN TWO THOUSAND
P(HATS), MANY OF THEM DIFFERENT FROM ONE
ANOTHER, AS ILLUSTRATED IN THE TABLE
BELOW.
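The repeated-sampling approach described above can be sketched as a small simulation. This is only an illustration: the true proportion is unknown in practice, so the value TRUE_P = 0.44 is an assumption, as are the helper name sample_proportion and the seed. Independent draws are used, which is reasonable when the population is much larger than the sample.

```python
import random
import statistics

TRUE_P = 0.44        # assumed population proportion (unknown in practice)
SAMPLE_SIZE = 1000   # n, as in the notes
NUM_SAMPLES = 2000   # number of repeated samples, as in the notes


def sample_proportion(p, n, rng):
    """Draw one random sample of size n and return p-hat = x / n."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n


rng = random.Random(8)  # fixed seed so the run is repeatable
p_hats = [sample_proportion(TRUE_P, SAMPLE_SIZE, rng) for _ in range(NUM_SAMPLES)]

print("first five p-hats:", p_hats[:5])
print("mean of the p-hats:", round(statistics.mean(p_hats), 4))
print("std. dev. of the p-hats:", round(statistics.stdev(p_hats), 4))
```

The individual p-hats vary from sample to sample, but their average should land close to the assumed population proportion.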
Sample Distribution Table
Sampling Distribution For One
Sample Proportion
Statistics as Random Variables
Each new sample taken → value of the sample statistic will change.
The distribution of possible values of a statistic for
repeated samples of the same size from a population
is called the sampling distribution of the statistic.
Many statistics of interest have sampling distributions
that are approximately normal distributions
Many Possible Samples
Four possible random samples of 25 people:
Sample 1: X =12, proportion with gene =12/25 = 0.48 or 48%.
Sample 2: X = 9, proportion with gene = 9/25 = 0.36 or 36%.
Sample 3: X = 10, proportion with gene = 10/25 = 0.40 or 40%.
Sample 4: X = 7, proportion with gene = 7/25 = 0.28 or 28%.
Note:
• Each sample gave a different answer, which did not
always match the population value of p.
• Although we cannot determine whether one sample will
accurately reflect the population, statisticians have
determined what to expect for most possible samples.
Shape of Histogram of the p(hats)
[Figure: histogram of the p(hats). Vertical axis: # of samples. Horizontal axis: p(hats), with p marked on the axis.]
REMARKS ON OBSERVING THE
HISTOGRAM
• THE HISTOGRAM ABOVE IS AN EXAMPLE OF
WHAT WE WOULD GET IF WE COULD SEE ALL
THE PROPORTIONS FROM ALL POSSIBLE
SAMPLES. THAT DISTRIBUTION HAS A SPECIAL
NAME. IT IS CALLED THE SAMPLING
DISTRIBUTION OF THE PROPORTIONS.
• OBSERVE THAT THE HISTOGRAM IS
UNIMODAL, ROUGHLY SYMMETRIC, AND IT’S
CENTERED AT p.
WHAT THEN IS THE APPROPRIATE
PROBABILITY MODEL?
• ANSWER: IT IS AMAZING AND FORTUNATE
THAT A NORMAL MODEL IS JUST THE RIGHT
ONE FOR THE HISTOGRAMS OF SAMPLE
PROPORTIONS.
• HOW GOOD IS THE NORMAL MODEL?
– IT IS GOOD IF THE FOLLOWING
ASSUMPTIONS AND CONDITIONS HOLD.
ASSUMPTIONS AND CONDITIONS
• ASSUMPTIONS
• INDEPENDENCE ASSUMPTION: THE SAMPLED
VALUES MUST BE INDEPENDENT OF EACH
OTHER.
• SAMPLE SIZE ASSUMPTION: THE SAMPLE SIZE,
n, MUST BE LARGE ENOUGH
• REMARK: ASSUMPTIONS ARE HARD – OFTEN
IMPOSSIBLE TO CHECK. THAT’S WHY WE
ASSUME THEM. GLADLY, SOME CONDITIONS
MAY PROVIDE INFORMATION ABOUT THE
ASSUMPTIONS.
CONDITIONS
• RANDOMIZATION CONDITION: THE DATA VALUES MUST
BE SAMPLED RANDOMLY. IF POSSIBLE, USE SIMPLE
RANDOM SAMPLING DESIGN TO SAMPLE THE
POPULATION OF INTEREST.
• 10% CONDITION: THE SAMPLE SIZE, n, MUST BE NO
LARGER THAN 10% OF THE POPULATION OF INTEREST.
• SUCCESS/FAILURE CONDITION: THE SAMPLE SIZE HAS
TO BE BIG ENOUGH SO THAT WE EXPECT AT LEAST 10
SUCCESSES AND AT LEAST 10 FAILURES. THAT IS,
$np \ge 10$ (SUCCESS)
$nq \ge 10$ (FAILURE), WHERE $q = 1 - p$
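The two numerical conditions can be checked mechanically. The sketch below is only illustrative: the function name check_proportion_conditions and the example numbers (n = 1000, p = 0.4, a population of 50,000) are assumptions, not values from the notes. The randomization condition cannot be verified by arithmetic, so it is not covered.

```python
def check_proportion_conditions(n, p, population_size=None):
    """Report whether the success/failure and 10% conditions hold.

    n: sample size; p: (assumed) population proportion;
    population_size: optional, needed only for the 10% condition.
    """
    q = 1 - p
    results = {
        # At least 10 expected successes and 10 expected failures.
        "success_failure": n * p >= 10 and n * q >= 10,
    }
    if population_size is not None:
        # Sample no larger than 10% of the population.
        results["ten_percent"] = n <= 0.10 * population_size
    return results


print(check_proportion_conditions(n=1000, p=0.4, population_size=50_000))
print(check_proportion_conditions(n=20, p=0.1))
```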
Sampling Distribution
for a Sample Proportion (The Central Limit
Theorem)
Let p = population proportion of interest
or binomial probability of success.
Let p̂ = sample proportion or proportion of successes.
If numerous random samples or repetitions of the same size n
are taken, the distribution of possible values of p̂ is
approximately a normal curve distribution with
• Mean = p
• Standard deviation = s.d.(p̂) = $\sqrt{\dfrac{p(1-p)}{n}}$
This approximate distribution is the sampling distribution of p̂.
Estimating the Population Proportion
from a Single Sample Proportion
In practice, we don’t know the true population proportion p,
so we cannot compute the standard deviation of p̂ ,
s.d.(p̂) = $\sqrt{\dfrac{p(1-p)}{n}}$.
In practice, we only take one random sample, so we only have
one sample proportion p̂ . Replacing p with p̂ in the standard
deviation expression gives us an estimate that is called the
standard error of p̂ .
s.e.(p̂) = $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$.
If p̂ = 0.39 and n = 2400, then the standard error is 0.01. So
the true proportion who support the candidate is almost surely
between 0.39 – 3(0.01) = 0.36 and 0.39 + 3(0.01) = 0.42.
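The arithmetic in this example can be reproduced with a few lines of Python. The function name standard_error_proportion is a hypothetical helper chosen for illustration; the numbers are the ones given above.

```python
import math


def standard_error_proportion(p_hat, n):
    """Standard error of p-hat: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


p_hat, n = 0.39, 2400
se = standard_error_proportion(p_hat, n)

# Interval of p-hat plus or minus 3 standard errors.
low, high = p_hat - 3 * se, p_hat + 3 * se

print(f"s.e.(p-hat) = {se:.4f}")
print(f"interval: {low:.2f} to {high:.2f}")
```

The unrounded standard error is slightly below 0.01, which is why the notes quote 0.01 after rounding.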
Standard Deviation and
Standard Error of a Statistic
• Standard deviation of a sampling distribution
measures the variation among possible values of the
sample statistic over all possible random samples.
We include the name of the statistic being studied,
e.g. the standard deviation of the mean.
• Standard error describes the estimated value of the
standard deviation of a statistic. We include the
name of the statistic, e.g. the standard error of the
mean.
More Examples for which Rule Applies
• Election Polls: to estimate proportion who favor a
candidate; units = all voters.
• Television Ratings: to estimate proportion of
households watching TV program; units = all households
with TV.
• Consumer Preferences: to estimate proportion of
consumers who prefer new recipe compared with old;
units = all consumers.
• Testing ESP: to estimate probability a person can
successfully guess which of 5 symbols on a hidden card;
repeatable situation = a guess.
Example:
Possible Sample Proportions
Favoring a Candidate
Suppose 40% of all voters favor Candidate C. Pollsters take a
sample of n = 2400 voters. The rule states that the sample proportion
who favor Candidate C will have approximately a normal distribution with
mean = p = 0.4 and s.d.(p̂) = $\sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.4(1-0.4)}{2400}} = 0.01$
Histogram at right
shows sample
proportions resulting
from simulating this
situation 400 times.
EXAMPLE FROM PRACTICE SHEET
• ASSUME THAT 30% OF STUDENTS AT A UNIVERSITY
WEAR CONTACT LENSES
• (A) WE RANDOMLY PICK 100 STUDENTS. LET p̂
REPRESENT THE PROPORTION OF STUDENTS IN THIS
SAMPLE WHO WEAR CONTACTS. WHAT’S THE
APPROPRIATE MODEL FOR THE DISTRIBUTION OF p̂ ?
SPECIFY THE NAME OF THE DISTRIBUTION, THE MEAN,
AND THE STANDARD DEVIATION. BE SURE TO VERIFY
THAT THE CONDITIONS ARE MET.
• (B) WHAT’S THE APPROXIMATE PROBABILITY THAT
MORE THAN ONE THIRD OF THIS SAMPLE WEAR
CONTACTS?
SOLUTION
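One way to work this exercise is sketched below; it is not the official solution from the practice sheet. It assumes the 100 students are a random sample and that the university has at least 1000 students, so that the 10% condition holds. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.30, 100

# Success/failure condition: about 30 expected successes and 70 failures.
expected_successes = n * p
expected_failures = n * (1 - p)
conditions_ok = expected_successes >= 10 and expected_failures >= 10

# (A) Normal model for p-hat: mean p, standard deviation sqrt(p(1-p)/n).
mean = p
sd = math.sqrt(p * (1 - p) / n)
model = NormalDist(mu=mean, sigma=sd)

# (B) Approximate probability that more than one third wear contacts.
prob_more_than_third = 1 - model.cdf(1 / 3)

print(f"conditions met: {conditions_ok}")
print(f"model: Normal(mean={mean}, sd={sd:.4f})")
print(f"P(p-hat > 1/3) is about {prob_more_than_third:.3f}")
```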
EXAMPLE FROM PRACTICE SHEET
• INFORMATION ON A PACKET OF SEEDS
CLAIMS THAT THE GERMINATION RATE
IS 92%. WHAT’S THE PROBABILITY
THAT MORE THAN 95% OF THE 160
SEEDS IN THE PACKET WILL
GERMINATE? BE SURE TO DISCUSS
YOUR ASSUMPTIONS AND CHECK THE
CONDITIONS THAT SUPPORT YOUR
MODEL.
SOLUTION
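A possible worked solution, offered as a sketch rather than the practice sheet's own answer. It assumes the seeds germinate independently of one another, that the packet behaves like a random sample of seeds, and that 160 seeds are fewer than 10% of all such seeds. Under those assumptions the normal model applies with p = 0.92 and n = 160:

```latex
\begin{align*}
np &= 160(0.92) = 147.2 \ge 10, \qquad nq = 160(0.08) = 12.8 \ge 10 \\
\text{s.d.}(\hat{p}) &= \sqrt{\frac{0.92 \times 0.08}{160}} \approx 0.0214 \\
z &= \frac{0.95 - 0.92}{0.0214} \approx 1.40 \\
P(\hat{p} > 0.95) &\approx P(Z > 1.40) \approx 0.08
\end{align*}
```

The expected number of failures (12.8) only just satisfies the success/failure condition, so the normal approximation is rougher here than in the earlier examples.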
Sampling Distribution for Difference
in Two Sample Proportions
For the populations:
p1 = population proportion for the first population.
p2 = population proportion for the second population.
Parameter: p1 – p2 = difference in population proportions.
For the samples:
p̂1 = sample proportion for sample from first population.
p̂2 = sample proportion for sample from second population.
Statistic: p̂1 – p̂2 = difference in sample proportions.
Familiar Examples
Estimating the difference between two populations with
regard to the proportion falling into a category of a
qualitative variable.
Example research questions:
1. How much difference is there between the proportions
that would quit smoking if taking the antidepressant
bupropion (Zyban) versus if wearing a nicotine patch?
2. How much difference is there between men who snore
and men who don’t snore with regard to the proportion
who have heart disease?
Population parameter: p1 – p2 = difference between the
two population proportions.
Sample estimate: p̂1 – p̂2 = difference between the two
sample proportions.
Conditions
Sampling distribution of difference in two independent sample
proportions is approximately normal when:
Condition 1: Sample proportions are available for two
independent samples, randomly selected from the two
populations of interest.
Condition 2: All of the quantities n1p1, n1(1 – p1), n2p2, and
n2(1 – p2) are at least 10. These quantities represent the
expected numbers of successes and failures in each sample.
Sampling Distribution for the
Difference in Two Sample Proportions
Mean = p1 – p2
Standard deviation = s.d.(p̂1 – p̂2) = $\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
When we don’t know the population proportions, we use the
sample proportions, resulting in:
Standard error = s.e.(p̂1 – p̂2) = $\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Example: Men, Women, Death Penalty
Suppose 37% of women and 27% of men oppose the death penalty,
p1 = .37 and p2 = .27, for a difference p1 – p2 = .37 – .27 = .10
For independent random samples of 1017 women and 885 men,
the sampling distribution of p̂1 – p̂2 is approximately normal with
mean .10 and standard deviation:
$\sqrt{\dfrac{.37(1-.37)}{1017} + \dfrac{.27(1-.27)}{885}} \approx .021$
Note: 2008 survey gave observed
difference of .36 – .285 = .075,
which is not unusual.
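The standard deviation and the "not unusual" judgment can be checked numerically. The sketch below uses a hypothetical helper, sd_diff_proportions, and expresses the observed difference as a number of standard deviations from the mean of .10; the choice to report a z-score is an illustration, not part of the original slide.

```python
import math


def sd_diff_proportions(p1, n1, p2, n2):
    """s.d. of p1-hat minus p2-hat for two independent samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


p1, n1 = 0.37, 1017   # women
p2, n2 = 0.27, 885    # men

sd = sd_diff_proportions(p1, n1, p2, n2)

# Observed difference from the 2008 survey, in standard-deviation units.
observed = 0.36 - 0.285
z = (observed - (p1 - p2)) / sd

print(f"s.d. = {sd:.3f}")
print(f"observed difference is {z:.2f} standard deviations from the mean")
```

A result a little more than one standard deviation below the mean is well within the range the Empirical Rule treats as ordinary.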
EXAMPLES FROM PRACTICE SHEET
Sampling Distribution for One
Sample Mean
• Suppose we want to estimate the mean weight loss
for all who attend clinic for 10 weeks. Suppose
(unknown to us) the distribution of weight loss is
approximately N(8 pounds, 5 pounds).
• We will take a random sample of 25 people from
this population and record for each X = weight
loss.
• We know the value of the sample mean will vary
for different samples of n = 25.
• What do we expect those means to be?
Familiar Examples
Estimating the mean of a quantitative variable.
Example research questions:
What is the mean time that college students watch TV
per day? What is the mean pulse rate of women?
Population parameter: μ = population mean for the
variable
Sample estimate: x̄ = sample mean for the variable
Many Possible Samples
Four possible random samples of 25 people:
Sample 1: Mean = 8.32 pounds, standard deviation = 4.74 pounds.
Sample 2: Mean = 6.76 pounds, standard deviation = 4.73 pounds.
Sample 3: Mean = 8.48 pounds, standard deviation = 5.27 pounds.
Sample 4: Mean = 7.16 pounds, standard deviation = 5.93 pounds.
Note:
• Each sample gave a different answer, which did not always
match the population mean of 8 pounds.
• Although we cannot determine whether one sample mean will
accurately reflect the population mean, statisticians have
determined what to expect for most possible sample means.
Example: Mean Hours of Sleep for
College Students
Survey of n = 190 college students.
“How many hours of sleep did you get last night?”
Sample mean = 7.1 hours.
If we repeatedly took
samples of 190 and each
time computed the sample
mean, the histogram of the
resulting sample mean
values would look like the
histogram at the right:
The Normal Curve Approximation Rule for
Sample Means (The Central Limit Theorem)
Let μ = mean for population of interest.
Let σ = standard deviation for population of interest.
Let x̄ = sample mean.
If numerous random samples of the same size n are taken, the
distribution of possible values of x̄ is approximately a normal
curve distribution with
• Mean = μ
• Standard deviation = s.d.(x̄) = $\dfrac{\sigma}{\sqrt{n}}$
This approximate distribution is the sampling distribution of x̄.
Standard Error of the Mean
In practice, the population standard deviation σ is rarely
known, so we cannot compute the standard deviation of x̄,
s.d.(x̄) = $\dfrac{\sigma}{\sqrt{n}}$.
In practice, we only take one random sample, so we only have
the sample mean x̄ and the sample standard deviation s.
Replacing σ with s in the standard deviation expression gives
us an estimate that is called the standard error of x̄.
s.e.(x̄) = $\dfrac{s}{\sqrt{n}}$.
For a sample of n = 25 weight losses,
the standard deviation is s = 4.74 pounds.
So the standard error of the mean is 0.948 pounds.
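The same calculation as a short Python sketch; the function name standard_error_mean is a hypothetical helper, and the inputs are the values quoted above.

```python
import math


def standard_error_mean(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


se = standard_error_mean(s=4.74, n=25)
print(f"s.e.(x-bar) = {se:.3f} pounds")
```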
ASSUMPTIONS AND CONDITIONS
• ASSUMPTIONS
• INDEPENDENCE ASSUMPTION: THE SAMPLED VALUES
MUST BE INDEPENDENT OF EACH OTHER
• SAMPLE SIZE ASSUMPTION: THE SAMPLE SIZE MUST BE
SUFFICIENTLY LARGE.
• REMARK: WE CANNOT CHECK THESE DIRECTLY, BUT
WE CAN THINK ABOUT WHETHER THE INDEPENDENCE
ASSUMPTION IS PLAUSIBLE.
CONDITIONS
• RANDOMIZATION CONDITION: THE DATA VALUES MUST BE
SAMPLED RANDOMLY, OR THE CONCEPT OF A SAMPLING
DISTRIBUTION MAKES NO SENSE. IF POSSIBLE, USE SIMPLE
RANDOM SAMPLING DESIGN TO OBTAIN THE SAMPLE.
• 10% CONDITION: WHEN THE SAMPLE IS DRAWN WITHOUT
REPLACEMENT (AS IS USUALLY THE CASE), THE SAMPLE SIZE,
n, SHOULD BE NO MORE THAN 10% OF THE POPULATION.
• LARGE ENOUGH SAMPLE CONDITION: IF THE POPULATION IS
UNIMODAL AND SYMMETRIC, EVEN A FAIRLY SMALL SAMPLE
IS OKAY. IF THE POPULATION IS STRONGLY SKEWED, IT CAN
TAKE A PRETTY LARGE SAMPLE TO ALLOW USE OF A
NORMAL MODEL TO DESCRIBE THE DISTRIBUTION OF SAMPLE
MEANS.
The Normal Curve Approximation
Rule for Sample Means
Normal Approximation Rule can be applied in two situations:
Situation 1: The population of measurements of interest is
bell-shaped and a random sample of any size is measured.
Situation 2: The population of measurements of interest is
not bell-shaped but a large random sample is measured.
Note: Difficult to get a Random Sample? Researchers usually
willing to use Rule as long as they have a representative sample
with no obvious sources of confounding or bias.
Examples for which Rule Applies
• Average Weight Loss: to estimate average weight
loss; weight assumed bell-shaped; population = all
current and potential clients.
• Average Age At Death: to estimate average age at
which left-handed adults (over 50) die; ages at death not
bell-shaped so need n ≥ 30; population = all left-handed
people who live to be at least 50.
• Average Student Income: to estimate mean monthly
income of students at university who work; incomes not
bell-shaped and outliers likely, so need large random
sample of students; population = all students at university
who work.
Example:
Hypothetical Mean
Weight Loss
Suppose the distribution of weight loss is approximately
N(8 pounds, 5 pounds) and we will take a random sample
of n = 25 clients. The rule states that the sample mean weight loss
will have a normal distribution with
mean = μ = 8 pounds and s.d.(x̄) = $\dfrac{\sigma}{\sqrt{n}} = \dfrac{5}{\sqrt{25}} = 1$ pound
Histogram at right shows
sample means resulting
from simulating this
situation 400 times.
Empirical Rule:
It is almost certain that
the sample mean will be
between 5 and 11 pounds.
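The simulation described here can be imitated in a few lines. This is a sketch, not the code behind the original histogram: the seed and variable names are arbitrary choices, and weight losses are drawn from an exactly normal population with the stated mean and standard deviation.

```python
import random
import statistics

MU, SIGMA = 8.0, 5.0   # population mean and s.d. of weight loss (pounds)
N, REPS = 25, 400      # sample size and number of simulated samples

rng = random.Random(25)  # fixed seed so the run is repeatable

# One sample mean per simulated sample of 25 clients.
sample_means = [
    statistics.mean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPS)
]

# Share of sample means inside the Empirical Rule range of 5 to 11 pounds.
share_in_range = sum(5 <= m <= 11 for m in sample_means) / REPS

print("mean of sample means:", round(statistics.mean(sample_means), 2))
print("s.d. of sample means:", round(statistics.stdev(sample_means), 2))
print("share between 5 and 11 pounds:", share_in_range)
```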
EXAMPLES FROM PRACTICE SHEET
Increasing the Size of the Sample
Suppose we take n = 100 people instead of just 25.
The standard deviation of the mean would be
s.d.(x̄) = $\dfrac{\sigma}{\sqrt{n}} = \dfrac{5}{\sqrt{100}} = 0.5$ pounds.
• For samples of n = 25,
sample means are likely to
range between 8 ± 3 pounds
=> 5 to 11 pounds.
• For samples of n = 100,
sample means are likely to
range only between 8 ± 1.5
pounds => 6.5 to 9.5 pounds.
Larger samples tend to result in more accurate estimates
of population values than smaller samples.
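The effect of sample size on the likely range can be tabulated directly from σ/√n. In this sketch the helper name likely_range is hypothetical, and "likely" is taken to mean within 3 standard deviations of the mean, matching the ranges quoted above.

```python
import math


def likely_range(mu, sigma, n):
    """Return (s.d. of x-bar, low, high) using mu +/- 3 s.d."""
    sd = sigma / math.sqrt(n)
    return sd, mu - 3 * sd, mu + 3 * sd


for n in (25, 100):
    sd, low, high = likely_range(mu=8.0, sigma=5.0, n=n)
    print(f"n = {n:3d}: s.d. = {sd} pounds, range {low} to {high} pounds")
```

Because the standard deviation depends on the square root of n, quadrupling the sample size only halves the width of the range.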
Sampling Distribution for Difference in
Two Sample Means
Let μ1 = population mean for first population.
Let μ2 = population mean for second population.
Parameter: μ1 – μ2 = difference in population means.
Let x̄1 = sample mean for sample from first population.
Let x̄2 = sample mean for sample from second population.
Statistic: x̄1 – x̄2 = difference in sample means.
Let σ1 = population standard deviation for first population.
Let σ2 = population standard deviation for second population.
Let s1 = sample std deviation for sample from first population.
Let s2 = sample std deviation for sample from second population.
Familiar Examples
Estimating the difference between two populations with
regard to the mean of a quantitative variable.
Example research questions:
1. How much difference is there in average weight loss for
those who diet compared to those who exercise to lose
weight?
2. How much difference is there between the mean foot
lengths of men and women?
Population parameter: μ1 – μ2 = difference between the
two population means.
Sample estimate: x̄1 – x̄2 = difference between the two
sample means.
Conditions for Sampling Distribution
of x̄1 – x̄2 to be Approx Normal
An important condition in this situation is that the two samples
must be independent. How?
• Take separate random samples from each of two populations
such as men and women.
• Take a random sample from a population and divide the
sample into two groups based on a categorical variable such
as smoker and nonsmoker.
• Randomly assign participants in a randomized experiment to
two treatment groups such as exercise or diet.
Conditions for Sampling Distribution
of x̄1 – x̄2 to be Approx Normal
In addition to independent samples,
one of the following two situations must hold:
Situation 1: The populations of measurements are both
bell-shaped and random samples of any size are measured.
Situation 2: Large random samples are measured from each
population. Arbitrary definition of large is both samples are
at least 30, but extreme outliers or extreme skewness in
either sample may require even larger samples.
Standard Error of the Mean Difference
Standard deviation of x̄1 – x̄2:
s.d.(x̄1 – x̄2) = $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
Standard error of x̄1 – x̄2:
s.e.(x̄1 – x̄2) = $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
The standard error is used to estimate the standard deviation.
EXAMPLES FROM PRACTICE SHEET
Example: Who Are the Speed Demons?
What’s the fastest you’ve ever driven a car? ____ mph.
Mean for 87 males = 107 mph, mean for 102 females = 88 mph.
Is this 19 mph difference large enough to convince us of a real difference
in populations? Suppose the standard deviation for each population of
speeds is known to be 15 mph. If the population means were equal, the
sampling distribution of x̄1 – x̄2 would be:
• Approximately normal
• mean = μ1 – μ2 = 0 mph
• s.d.(x̄1 – x̄2) = $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{15^2}{87} + \dfrac{15^2}{102}} \approx 2.2$ mph
Note: difference of 19 mph almost
impossible in this scenario. Thus,
true difference in population means
almost surely much greater than 0.
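A short numerical check of this example follows. It is a sketch under the scenario described on the slide (equal population means, known standard deviation of 15 mph in both populations); the variable names are illustrative.

```python
import math

sigma1 = sigma2 = 15.0    # population s.d. of speeds, assumed known (mph)
n1, n2 = 87, 102          # numbers of males and females surveyed

# s.d. of the difference in sample means for independent samples.
sd = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Observed difference, measured in standard deviations from a mean of 0.
observed_diff = 107 - 88
z = observed_diff / sd

print(f"s.d.(x1-bar - x2-bar) = {sd:.2f} mph")
print(f"observed difference is {z:.1f} standard deviations above 0")
```

A difference more than 8 standard deviations from 0 is far outside the range the Empirical Rule allows, which is the basis for the note above.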
The Central Limit Theorem (CLT)
The Central Limit Theorem states that if
n is sufficiently large, the sample means of
random samples from a population with
mean μ and finite standard deviation σ are
approximately normally distributed with
mean μ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$.
Technical Note:
The mean and standard deviation given in the CLT
hold for any sample size; it is only the “approximately
normal” shape that requires n to be sufficiently large.
Sampling Distribution for Any Statistic
Every statistic has a sampling distribution,
but the appropriate distribution may not always
be normal, or even approximately bell-shaped.
Construct an approximate sampling distribution
for a statistic by actually taking repeated samples
of the same size from a population and constructing
a relative frequency histogram for the values of the
statistic over the many samples.
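The procedure above can be sketched for a statistic whose sampling distribution is not bell-shaped. Everything specific in this example is an assumption chosen for illustration: the statistic (the sample maximum), the population (exponential with mean 1), the sample size of 10, the 5000 repetitions, and the bin width.

```python
import random
import statistics
from collections import Counter

SAMPLE_SIZE, REPS, BIN_WIDTH = 10, 5000, 0.5
rng = random.Random(3)  # fixed seed so the run is repeatable


def one_statistic():
    """Draw one sample and return its maximum (the statistic of interest)."""
    sample = [rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return max(sample)


values = [one_statistic() for _ in range(REPS)]

# Relative frequency histogram: count values per bin, divide by REPS.
bins = Counter(int(v // BIN_WIDTH) for v in values)
rel_freq = {b * BIN_WIDTH: count / REPS for b, count in sorted(bins.items())}

for left_edge, freq in rel_freq.items():
    bar = "#" * round(freq * 100)
    print(f"{left_edge:4.1f}-{left_edge + BIN_WIDTH:4.1f} {bar} {freq:.3f}")

print("mean:", round(statistics.mean(values), 2),
      "median:", round(statistics.median(values), 2))
```

The printed histogram has a long right tail, with the mean above the median, so a normal model would not describe this statistic well.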