Sampling Distributions and Confidence Interval Estimation

Study Guide and Student’s Solutions Manual
CHAPTER 5
Sampling Distributions and Confidence Interval Estimation
OBJECTIVES





To understand the concept of a sampling distribution
To understand and be able to apply the central limit theorem
To understand the foundation of statistical inference
To understand confidence interval estimates for the mean
To know how to determine the sample size necessary to obtain a desired confidence interval
OVERVIEW AND KEY CONCEPTS
Some Basic Concepts on Sampling Distribution


Why do we study sampling distributions?
• Sample statistics are used to estimate population parameters, but different samples
yield different estimates. The solution is to develop a theoretical basis for how a
statistic varies from sample to sample, namely its sampling distribution.
What is a sampling distribution?
• A sampling distribution is a theoretical probability distribution of a sample statistic.
A sample statistic (e.g., sample mean, sample proportion) is a random variable
because a different sample will yield a different value for the statistic and, hence, a
different estimate of the parameter of interest. The sampling distribution is the
probability distribution of the sample statistic that results from taking all possible
samples of the same size from the population.
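The definition above can be made concrete by listing every possible sample from a tiny population. The following is a minimal sketch, assuming a hypothetical four-value population and sampling with replacement (neither comes from the guide); the function name is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution_of_mean(population, n):
    """Exact sampling distribution of the sample mean.

    Enumerates every ordered sample of size n drawn with replacement
    (an assumption of this sketch) and returns {sample mean: probability}.
    """
    counts = Counter()
    for sample in product(population, repeat=n):
        counts[Fraction(sum(sample), n)] += 1
    total = len(population) ** n  # number of equally likely samples
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}


population = [1, 2, 3, 4]  # hypothetical population values
dist = sampling_distribution_of_mean(population, n=2)

for sample_mean, prob in dist.items():
    print(f"sample mean = {float(sample_mean):.1f}  probability = {prob}")
```

Each printed row is one possible value of the statistic together with its probability, which is exactly what a sampling distribution tabulates.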
Sampling Distribution of the Sample Mean


Population mean of the sample mean:
$\mu_{\bar{X}} = \mu$
• This is the unbiasedness property of the sample mean.
Standard error (population standard deviation) of the sample mean:
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
• The standard error of the sample mean is smaller than the standard deviation of the
population whenever n > 1.
• The larger the sample size, the smaller the standard error.
The central limit theorem: As the sample size (i.e., the number of observations in a
sample) gets large enough, the sampling distribution of the mean can be approximated by the
normal distribution regardless of the distribution of the individual values in the population.
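The two properties of the sample mean stated in this section (its mean equals the population mean, and its standard error equals the population standard deviation divided by the square root of n) can be checked by brute force. This is a sketch under assumptions: the population values are hypothetical, and samples are drawn with replacement so that every ordered sample is equally likely.

```python
import math
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3, 4]  # hypothetical population values
n = 2                      # sample size

# Mean of every ordered sample of size n, drawn with replacement.
sample_means = [mean(s) for s in product(population, repeat=n)]

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

mean_of_means = mean(sample_means)   # should equal mu (unbiasedness)
std_error = pstdev(sample_means)     # should equal sigma / sqrt(n)

print("mean of sample means:", mean_of_means, " population mean:", mu)
print("sd of sample means:  ", std_error, " sigma / sqrt(n):", sigma / math.sqrt(n))
```

Increasing `n` in this sketch makes the second printed pair smaller, in line with the statement that larger samples give a smaller standard error.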

The distribution of the sample mean
• If the population is normally distributed, the sampling distribution of the mean is
normally distributed regardless of the sample size.
• If the population distribution is fairly symmetrical, the sampling distribution of the
mean is approximately normal if the sample size is at least 15.
• For most population distributions, regardless of shape, the sampling distribution
of the mean is approximately normally distributed if the sample size is at least 30.
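One way to see the central limit theorem at work is to simulate sample means from a strongly skewed population and watch the skewness of those means shrink as n grows. The sketch below assumes an exponential population with rate 1 and a fixed random seed; both are illustrative choices, not part of the guide.

```python
import random


def skewness(values):
    """Moment-based skewness: 0 for a perfectly symmetric distribution."""
    k = len(values)
    m = sum(values) / k
    var = sum((v - m) ** 2 for v in values) / k
    third = sum((v - m) ** 3 for v in values) / k
    return third / var ** 1.5


def simulated_sample_means(n, reps=20_000, seed=0):
    """Means of `reps` samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]


skew_by_n = {n: skewness(simulated_sample_means(n)) for n in (1, 5, 30)}

for n, skew in skew_by_n.items():
    print(f"n = {n:2d}  skewness of sample means = {skew:.2f}")
```

With n = 1 the "sample means" are just the raw, right-skewed observations; by n = 30 the skewness should be much closer to 0, which is what an approximately normal shape requires.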
Why We Need Confidence Interval Estimates in Addition to Point Estimates



Confidence interval estimates take into consideration variation in sample statistics from
sample to sample.
They provide information about closeness to unknown population parameters.
Interval estimates are always stated with a level of confidence, which is less than 100%.
Confidence Interval Estimate for the Mean when the Population Variance is Known



Assumptions:
• Population variance $\sigma^2$ is known.
• Population is normally distributed or the sample size is large.
Point estimate for the population mean $\mu$: $\bar{X}$
Confidence interval estimate:
$\bar{X} \pm Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
where $Z_{\alpha/2}$ is the value corresponding to a cumulative area of $1 - \alpha/2$
from a standardized normal distribution, i.e., a right-tail probability of $\alpha/2$.
Elements of a confidence interval estimate
• Level of confidence: Measures how confident we are that the interval will
contain the unknown population parameter.
• Precision (range): Represents the closeness to the unknown parameter.
• Cost: The cost required to obtain a sample of size n.
Factors affecting interval width (precision)
• Data variation measured by $\sigma^2$: The larger the $\sigma^2$, the wider the interval
estimate.
• Sample size n: The larger the sample size, the narrower the interval estimate.
• The level of confidence $100(1-\alpha)\%$: The higher the level of confidence, the
wider the interval estimate.
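The known-variance interval and the three width factors listed above can be tried out numerically. The following is a minimal sketch that uses Python's standard-library `statistics.NormalDist` for the critical value; the function names and all input numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- Z * sigma / sqrt(n), with sigma assumed known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # cumulative area 1 - alpha/2
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def width(interval):
    lower, upper = interval
    return upper - lower


# Hypothetical inputs: sample mean 50, known sigma 10, n = 25, 95% confidence.
base = z_interval(xbar=50.0, sigma=10.0, n=25)
more_variation = z_interval(xbar=50.0, sigma=20.0, n=25)
larger_sample = z_interval(xbar=50.0, sigma=10.0, n=100)
more_confidence = z_interval(xbar=50.0, sigma=10.0, n=25, confidence=0.99)

print("base interval:        ", base)
print("larger sigma width:   ", width(more_variation))
print("larger n width:       ", width(larger_sample))
print("higher confidence width:", width(more_confidence))
```

Comparing each width with `width(base)` shows the direction of each effect: more variation and more confidence widen the interval, while a larger sample narrows it.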
Interpretation of a $100(1-\alpha)\%$ confidence interval estimate: If all possible samples of
size n are taken and an interval is constructed around each sample mean, $100(1-\alpha)\%$ of
the intervals contain the true population mean and only $100\alpha\%$ of them do not.
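This long-run interpretation can be illustrated by simulation: draw many samples, build an interval from each, and count how often the true mean is covered. The sketch below assumes a normal population with hypothetical parameters (mean 100, standard deviation 15), a known sigma, and a fixed seed; it approximates "all possible samples" with a large number of random ones.

```python
import math
import random
from statistics import NormalDist


def coverage_rate(mu=100.0, sigma=15.0, n=25, confidence=0.95,
                  reps=5_000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    covered = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            covered += 1
    return covered / reps


rate = coverage_rate()
print(f"share of 95% intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, not exactly on it, because only a finite number of samples is simulated.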
Confidence Interval Estimate for the Mean when the Population Variance is Unknown
Assumptions:
• Population variance $\sigma^2$ is unknown.
• Population is normally distributed or the sample size is large.
Confidence interval estimate:
$\bar{X} \pm t_{\alpha/2,\,n-1} \dfrac{S}{\sqrt{n}}$
where S is the sample standard deviation and $t_{\alpha/2,\,n-1}$ is the value
corresponding to a cumulative area of $1 - \alpha/2$ from a Student’s t distribution
with n - 1 degrees of freedom, i.e., a right-tail probability of $\alpha/2$.
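A small worked example may help with the unknown-variance case. Python's standard library has no t quantile function, so this sketch takes the critical value as an input that the reader looks up in a t table; the data values are hypothetical, and 2.3060 is the usual table entry for a right-tail area of 0.025 with 8 degrees of freedom.

```python
import math
from statistics import mean, stdev


def t_interval(sample, t_crit):
    """Interval xbar +/- t_crit * S / sqrt(n), with t_crit taken from a t table."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # sample standard deviation S (divisor n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical measurements, n = 9, so n - 1 = 8 degrees of freedom.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5]
T_CRIT_95_DF8 = 2.3060  # table value for right-tail area 0.025, 8 df

lower, upper = t_interval(sample, T_CRIT_95_DF8)
print(f"95% confidence interval for the mean: ({lower:.3f}, {upper:.3f})")
```

Because the t critical value exceeds the corresponding Z value of about 1.96, this interval is wider than one built as if the sample standard deviation were the known sigma.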
Confidence Interval Estimate for the Proportion

Assumptions:
• Two categorical outcomes
• Population follows a binomial distribution
• Normal approximation can be used if $np \ge 5$ and $n(1-p) \ge 5$.

Point estimate for the population proportion of success $p$: $p_S$

Confidence interval estimate:

$p_S \pm Z_{\alpha/2} \sqrt{\dfrac{p_S(1-p_S)}{n}}$
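The proportion interval can be sketched in a few lines. This example assumes hypothetical survey counts and checks the rule of thumb using the sample proportion in place of the unknown p, which is a practical substitution rather than something stated in the guide.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Interval p_s +/- Z * sqrt(p_s * (1 - p_s) / n) for a proportion."""
    p_s = successes / n
    # Rule-of-thumb check with p_s standing in for the unknown p (assumption).
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation may be unreliable for these counts")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s - half_width, p_s + half_width


# Hypothetical data: 60 successes in a sample of 200.
p_lower, p_upper = proportion_interval(successes=60, n=200)
print(f"95% confidence interval for p: ({p_lower:.4f}, {p_upper:.4f})")
```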
Determining Sample Size

The sample size needed when estimating the population mean:

$n = \dfrac{Z^2 \sigma^2}{e^2}$
where e is the acceptable sampling error and $\sigma^2$ is estimated from past
data, by an educated guess, or from the data obtained in a pilot study.
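As a quick illustration of the sample-size formula for the mean, the sketch below plugs in hypothetical planning values. Rounding the result up to the next whole number is a common convention assumed here so that the sampling error does not exceed e; the function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence=0.95):
    """n = Z^2 * sigma^2 / e^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / e) ** 2)


# Hypothetical planning values: sigma guessed as 15, acceptable error 5.
n_needed = sample_size_for_mean(sigma=15.0, e=5.0)
n_tighter = sample_size_for_mean(sigma=15.0, e=2.5)  # half the error

print("n for e = 5.0:", n_needed)
print("n for e = 2.5:", n_tighter)
```

Halving the acceptable error roughly quadruples the required sample size, since e enters the formula squared.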
The sample size needed when estimating the population proportion:

$n = \dfrac{Z^2 p(1-p)}{e^2}$
where p is estimated from past information or by an educated guess; otherwise
use 0.5, the value that gives the largest required sample size.
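The same calculation for a proportion is sketched below, with p defaulting to 0.5 when no prior estimate is available. The inputs are hypothetical, and rounding up is again an assumed convention.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(e, p=0.5, confidence=0.95):
    """n = Z^2 * p * (1 - p) / e^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


# Hypothetical target: sampling error of 0.05 at 95% confidence.
n_conservative = sample_size_for_proportion(e=0.05)           # p = 0.5
n_with_prior = sample_size_for_proportion(e=0.05, p=0.3)      # prior guess

print("n with p = 0.5:", n_conservative)
print("n with p = 0.3:", n_with_prior)
```

Any prior estimate other than 0.5 makes p(1 - p) smaller, so the default is the cautious choice when nothing is known about p.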