Chapter 9 - Estimation Using a Single Sample

The objective of inferential statistics is to use sample data to estimate some unknown
characteristic of the corresponding population, such as mean (μ) or proportion (π). Two
methods of accomplishing this are point estimation and interval estimation.
9.1 Point Estimation
Point Estimate – a single number, based on sample data, that represents a plausible value of a
population characteristic.

• When estimating a population proportion π, there is one obvious choice: the sample proportion, p.
• When estimating a population mean μ, you can use the sample mean (X̄), the sample
median, or a trimmed mean. To decide which to use, look at the sampling distribution of
each statistic and choose the one that tends to give the most accurate estimate.
Unbiased statistic – a statistic whose mean value is equal to the value of the population
characteristic being estimated.
• When given a choice of several unbiased statistics, use the one with the smallest
standard deviation.
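To see why the sampling distribution matters when choosing a point estimate, here is a minimal simulation sketch. It assumes a normal population with a made-up mean of 50 and standard deviation of 10, samples of size 25, and a 10% trimmed mean; the function names are illustrative only. For each statistic it reports the center and spread of the estimates over many samples.

```python
import random
import statistics

def trimmed_mean(values, trim_fraction=0.1):
    """Mean after dropping the smallest and largest trim_fraction of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return statistics.fmean(kept)

def simulate_estimators(pop_mean=50.0, pop_sd=10.0, n=25, reps=5000, seed=1):
    """Return {statistic name: (mean of estimates, sd of estimates)}."""
    rng = random.Random(seed)
    estimates = {"mean": [], "median": [], "trimmed mean": []}
    for _ in range(reps):
        # Assumed population: normal with the given mean and standard deviation.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        estimates["mean"].append(statistics.fmean(sample))
        estimates["median"].append(statistics.median(sample))
        estimates["trimmed mean"].append(trimmed_mean(sample))
    return {name: (statistics.fmean(vals), statistics.stdev(vals))
            for name, vals in estimates.items()}

results = simulate_estimators()
for name, (center, spread) in results.items():
    print(f"{name:13s} center = {center:6.2f}  sd = {spread:5.2f}")
```

Under these assumptions all three statistics should center near 50, and the sample mean should show the smallest spread. For a skewed or heavy-tailed population the ranking can differ.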
9.2 Large-Sample Confidence Interval for a Population Proportion
Confidence Interval (CI) – an interval of plausible values for an unknown population
characteristic (in this section, a population proportion). It is constructed using sample data so
that, with a chosen degree of confidence (the confidence level), the value of the characteristic
will be captured between the lower and upper endpoints of the interval. The general form is
statistic ± margin of error.
Confidence Level – the success rate of the method used to construct the confidence interval.
Note: this is confidence in the method used to construct the interval, not confidence in the
specific interval.
• Usual choices for confidence levels are:
o 90%, which has an area of .05 in each tail and a z* of 1.645
o 95%, which has an area of .025 in each tail and a z* of 1.96
o 99%, which has an area of .005 in each tail and a z* of 2.576
Ex. If the method were used to generate an interval over and over with different samples, a
95% confidence level would mean that, in the long run, 95% of the resulting intervals would
capture the true value of the characteristic being estimated.
• As the confidence level increases, the confidence interval gets wider, and the precision
decreases.
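The "success rate of the method" idea can be checked with a small simulation. The sketch below assumes a true proportion of 0.4, samples of size 200, and the large-sample interval p ± z*·√(p(1 – p)/n); these values and the function name are made up for illustration. It counts how often the interval captures the true proportion.

```python
import math
import random

def coverage_rate(true_pi=0.4, n=200, z_star=1.96, reps=2000, seed=7):
    """Fraction of simulated intervals p +/- z* sqrt(p(1-p)/n) that contain true_pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of n yes/no outcomes and compute the sample proportion.
        successes = sum(1 for _ in range(n) if rng.random() < true_pi)
        p = successes / n
        margin = z_star * math.sqrt(p * (1 - p) / n)
        if p - margin <= true_pi <= p + margin:
            hits += 1
    return hits / reps

rate = coverage_rate()
print(f"Fraction of intervals capturing pi: {rate:.3f}")
```

With z* = 1.96 the fraction should land close to 0.95. Any single interval either contains π or does not; the 95% describes the long-run behavior of the method.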
Methods for Constructing Confidence Intervals
Large-Sample CI for a population proportion π – When n is sufficiently large, the statistic p is
unbiased and has a sampling distribution that is approximately normal, with mean π and
standard deviation

$$\sqrt{\frac{\pi(1-\pi)}{n}}$$

Because π is unknown, p is used in its place when computing the interval, so the CI is

$$p \pm z^* \sqrt{\frac{p(1-p)}{n}}$$
Conditions:
• Simple random sample
• Approximately normal: np ≥ 10 and n(1 – p) ≥ 10
• n is less than 10% of the population size (when sampling without replacement)
For example:
For a 95% CI, find .975 in the table of standard normal (z) curve areas to get a z* value of
1.96. About 95% of the values of p are within ± 1.96 standard deviations of the mean π, so π is
within the interval

$$p - 1.96\sqrt{\frac{p(1-p)}{n}} \quad \text{to} \quad p + 1.96\sqrt{\frac{p(1-p)}{n}}, \qquad \text{or} \qquad p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$

For a 99% CI, you would find .995 in the table of standard normal curve areas and z* = 2.576 (often rounded to 2.58).
For a 90% CI, you would find .95 in the table of standard normal curve areas and z* = 1.645.
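The table lookup and the interval calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: `z_critical` and `proportion_ci` are hypothetical helper names, the standard library's `NormalDist` stands in for the z table, and the sample (120 successes out of 400) is made up. The random-sample and 10% conditions cannot be checked from the numbers alone, so only the normality check is coded.

```python
import math
from statistics import NormalDist

def z_critical(confidence_level):
    """z* such that the central area under the standard normal curve is confidence_level."""
    tail_area = (1 - confidence_level) / 2
    # Same lookup as finding 1 - tail_area (e.g. .975) in a z table.
    return NormalDist().inv_cdf(1 - tail_area)

def proportion_ci(successes, n, confidence_level=0.95):
    """Large-sample CI for a population proportion: p +/- z* sqrt(p(1-p)/n)."""
    p = successes / n
    # Normality condition: np >= 10 and n(1 - p) >= 10.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("np and n(1 - p) should both be at least 10")
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z_critical(confidence_level) * standard_error
    return p - margin, p + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")

# Hypothetical sample: 120 successes in n = 400 trials, so p = 0.30.
low, high = proportion_ci(120, 400)
print(f"95% CI for pi: ({low:.4f}, {high:.4f})")
```

For the made-up sample the 95% interval runs from roughly 0.255 to 0.345.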
Standard Error – the estimated standard deviation of a statistic. For the sample proportion p,
the standard error is √(p(1 – p)/n).
Choosing a sample size
If the sampling distribution is approximately normal, then the bound on error of estimation
(B) is 1.96 · standard error of the statistic for a 95% confidence interval.
The sample size (n) required to estimate a population proportion π to within an amount B with
95% confidence is

$$n = \pi(1-\pi)\left(\frac{1.96}{B}\right)^2$$
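A short sketch of the sample-size calculation for a proportion follows. Since π is unknown before sampling, the code assumes a planning value `pi_guess`; the default of 0.5 is the conservative choice because it makes π(1 – π) as large as possible. The function name and the example bounds are illustrative, and the result is rounded up to a whole number of observations.

```python
import math

def sample_size_for_proportion(bound, pi_guess=0.5, z_star=1.96):
    """Smallest whole n with pi(1 - pi)(z*/B)^2 <= n, using a planning value for pi."""
    n = pi_guess * (1 - pi_guess) * (z_star / bound) ** 2
    return math.ceil(n)  # always round up so the bound B is still met

n_5_points = sample_size_for_proportion(bound=0.05)  # within 0.05, 95% confidence
n_3_points = sample_size_for_proportion(bound=0.03)  # within 0.03, 95% confidence
print(n_5_points, n_3_points)  # prints: 385 1068
```

Tightening the bound from 0.05 to 0.03 nearly triples the required sample size, because n grows with 1/B².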
9.3 Confidence Interval for a Population Mean
The general formula for a confidence interval for a population mean (μ) when
o X̄ is the sample mean from a random sample
o Sampling distribution is approximately normal (given, graph, or n ≥ 30)
o The population standard deviation (σ) is known
is

$$\bar{X} \pm z^* \left(\frac{\sigma}{\sqrt{n}}\right)$$
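As a quick numerical illustration of the σ-known interval, the sketch below uses made-up values (sample mean 72, σ = 8, n = 64); `z_interval_for_mean` is a hypothetical helper name.

```python
import math

def z_interval_for_mean(sample_mean, sigma, n, z_star=1.96):
    """CI for a population mean when sigma is known: x-bar +/- z* (sigma / sqrt(n))."""
    margin = z_star * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: x-bar = 72, sigma = 8, n = 64, 95% confidence.
low, high = z_interval_for_mean(sample_mean=72.0, sigma=8.0, n=64)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")  # prints: 95% CI for mu: (70.04, 73.96)
```

Here σ/√n = 8/8 = 1, so the margin of error is 1.96.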
If σ is unknown, we must use the sample data to estimate σ, and the result is a different
standardized variable denoted by t:

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

which has more variability than z, so we must look at t distributions.
t distributions – are distinguished by the number of degrees of freedom (df) and have the
properties:
o The t distribution corresponding to any fixed number of degrees of freedom is bell
shaped and centered at zero (just like the standard normal (z) distribution).
o Each t distribution is more spread out than the standard normal (z) distribution.
o As the number of degrees of freedom increases, the spread of the t distribution
decreases.
o As the number of degrees of freedom increases, the corresponding sequence of
t distributions approaches the standard normal (z) distribution.
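These properties can be seen in a simulation that uses only the standard library. The sketch assumes samples drawn from a standard normal population (so μ = 0) and compares how often |t| exceeds 1.96 for a small and a larger sample size; for a z variable that fraction would be about 0.05. The function name, sample sizes, and seed are illustrative choices.

```python
import math
import random

def tail_fraction(n, cutoff=1.96, reps=20000, seed=3):
    """Fraction of simulated t statistics with |t| > cutoff for samples of size n."""
    rng = random.Random(seed)
    beyond = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # population mean is 0
        x_bar = sum(sample) / n
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t = (x_bar - 0.0) / (s / math.sqrt(n))
        if abs(t) > cutoff:
            beyond += 1
    return beyond / reps

small_df = tail_fraction(n=6)    # df = 5
large_df = tail_fraction(n=30)   # df = 29
print(f"df = 5:  P(|t| > 1.96) is about {small_df:.3f}")
print(f"df = 29: P(|t| > 1.96) is about {large_df:.3f}")
```

The fraction beyond 1.96 should be noticeably larger for 5 degrees of freedom than for 29, and both should sit above the 0.05 that the z curve gives, which matches the "more spread out, approaching z" description.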
The probability distribution of the standardized variable

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

is the t distribution with degrees of freedom = n – 1.
A One-Sample t Confidence Interval for μ has formula:

$$\bar{X} \pm (t \text{ critical value})\left(\frac{s}{\sqrt{n}}\right)$$

where: X̄ is the mean of a random sample, the population is normally distributed or n ≥ 30,
σ is unknown, and the t critical value is based on n – 1 degrees of freedom.
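A minimal sketch of the one-sample t interval is given here. Python's standard library has no t table, so the critical value is passed in as a number read from a table; 2.262 is the 95% value for 9 degrees of freedom. The ten data values are made up for illustration, and `t_interval` is a hypothetical helper name.

```python
import math
import statistics

def t_interval(data, t_critical):
    """One-sample t CI: x-bar +/- (t critical value) * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_critical * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Made-up sample of n = 10 measurements, so df = 9.
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
low, high = t_interval(data, t_critical=2.262)  # 95% table value for df = 9
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

For this sample x̄ = 12.1 and s is about 0.258, so the interval is roughly 11.915 to 12.285. With a sample this small, the normality condition should be checked with a plot before trusting the result.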
The sample size required to estimate a population mean (μ) to within an amount B (bound on
error of estimation) with 95% confidence is

$$n = \left(\frac{1.96\,\sigma}{B}\right)^2$$

If σ is unknown, it may be estimated based on previous information or, for a population that is
not “too skewed”, by using range/4.
If the desired confidence level is not 95%, then replace 1.96 with the appropriate z* value.
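The sample-size formula for a mean, including the range/4 estimate of σ, can be sketched as follows. The range of 50 to 90 and the bound B = 2 are made-up planning values, and `sample_size_for_mean` is a hypothetical helper name.

```python
import math

def sample_size_for_mean(bound, sigma, z_star=1.96):
    """Smallest whole n with (z* sigma / B)^2 <= n."""
    return math.ceil((z_star * sigma / bound) ** 2)

# Assumed planning values: data expected to range from 50 to 90, so sigma is about range/4.
sigma_guess = (90 - 50) / 4
n_required = sample_size_for_mean(bound=2, sigma=sigma_guess)
print(n_required)  # prints: 97
```

Here (1.96 · 10 / 2)² = 96.04, which rounds up to 97 observations.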
General rule of thumb:
If n < 30, the sample may show more variability and skewness, so check the data before using t procedures
If 15 ≤ n < 30, can use t procedures if there are no extreme outliers
If n < 15, include “proceed with caution”; the distribution must be symmetric to use t procedures
9.4 Interpreting and Communicating the Results of Statistical Analyses
Interpretation of Confidence Interval: We can be 90% confident that the true proportion is
between ___ and ___.
Interpretation of Confidence Level: We have used a method to produce this estimate that
successfully captures the true proportion 90% of the time.
A wide confidence interval indicates that we don’t have very precise information about the
population characteristic being estimated. The width of a confidence interval is affected by
the confidence level, the sample size, and the standard deviation of the statistic used. The best
strategy for decreasing the width of a confidence interval is to take a larger sample.
Additional Notes
Statement about a Confidence Interval
We are ____% confident that the true mean/proportion of (context) lies between _____ and ______.
To make the margin of error smaller:
• Make z* smaller by lowering the confidence level
• Increase the sample size n (to cut the margin of error in half, n must be 4 times as large)
• Make σ smaller (σ itself can’t really be changed, but a different statistic that has a
smaller standard deviation can be used)
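The effect of each choice on the margin of error z*·σ/√n can be checked numerically. The values below (σ = 12, n = 100) are made up for illustration, and `margin_of_error` is a hypothetical helper name.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error for a mean: z* * sigma / sqrt(n)."""
    return z_star * sigma / math.sqrt(n)

base = margin_of_error(1.96, 12.0, 100)              # 95% confidence, n = 100
four_times_n = margin_of_error(1.96, 12.0, 400)      # same level, n four times as large
lower_level = margin_of_error(1.645, 12.0, 100)      # 90% confidence, n = 100

print(f"base margin:       {base:.3f}")
print(f"n four times:      {four_times_n:.3f}")
print(f"90% instead of 95%: {lower_level:.3f}")
```

Because n sits under a square root, quadrupling n halves the margin, while dropping from 95% to 90% confidence shrinks it by the ratio 1.645/1.96.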