MATH200 Chapter 9 notes

CHAPTER 9 – Estimating the Value of a Parameter
Section 9.1 – Estimating a Population Parameter
A point estimate is the value of a statistic that estimates the value of a parameter.
For example, the point estimate for the population proportion is $\hat{p} = \frac{x}{n}$, where x is the
number of individuals in the sample with a specified characteristic and n is the sample size.
Example: a Point Estimate of a Population Proportion
The Gallup Organization conducted a poll in which a simple random sample of 1015 Americans
18 and older were asked, “Do you consider the amount of federal income tax you have to pay is
too high?” Of the 1015 adult Americans surveyed, 458 said yes. Obtain a point estimate for the
proportion of Americans 18 and older who believe the amount of federal income tax they pay is
too high.
$\hat{p} = \frac{x}{n} =$ ____________
Interpretation __________________________________________________________________
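The arithmetic for this example can be checked with a few lines of Python. This is only a sketch; the function name point_estimate is an illustrative choice and does not come from the notes.

```python
def point_estimate(x, n):
    """Return the sample proportion p-hat = x / n."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return x / n

# Gallup example from the notes: 458 of the 1015 adults surveyed said yes.
p_hat = point_estimate(458, 1015)
print(round(p_hat, 3))  # 0.451
```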
Construct and Interpret a Confidence Interval for the Population Proportion
In statistics, we construct an interval for a population parameter based on a guess along with a
level of confidence. The guess is the point estimate of the population parameter, and the level
of confidence plays a role in the width of the interval.
A confidence interval for an unknown parameter consists of an interval of numbers based on a
point estimate.
The level of confidence represents the expected proportion of intervals that will contain the
parameter if a large number of different samples is obtained. The level of confidence is
denoted (1 – α)·100%.
For example, a 95% level of confidence (α = ___________) implies that if 100 different
confidence intervals are constructed, each based on a different sample from the same
population, we will expect ____________ of the intervals to contain the parameter and 5 not to
include the parameter.
Review what we know about the model that describes the sampling distribution of $\hat{p}$, the
sample proportion.
• The shape of the distribution of all possible sample proportions is approximately normal
provided $np(1-p) \ge$ _______, the sample size is no more than _____% of the
population size, and the data are obtained randomly.
• The mean of the distribution of the sample proportion equals the population
proportion. That is, $\mu_{\hat{p}} =$ _______
• The standard deviation of the distribution of the sample proportion (the standard error)
is $\sigma_{\hat{p}} =$ ________
Because the distribution of the sample proportion is approximately normal, we know that 95%
of all sample proportions will lie within 1.96 standard deviations of the population proportion,
p, and 2.5% of the sample proportions will lie in each tail.
It is common to write confidence interval estimates for the population proportion as
point estimate ± margin of error
The margin of error for a 95% confidence interval for the population proportion is
$1.96 \cdot \sigma_{\hat{p}}$. This determines the width of the interval:
$\hat{p} \pm 1.96\,\sigma_{\hat{p}}$
A 95% level of confidence means that 95% of all possible samples result in confidence
intervals that include the parameter (and 5% of all possible samples result in confidence
intervals that do not include the parameter).
(1 – α)·100% of all sample proportions will result in confidence intervals that contain the
population proportion. The sample proportions that are in the tails of the distribution in the
figure will not result in confidence intervals that contain the population proportion.
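The long-run meaning of the confidence level can be illustrated with a small simulation. The sketch below assumes a population proportion of p = 0.4 and samples of size n = 200 (both values are made up for illustration), builds the interval $\hat{p} \pm 1.96\,\sigma_{\hat{p}}$ for each simulated sample, and reports the fraction of intervals that contain p.

```python
import random
from math import sqrt

def coverage(p, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals p_hat +/- z*sigma that contain p."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    sigma = sqrt(p * (1 - p) / n)      # standard error of p-hat (p assumed known)
    hits = 0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(n))  # number of "successes"
        p_hat = x / n
        if p_hat - z * sigma <= p <= p_hat + z * sigma:
            hits += 1
    return hits / trials

# p = 0.4 and n = 200 are hypothetical values chosen only for this demo.
rate = coverage(p=0.4, n=200, trials=2000)
print(rate)  # should land near 0.95; the exact value depends on the seed
```

The fraction is not exactly 0.95 because the number of simulated samples is finite and the normal model is only an approximation.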
The value $z_{\alpha/2}$ is called the critical value of the distribution. It represents the number of
standard deviations the sample statistic can be from the parameter and still result in an interval
that includes the parameter.
Table 1 shows some of the common critical values used in the construction of confidence
intervals. Notice that higher levels of confidence correspond to higher critical values.
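Critical values like those in Table 1 can also be computed rather than looked up. A minimal sketch using Python's standard library follows; the helper name z_critical is an illustrative choice. Printed tables sometimes round the 99% value slightly differently.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a given level of confidence, e.g. 0.95."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves area alpha/2 in the right tail of the standard normal curve.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```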
Interpretation of a Confidence Interval
A (1 – α)·100% confidence interval indicates that (1 – α)·100% of all simple random samples of
size n from the population whose parameter is unknown will result in an interval that contains the
parameter.
Example: Interpreting a Confidence Interval
45.1% of national adults aged 18 and older who were surveyed considered the amount of federal income
tax they have to pay too high. Results are based on telephone interviews with a random sample of 1015
adults. The results are reported with 95% confidence and a sampling error of 4 percentage points.
point estimate ± margin of error
The point estimate is __________, and the margin of error is ___________.
The confidence interval is _____________________________
Interpretation: We are 95% confident that___________________________________________
______________________________________________________________________________
Constructing a (1 – α)·100% Confidence Interval for a Population Proportion
Suppose that a simple random sample of size n is taken from a population or the data are the
result of a randomized experiment. A (1 – α)·100% confidence interval for p is given by the
following quantities:
Lower bound: $\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Upper bound: $\hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
The margin of error, E, in a (1 – α)·100% confidence interval for a population proportion is
given by
$E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Example: Constructing a Confidence Interval for a Population Proportion
In the Parent–Teen Cell Phone Survey conducted by Princeton Survey Research Associates
International, 800 randomly sampled 16- to 17-year-olds living in the United States were asked
whether they have ever used their cell phone to text while driving. Of the 800 teenagers
surveyed, 272 indicated that they text while driving. Obtain a 95% confidence interval for the
proportion of 16- to 17-year-olds who text while driving.
$\hat{p} = \frac{x}{n} =$ ____________
Check for normality:
• Sample size is less than 5% of the population size, and
• $np(1-p) =$ ____________________ ≥ _______
Because we want a 95% confidence interval, we have
$\alpha = 1 -$ ________ $=$ _______, so $z_{\alpha/2} = z_{0.05/2} = z_{0.025} =$ ______________
The margin of error is _______________
We are 95% confident that the proportion of ________________________________________
is between ___________ and ____________.
StatCrunch Command____________________________________________________________
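For readers without StatCrunch, the same interval can be sketched in Python. The function name proportion_ci is illustrative. The threshold of 10 in the normality check is the value commonly used for this condition and is an assumption here, since the notes leave it blank; the check uses $\hat{p}$ in place of the unknown p.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Return (lower, upper, margin) for a confidence interval about p."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)           # E = z * standard error
    return p_hat - margin, p_hat + margin, margin

x, n = 272, 800                       # counts from the texting-while-driving example
p_hat = x / n
# Normality check with p-hat standing in for p; 10 is an assumed threshold.
normal_ok = n * p_hat * (1 - p_hat) >= 10
lower, upper, margin = proportion_ci(x, n)
print(normal_ok, round(lower, 3), round(upper, 3))
```

With these counts the interval runs from roughly 0.307 to 0.373.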
The Effect of Level of Confidence on the Margin of Error
We stated earlier that logic suggests that a higher level of confidence leads to a _____________
interval.
Example: Role of the Level of Confidence on the Margin of Error
For the problem of estimating the proportion of 16- to 17-year-old teenagers who text while
driving, determine the effect on the margin of error by increasing the level of confidence from
95% to 99%.
$\alpha = 1 -$ ________ $=$ _______, so $z_{\alpha/2} = z_{0.01/2} = z_{0.005} =$ ______________
The margin of error is _______________
We are 99% confident that the proportion of ________________________________________
is between ___________ and ____________.
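The comparison between the two confidence levels can be sketched as follows. The helper margin_of_error is an illustrative name, and the critical values are computed rather than read from a table, so they may differ from table values in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """E = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 272 / 800, 800             # texting-while-driving example
e95 = margin_of_error(p_hat, n, 0.95)
e99 = margin_of_error(p_hat, n, 0.99)
print(round(e95, 3), round(e99, 3))   # the 99% margin is the larger of the two
```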
The Effect of Sample Size on the Margin of Error
We know that larger sample sizes produce more precise estimates (the Law of Large Numbers).
Given that the margin of error is $E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$,
we can see that increasing the sample size n ___________________ the standard error; so the
margin of error decreases.
This means that larger sample sizes will result in _____________________ confidence intervals.
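A short numerical sketch makes the effect concrete. It holds $\hat{p}$ fixed at 0.34 (the value from the texting example) and uses z = 1.96; the sample sizes 200, 800, and 3200 are chosen only for illustration. Because n sits under a square root, quadrupling n cuts the margin of error in half.

```python
from math import sqrt

Z_95 = 1.96  # critical value for 95% confidence

def margin_of_error(p_hat, n, z=Z_95):
    """E = z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.34
# Hypothetical sample sizes, each four times the previous one.
margins = {n: margin_of_error(p_hat, n) for n in (200, 800, 3200)}
for n, e in margins.items():
    print(n, round(e, 4))
```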
Determine the Sample Size Necessary for Estimating a Population Proportion within a Specified Margin of Error
Sample size needed for a specified margin of error, E, and level of confidence (1 – α):
$n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2$
Problem: The formula uses $\hat{p}$, which depends on n, the quantity we are trying to determine!
Two possible solutions:
1. Use an estimate of $\hat{p}$ based on a pilot study or an earlier study.
2. Let $\hat{p} = 0.5$, which gives the largest possible value of n for a given level of confidence
and a given margin of error.
The sample size required to obtain a (1 – α)·100% confidence interval for p with a margin of
error E is given by
$n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2$
(rounded up to the next integer), where $\hat{p}$ is a prior estimate of p.
If a prior estimate of p is unavailable, the sample size required is
$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2$
rounded up to the next integer.
Example: Determining Sample Size
An economist wants to know if the proportion of the U.S. population who commutes to work
via carpooling is on the rise. What size sample should be obtained if the economist wants an
estimate within 2 percentage points of the true proportion with 90% confidence
1. if the economist uses the 2009 estimate of 10% obtained from the American
Community Survey?
$n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2 =$ ____________________________________________
2. if the economist does not use any prior estimates?
$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2 =$ ______________________________________________
The effect of not having a prior estimate of p is that the sample size more than doubled!
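Both sample-size calculations can be sketched with one small function. The name sample_size_proportion is illustrative, and z = 1.645 is the commonly tabulated critical value for 90% confidence; using a more precise critical value can change the result by one.

```python
from math import ceil

def sample_size_proportion(z, margin, prior=None):
    """n = p(1 - p)(z / E)^2, rounded up; p = 0.5 when no prior estimate exists."""
    p = 0.5 if prior is None else prior
    return ceil(p * (1 - p) * (z / margin) ** 2)

z_90 = 1.645                                            # table value for 90% confidence
n_prior = sample_size_proportion(z_90, 0.02, prior=0.10)
n_no_prior = sample_size_proportion(z_90, 0.02)
print(n_prior, n_no_prior)  # 609 1692
```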
Section 9.2 – Estimating a Population Mean
A point estimate is the value of a statistic that estimates the value of a parameter.
The sample mean, $\bar{x}$, is a point estimate of the population mean μ.
Example: Computing a Point Estimate of the Population Mean
The website fueleconomy.gov allows drivers to report the miles per gallon of their vehicle. The
data in Table 2 show the reported miles per gallon of 2014 Toyota
Camry automobiles for 16 different owners. Obtain a point estimate of the population mean
miles per gallon of a 2014 Toyota Camry.
The point estimate of μ is 28.1 miles per gallon.
Student’s t-Distribution
Suppose a simple random sample of size n is taken from a population that follows a normal
distribution. The distribution of
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
follows Student’s t-distribution with n – 1 degrees of freedom, where $\bar{x}$ is the sample mean and
s is the sample standard deviation.
Properties of the t-Distribution
1. The t-distribution is different for different degrees of freedom.
2. The t-distribution is centered at 0 and is symmetric about 0.
3. The area under the curve is 1. The area under the curve to the right of 0 equals the area
under the curve to the left of 0, which equals ½.
4. As t increases or decreases without bound, the graph approaches, but never equals,
zero.
5. The area in the tails of the t-distribution is a little greater than the area in the tails of the
standard normal distribution, because we are using s as an estimate of σ, thereby
introducing further variability into the t-statistic.
6. As the sample size n increases, the density curve of t gets closer to the standard normal
density curve. This result occurs because, as the sample size n increases, the values of s
get closer to the values of σ, by the Law of Large Numbers.
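The last two properties above can be checked numerically. The sketch below is one possible approach using only the standard library: it integrates the t density with Simpson's rule to get the area to the right of 2.0 for several degrees of freedom and compares it with the standard normal tail. The cutoff 2.0 and the function names are arbitrary illustrative choices.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Area to the right of t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3        # half the area lies to the right of 0

normal_tail = 1 - NormalDist().cdf(2.0)
tails = {df: t_right_tail(2.0, df) for df in (5, 15, 30, 100)}
for df, area in tails.items():
    print(df, round(area, 4))
print("normal", round(normal_tail, 4))
```

The t tail areas are all larger than the normal tail area and shrink toward it as the degrees of freedom grow.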
The notation $z_\alpha$ is used to represent the z-score whose area under the normal curve to the right
of $z_\alpha$ is α. Similarly, let $t_\alpha$ represent the t-value whose area under the t-distribution to the
right of $t_\alpha$ is α.
Example: Finding t-Values
Find the t-value such that the area under the t distribution to the right of the t-value is 0.10,
assuming 15 degrees of freedom (df). That is, find $t_{0.10}$ with 15 degrees of freedom.
The value of $t_{0.10}$ with 15 df is _____________________________
StatCrunch Command ___________________________________________________________
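Python's standard library has no built-in t-distribution, so the sketch below finds the t-value numerically: it integrates the t density with Simpson's rule and then uses bisection to locate the value with the requested right-tail area. This is one possible approach with illustrative function names, not the method used by StatCrunch or a t-table.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Area to the right of t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df, lo=0.0, hi=50.0):
    """Bisection for the t-value with right-tail area alpha (0 < alpha < 0.5)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid      # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2

t_value = t_critical(0.10, 15)
print(round(t_value, 3))  # 1.341, matching a standard t-table
```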
Constructing a (1 – α)·100% Confidence Interval for μ
Provided
• sample data come from a simple random sample or randomized experiment,
• sample size is small relative to the population size (n ≤ 0.05N), and
• the data come from a population that is normally distributed, or the sample size is large,
a (1 – α)·100% confidence interval for μ is given by
Lower bound: $\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Upper bound: $\bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
where $t_{\alpha/2}$ is the critical value with n – 1 degrees of freedom. Because this confidence interval
uses the t-distribution, it is often referred to as a t-interval.
The margin of error for constructing confidence intervals about a population mean is
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
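The t-interval computation can be sketched as below. The five data values are made up purely for illustration (they are not the miles-per-gallon data from the notes), the function name t_interval is hypothetical, and the critical value 2.776 is the t-table entry for $t_{0.025}$ with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Return (lower, upper) for the interval x-bar +/- t_crit * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                     # sample standard deviation (divides by n - 1)
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

sample = [10.0, 12.0, 14.0, 16.0, 18.0]  # hypothetical data, n = 5
T_CRIT = 2.776                           # t_{0.025} with n - 1 = 4 df, from a t-table
lower, upper = t_interval(sample, T_CRIT)
print(round(lower, 3), round(upper, 3))
```

The interval is centered at the sample mean (14 here), and its half-width is the margin of error E.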
The Normality Condition
When Gosset developed the t-distribution, he assumed the sample data came from a
population that is normally distributed. Most distributions are not exactly normal, so we need
to verify that the sample data appear “normal enough” before using the t-distribution to
construct confidence intervals for the population mean.
Small Sample Size
Option 1 (The Better Option)
• n < 30: Draw a normal probability plot to check the normality condition and a boxplot to
check for outliers. If the data appear to come from a population that is approximately
normal with no outliers, then it is reasonable to use Student’s t-distribution to construct
a confidence interval for a population mean.
Option 2
This option relies on the robustness of constructing confidence intervals using
Student’s t-distribution. An inferential method is robust if it is accurate despite minor
departures from its underlying assumptions (such as the assumption of normality in Student’s
t-distribution).
• n < 15: Use Student’s t-distribution to construct confidence intervals about a population
mean if the sample data are symmetric with no outliers. The data should not be skewed
left or right. This condition may be verified with a boxplot. The median should be in the
middle of the box and the whiskers should be of equal length.
• 15 ≤ n < 30: Use Student’s t-distribution to construct confidence intervals about a
population mean provided the sample data have no “extreme” skewness and no
outliers.
Large Sample Size
If n ≥ 30, use Student’s t-distribution to construct confidence intervals about a population mean
even for skewed distributions. This relies on the Central Limit Theorem (for the distribution of
the sample mean to be approximately normal) and the Law of Large Numbers (for the sample
standard deviation to be close to the population standard deviation).
Example: Constructing a Confidence Interval about a Population Mean
The website fueleconomy.gov allows drivers to report the miles per gallon of their vehicle. The
data in Table 3 show the reported miles per gallon of 2014 Toyota Camry automobiles for 16
different owners. Treat the sample as a simple random sample of all 2014 Toyota Camry
automobiles. Construct a 95% confidence interval for the mean miles per gallon of a 2014
Toyota Camry. Interpret the interval.
StatCrunch Commands___________________________________________________________
The correlation between MPG and the expected z-scores is _______________
We are 95% confident that the mean ______________________________________________
is between ___________ and _____________ mpg.
Determining the Sample Size n
The sample size required to estimate the population mean, μ, with a level of confidence
(1 – α)·100% and a specified margin of error, E, is given by
$n = \left(\frac{z_{\alpha/2} \cdot s}{E}\right)^2$
where n is rounded up to the nearest whole number.
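The formula translates directly into code. In this sketch the function name sample_size_mean is illustrative and the sample standard deviation s = 2.0 is a hypothetical value, not one taken from the notes; z = 1.96 corresponds to 95% confidence.

```python
from math import ceil

def sample_size_mean(z, s, margin):
    """n = (z * s / E)^2, rounded up to the next whole number."""
    return ceil((z * s / margin) ** 2)

# s = 2.0 is a made-up standard deviation used only to show the arithmetic.
n_needed = sample_size_mean(z=1.96, s=2.0, margin=0.5)
print(n_needed)  # 62
```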
Example: Determining Sample Size
We again consider the problem of estimating the miles per gallon of a 2014 Toyota Camry. How
large a sample is required to estimate the mean miles per gallon within 0.5 mile per gallon with
95% confidence?
Substitute the values of z, s, and E into the formula and obtain
$n =$
Round up to ________. This sample size results in an interval estimate of the population mean
miles per gallon of a 2014 Toyota Camry with a margin of error of 0.5 mile per gallon with 95%
confidence.