WHY SAMPLE - KSU Web Home

UNIT TWO
Sampling; Sampling Distributions;
Interval Estimation; Hypothesis Testing
Why Sample?
Recall that inferential statistics as a body of knowledge refers to approaches for drawing conclusions
about populations based on samples drawn from those populations. Let’s say you desire to gain
information about a population. If you can access the entire population fairly quickly and cheaply,
and without undue harm, then you would likely opt to examine the population in its entirety and
forgo sampling. For many populations, however, one would draw a sample from the population for
some combination of the following reasons:
(1) save time
(2) save money
(3) limit the destruction of entities in those situations where to measure the variable of interest, you
must destroy the entity being measured (sometimes referred to as destructive sampling)
(4) if the population comprises observations arising from a process (e.g., the widths of
semiconductor chips made using a particular process), the process can operate indefinitely thereby
yielding an unending (infinite) population of observations
(5) theory exists which enables us to draw conclusions--in which we can have reasonably high
confidence--about populations from samples
Types of Probability Samples
A probability sample is a sample of units drawn from a population in such a way--invariably involving
random selection--that one can draw (based on the sample) a statistically sound conclusion about the
population. Conclusions about populations based on non-probability samples (e.g., samples of
convenience or samples involving self-selection) are suspect. Several methods for obtaining probability
samples are described below.
Simple Random Sampling: A simple random sample of size n drawn from a finite population is a
sample of n units drawn in such a way that every sample of that same size has the same chance of being
selected. To draw a simple random sample from a finite population, you must be able to assign each
unit in the population a unique number (with the same number of digits as the numbers assigned to the
other units). A table of random digits is used to determine which units are to be included in the sample.
A simple random sample can be drawn with replacement (each time you select a unit to go in the
sample, you return it to the population, so it can be picked again) or without replacement (each
time you select a unit to go in the sample, you leave it out of the population, so it cannot be
picked again).
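The two ways of drawing a simple random sample can be sketched in Python. This is only an illustration: it swaps the table of random digits for Python's pseudo-random number generator, and the function name, the seed, and the population of 200 numbered units are assumptions made for the example.

```python
import random

def simple_random_sample(population, n, with_replacement=False, seed=None):
    """Draw a simple random sample of n units from a finite population."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    units = list(population)
    if with_replacement:
        # Each draw is made from the full population, so repeats can occur.
        return [rng.choice(units) for _ in range(n)]
    # rng.sample never selects the same position twice.
    return rng.sample(units, n)

# Hypothetical population: units numbered 1 to 200.
unit_numbers = range(1, 201)
without_repl = simple_random_sample(unit_numbers, 10, seed=1)
with_repl = simple_random_sample(unit_numbers, 10, with_replacement=True, seed=1)
print(without_repl)
print(with_repl)
```

Without replacement, the 10 selected numbers are always distinct; with replacement, a number may appear more than once.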
Stratified Sampling: In stratified sampling, the population is divided into non-overlapping
subpopulations called strata which together comprise the entire population. Then a sample is drawn
from each stratum. If a simple random sample is drawn from each stratum, one has a stratified random
sample. Stratified random sampling is more precise (i.e., leads to less variable results from sample to
sample for any given sample size) than simple random sampling when the units within each stratum are
more homogeneous (similar to one another with respect to the variable of interest) than the population
as a whole. Stratified sampling permits you to gather information about the individual subpopulations
as well as the entire population. Example of stratified sampling: to estimate the mean cash holdings of
financial institutions in Georgia as of September 1, you randomly sample some small, some
medium-sized, and some large financial institutions in Georgia.
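A stratified random sample can be sketched as one simple random sample per stratum. The stratum names, unit labels, and per-stratum sample sizes below are hypothetical; how many units to take from each stratum is a design decision the text does not address.

```python
import random

def stratified_random_sample(strata, sizes, seed=None):
    """strata maps a stratum name to its units; sizes maps it to a sample size."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # Simple random sample (without replacement) within each stratum.
        sample[name] = rng.sample(list(units), sizes[name])
    return sample

# Hypothetical strata: institution labels grouped by size class.
strata = {
    "small": [f"S{i}" for i in range(1, 61)],
    "medium": [f"M{i}" for i in range(1, 31)],
    "large": [f"L{i}" for i in range(1, 11)],
}
sizes = {"small": 6, "medium": 3, "large": 2}
chosen = stratified_random_sample(strata, sizes, seed=7)
print(chosen)
```

Because every stratum contributes units, the result supports estimates for each size class as well as for the population as a whole.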
Cluster Sampling (single-stage). In cluster sampling, the population is divided into non-overlapping
subpopulations called clusters which together comprise the entire population. Then a simple random
sample of clusters is selected, and each unit in each selected cluster is examined. Cluster sampling is
appropriate when each cluster is deemed to be as heterogeneous (varied with respect to the variable of
interest) as the population as a whole. Often, populations naturally exist in clusters (e.g., people
clustered in neighborhoods, products clustered on shelves), allowing one to increase the sample size at
considerably less cost than with simple random sampling. Example of single-stage cluster sampling: in
a marketing research study to estimate the proportion of adults who feel a new cereal is highly nutritious
and the proportion who feel the new cereal has an excellent taste, a cereal manufacturer distributes
through the post office small boxes of the new cereal to all the homes in randomly selected zip codes
within a metropolitan area. (Self-addressed, stamped post cards are distributed with the cereal to
capture reactions to the cereal.)
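Single-stage cluster sampling differs from stratified sampling in that only some clusters are chosen, and then every unit in each chosen cluster is examined. A minimal sketch, with hypothetical cluster labels and household labels:

```python
import random

def single_stage_cluster_sample(clusters, num_clusters, seed=None):
    """clusters maps a cluster label to the list of units in that cluster."""
    rng = random.Random(seed)
    # Simple random sample of cluster labels (without replacement).
    selected = rng.sample(sorted(clusters), num_clusters)
    # Every unit in each selected cluster is examined.
    units = [unit for label in selected for unit in clusters[label]]
    return selected, units

# Hypothetical zip codes and the households in each.
clusters = {
    "zip_A": ["A1", "A2", "A3"],
    "zip_B": ["B1", "B2"],
    "zip_C": ["C1", "C2", "C3", "C4"],
    "zip_D": ["D1", "D2"],
}
selected, households = single_stage_cluster_sample(clusters, num_clusters=2, seed=3)
print(selected, households)
```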
Systematic Sampling. To draw a systematic sample, you must be able to number the units in the
population sequentially. To select a systematic sample, a unit is selected at random from the first k
units, and then every kth unit thereafter is selected. (A systematic sample is equivalent to cluster
sampling with 1 cluster being randomly selected.) Example of systematic sampling: to estimate the
mean level of customer satisfaction with the service rendered, the manager of a car dealership--after
randomly selecting the number 4 out of the counting numbers 1 through 10--interviews the 4th, 14th, 24th,
34th, etc. customers entering the dealership over a period of one week.
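The selection rule for a systematic sample is short enough to write out. In this sketch the function name and the list of 60 numbered customers are assumptions; the starting position 4 and k = 10 come from the dealership example.

```python
import random

def systematic_sample(units, k, start=None, seed=None):
    """Pick a random start among the first k units, then every kth unit."""
    if start is None:
        start = random.Random(seed).randint(1, k)  # 1-based position
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    # Positions start, start + k, start + 2k, ... (converted to 0-based indexing).
    return list(units[start - 1::k])

customers = list(range(1, 61))  # hypothetical: customers numbered in order of arrival
print(systematic_sample(customers, k=10, start=4))  # [4, 14, 24, 34, 44, 54]
```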
The Two Basic Approaches to Inferential Statistics
Interval Estimation
Recall that a parameter is a summary measure describing a population. Examples of parameters
include the mean of a population, the variance of a population, and the proportion of a population
falling in a certain category. One basic form of inferential statistics is to obtain--via sampling--an
interval estimate of (confidence interval for) some parameter of a population. For example, you could
select a random sample of 400 paid employees in the U.S. and—based on the proportion of that
sample of employees who changed jobs within the last year—get an interval estimate of what
proportion of ALL paid employees in the U.S. changed jobs within the last year. Interval estimation
can also be employed to obtain--via sampling--an interval estimate of (confidence interval for) the
difference or ratio between two parameters of two distinct populations.
Hypothesis Testing
Another basic form of inferential statistics is to choose--via sampling--between two competing
hypotheses about a population. It is customary to call the two competing hypotheses the null
hypothesis (typically denoted H0) and the alternative hypothesis (typically denoted H1), respectively.
For example, it could be that the mean fill amount of 2-liter bottles filled by a filling machine is the
intended 2.05 liters (H0: $\mu = 2.05$ liters). Alternatively, it could be that the mean fill amount is not 2.05
liters (H1: $\mu \ne 2.05$ liters). You could select a random sample of 52 2-liter bottles filled by the
machine and—based on the fill amounts for that sample of bottles—assess whether H0 should be
called into question, i.e., assess whether H1 is supported. Hypothesis testing can also be employed to
choose--via sampling--between two competing hypotheses about multiple populations.
Sampling Distributions
Recall that a statistic is a summary measure describing a sample. Examples of statistics include the
mean of a sample, the variance of a sample, and the proportion of a sample falling in a certain
category. A sampling distribution is (by definition) the distribution of some statistic across all
samples of the same size and type that can be drawn from a population.
Following is an elaboration of that definition. Consider any population. Consider every possible
sample of a particular size and type that can be drawn from that population. Consider determining—
for EACH of those samples--the value of some statistic. The distribution of that statistic across all
the samples (i.e., the “pattern” in the collection of measurements, with one measurement per sample)
is called a sampling distribution.
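For a very small population, the definition can be carried out literally by listing every possible sample. The population of five values below is hypothetical, and the samples are taken to be simple random samples of size 2 drawn without replacement.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]  # hypothetical population of 5 measurements
n = 2

# Every possible sample of size 2 drawn without replacement (10 samples).
samples = list(combinations(population, n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: each possible value of the
# statistic paired with the fraction of samples that produce it.
sampling_distribution = {
    mean: Fraction(count, len(samples))
    for mean, count in sorted(Counter(sample_means).items())
}
for mean, prob in sampling_distribution.items():
    print(f"sample mean = {float(mean):4.1f}   relative frequency = {prob}")

mean_of_means = sum(sample_means) / len(sample_means)
print("mean of the sample means:", mean_of_means)  # 6, the population mean
```

Real populations have far too many possible samples to enumerate, which is why the theory of sampling distributions matters.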
Procedures for doing inferential statistics--whether interval estimation or hypothesis testing--rely
on knowledge about sampling distributions. The theory about sampling distributions that we
will address in this course provides the rationale behind various procedures for doing inferential
statistics that we will learn in this course.
Estimators and Point Estimates
The type of statistic that one uses to estimate a population parameter is called an estimator. (For
example, $\bar{X}$, the sample mean, is an estimator for $\mu$, the population mean.) A specific value of an
estimator obtained from a specific sample is called a point estimate for the population parameter.
(For example, the mean of an actual sample drawn from a population would be a point estimate for $\mu$,
the population mean.)
Characteristics of a "good" estimator include:
(1) being unbiased, which means that the mean of the estimator over all possible samples of the
same size and type equals the parameter being estimated.
(2) being consistent, which means that as the sample size increases, the estimator converges to the
parameter being estimated (for an unbiased estimator, this holds when the variance of the
estimator approaches 0).
(3) being efficient, which means that the variance of the estimator (over all possible samples of a
given size and type) is small; estimator 1 is more efficient than estimator 2 if--across all
samples of the same size and type--the variance of estimator 1 is smaller than that of estimator 2.
The sample mean, sample variance (using the divisor n –1), and sample proportion (proportion of a
sample falling in a particular category) are unbiased and consistent estimators of the population mean,
population variance, and population proportion (proportion of the population falling in a particular
category), respectively.
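The reason for the divisor n - 1 can be checked by brute force on a tiny example. The sketch below assumes a hypothetical population of four values and samples of size 2 drawn with replacement; it uses exact fractions so the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # hypothetical population
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

n = 2
# All ordered samples of size n drawn with replacement (N**n equally likely samples).
samples = list(product(population, repeat=n))
mean_with_n_minus_1 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
mean_with_n = sum(sample_variance(s, n) for s in samples) / len(samples)

print("population variance:      ", sigma2)               # 5/4
print("average using divisor n-1:", mean_with_n_minus_1)  # 5/4 (unbiased)
print("average using divisor n:  ", mean_with_n)          # 5/8 (too small)
```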
Theory Underlying Inferential Statistics Procedures Addressed in This Unit
In the theorems below:
• X denotes a quantitative variable.
• The variable $\bar{X}$ denotes the mean of a sample of n observations of X.
• p denotes the proportion of a population falling in a certain category.
• The variable $\hat{p}$ denotes the proportion of a sample (of observations from a population) falling in a
certain category.
• The assumed sampling method is simple random sampling.
• E(Q) denotes the expected value (or mean) of the variable Q.
• VAR(Q) denotes the variance of the variable Q.
• STDEV(Q) denotes the standard deviation of the variable Q.
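Theorem 1. For any quantitative variable X having mean $\mu$ and variance $\sigma^2$: (a) $E(\bar{X}) = \mu$; and
(b) $VAR(\bar{X})$ equals (or approximately equals) $\sigma^2/n$ and $STDEV(\bar{X})$ equals (or approximately
equals) $\sigma/\sqrt{n}$, unless sampling is done without replacement and $n > .05N$, in which case
$$VAR(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) \quad\text{and}\quad STDEV(\bar{X}) = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}.$$
Theorem 2. If X is normally distributed, then $\bar{X}$ is normally distributed.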
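The rule in Theorem 1(b) for when the extra factor (N - n)/(N - 1) applies can be written as a small function. The function name, its arguments, and the numbers used are assumptions made for illustration.

```python
import math

def stdev_of_sample_mean(sigma, n, N=None, with_replacement=True):
    """Standard deviation of the sample mean under Theorem 1(b)."""
    se = sigma / math.sqrt(n)
    # The correction factor applies only when sampling without replacement
    # from a finite population of size N with n greater than 5% of N.
    if N is not None and not with_replacement and n > 0.05 * N:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical values: sigma = 10 and n = 25.
print(stdev_of_sample_mean(10, 25))                                    # 2.0
print(stdev_of_sample_mean(10, 25, N=100, with_replacement=False))     # about 1.74
print(stdev_of_sample_mean(10, 25, N=10000, with_replacement=False))   # 2.0 (n is under 5% of N)
```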
Theorem 3 (the CENTRAL LIMIT THEOREM). For X having a finite variance (and, implicitly, a
non-normal distribution): as n approaches infinity, the distribution of $\bar{X}$ approaches a normal
distribution. Associated rule of thumb: if $n \ge 30$, then $\bar{X}$ has approximately a normal distribution.
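A simulation gives a feel for the Central Limit Theorem. The sketch below is an assumption-laden illustration, not a proof: it uses a strongly skewed (exponential) population with mean 1 and standard deviation 1, draws 5000 samples of size 30, and summarizes the resulting sample means.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, seed=0):
    """Means of num_samples simulated samples of size n from a skewed population."""
    rng = random.Random(seed)
    # Exponential population with mean 1 and standard deviation 1.
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

means = simulate_sample_means(n=30, num_samples=5000)
print(round(statistics.fmean(means), 2))  # close to the population mean, 1
print(round(statistics.stdev(means), 2))  # close to 1/sqrt(30), about 0.18
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.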
Theorem 4. If X is normally distributed, then $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ has the t-distribution with n-1 degrees of
freedom (df). [note: the family of t-distributions is discussed below]
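The statistic in Theorem 4 is simple to compute. The sketch below uses the numbers from the sofa-assembly problem in the second practice set (sample mean 19.2, hypothesized mean 20, sample standard deviation 1.2, n = 16); the function name is an assumption.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """Return (t, df) for sample mean xbar, hypothesized mean mu0, sample sd s."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1

t, df = t_statistic(xbar=19.2, mu0=20, s=1.2, n=16)
print(round(t, 2), df)  # -2.67 15
```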
Theorem 5. $E(\hat{p}) = p$; and $VAR(\hat{p})$ equals (or approximately equals) $p(1 - p)/n$ and $STDEV(\hat{p})$ equals
(or approximately equals) $\sqrt{p(1 - p)/n}$, unless sampling is done without replacement and $n \ge .05N$,
in which case
$$VAR(\hat{p}) = \frac{p(1-p)}{n}\left(\frac{N-n}{N-1}\right) \quad\text{and}\quad STDEV(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}}.$$
Another rule of thumb related to the Central Limit Theorem: For n so large that $np \ge 5$ and
$n(1 - p) \ge 5$ (from which it follows that $5/n \le p \le 1 - (5/n)$), $\hat{p}$ has approximately a normal distribution.
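The rule of thumb and the basic standard deviation formula for a sample proportion can be checked numerically. The values p = .04 and n = 800 are taken from the fan problem in the second practice set; the function names are assumptions, and the finite-population factor is left out for brevity.

```python
import math

def stdev_of_sample_proportion(p, n):
    """sqrt(p(1 - p)/n), without any finite-population correction."""
    return math.sqrt(p * (1 - p) / n)

def normal_approximation_ok(p, n):
    """Rule of thumb: np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5

print(normal_approximation_ok(0.04, 800))  # True: np = 32 and n(1 - p) = 768
print(normal_approximation_ok(0.04, 100))  # False: np = 4
print(round(stdev_of_sample_proportion(0.04, 800), 4))  # 0.0069
```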
Theorem 6. If $X_1$ is normally distributed with mean $\mu_1$ and variance $\sigma_1^2$ and $X_2$ is normally distributed
with mean $\mu_2$ and variance $\sigma_2^2$, then across all independent selections of one observation of $X_1$ and one
observation of $X_2$, $aX_1 - bX_2$ (for real numbers a and b not both 0) is normally distributed with mean
$a\mu_1 - b\mu_2$ and variance $a^2\sigma_1^2 + b^2\sigma_2^2$ (from which it follows that $X_1 - X_2$ is normally distributed with
mean $\mu_1 - \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$).
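Theorem 6 translates directly into a few lines of code. The means and variances below are hypothetical, and the function name is an assumption; `NormalDist` is the normal distribution class in Python's standard `statistics` module.

```python
from statistics import NormalDist

def linear_combination_distribution(a, mu1, var1, b, mu2, var2):
    """Distribution of a*X1 - b*X2 for independent normal X1 and X2."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be 0")
    mean = a * mu1 - b * mu2
    variance = a**2 * var1 + b**2 * var2  # the variances add, even for a difference
    return NormalDist(mean, variance ** 0.5)

# Hypothetical: X1 has mean 50 and variance 9; X2 has mean 45 and variance 16.
diff = linear_combination_distribution(1, 50, 9, 1, 45, 16)
print(diff.mean, diff.variance)   # 5.0 25.0
print(round(1 - diff.cdf(0), 4))  # P(X1 - X2 > 0), about 0.84
```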
t distribution
There is a family of what are called t-distributions; each member of the family has a particular number
(v, where v is a positive integer) of degrees of freedom. Every t-distribution in the family is roughly
bell-shaped, is symmetric about 0 (its mean is 0 whenever v > 1), and is more spread out than the standard
normal distribution (its standard deviation is greater than 1 whenever v > 2, and is not finite for
v = 1 or 2). As the degrees of freedom
approaches infinity, the corresponding t-distribution approaches the standard normal distribution. (note:
A t-distribution with 29 or more degrees of freedom is very similar to the standard normal distribution.)
Four t-distributions are depicted in the figure below.
Figure.
Distributions depicted, from “tallest” to “shortest”:
Standard normal distribution (legend label “infinity”)
t distribution with 20 degrees of freedom (legend label “20”)
t distribution with 5 degrees of freedom (legend label “5”)
t distribution with 2 degrees of freedom (legend label “2”)
t distribution with 1 degree of freedom (legend label “1”)
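The "tallest to shortest" ordering in the figure can be reproduced by evaluating each density at 0. The sketch below uses the standard formula for the t density, which is not given in these notes, so treat it as outside material.

```python
import math

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + x * x / v) ** (-(v + 1) / 2)

def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Peak height of each curve; the heights rise toward the normal curve's peak.
for v in (1, 2, 5, 20):
    print(v, round(t_pdf(0, v), 4))
print("infinity", round(standard_normal_pdf(0), 4))
```

The lower peak of a t-distribution with few degrees of freedom goes with heavier tails, which is why its reliability factors are larger than the corresponding z values.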
Interval Estimates (Confidence Intervals)
A C% (e.g., 95%) confidence interval for a parameter is an interval obtained by a process
which--in advance of being applied--has a probability of C/100 (e.g., .95) of yielding an interval
containing the parameter. C% is called the confidence level, and C/100 is called the confidence
coefficient. Commonly used confidence levels are 90%, 95% and 99%; their associated confidence
coefficients are .90, .95, and .99, respectively. When we obtain a C% confidence interval for a
parameter, we say that "we are C% confident that the parameter is between the endpoints of the
interval." It would be technically incorrect to say the probability is C/100 that the parameter is in the
interval because, once you obtain a specific interval, the parameter is either in the interval or not in
the interval--there's no probability about it.
Many confidence intervals (interval estimates) for individual parameters are of the form:
point estimate ± (reliability factor)(standard error of estimator),
where:
(1) the point estimate is a single-valued estimate of the parameter,
(2) the reliability factor depends on the desired confidence level,
(3) the standard error of the estimator (which typically must be estimated) is the standard
deviation of the estimator over repeated samples of size n, and
(4) the (reliability factor)(standard error of estimator) is called the margin of error.
There are two ways to decrease the width of a confidence interval: (1) take a larger sample; or (2)
decrease the confidence level.
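The general form point estimate ± (reliability factor)(standard error) can be sketched as follows, assuming a large sample so that the reliability factor is a z value. The data are those of practice problems 9 and 5 in this packet; the function names are assumptions.

```python
import math
from statistics import NormalDist

def reliability_factor(confidence_level):
    """z value that leaves (100 - C)/2 percent in each tail."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 200)

def confidence_interval(point_estimate, standard_error, confidence_level):
    margin = reliability_factor(confidence_level) * standard_error
    return point_estimate - margin, point_estimate + margin

# Mean (problem 9): sample mean 11.3, s = 2.7, n = 150, 96% confidence.
low, high = confidence_interval(11.3, 2.7 / math.sqrt(150), 96)
print(round(low, 1), round(high, 1))  # 10.8 11.8

# Proportion (problem 5): 600 of 800 listeners, 95% confidence.
p_hat = 600 / 800
low_p, high_p = confidence_interval(p_hat, math.sqrt(p_hat * (1 - p_hat) / 800), 95)
print(round(low_p, 2), round(high_p, 2))  # 0.72 0.78
```

Raising n shrinks the standard error, and lowering the confidence level shrinks the reliability factor; either change narrows the interval.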
Confidence intervals may be obtained for items other than individual population parameters. For
example, one may obtain a confidence interval for the difference between two population parameters
or obtain a confidence interval for the ratio between two population parameters.
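A confidence interval for the difference between two population means can be sketched with the large-sample z formula that appears in the answers to practice problems 4 and 10. The function name and argument order are assumptions.

```python
import math
from statistics import NormalDist

def diff_of_means_ci(x1, s1, n1, x2, s2, n2, confidence_level=95):
    """Large-sample z interval for (mean of population 1) - (mean of population 2)."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 200)
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - z * standard_error, diff + z * standard_error

# Problem 10: drill tips of variety one versus variety two.
low, high = diff_of_means_ci(14.3, 0.8, 42, 16.8, 0.7, 45)
print(round(low, 1), round(high, 1))  # -2.8 -2.2

# Problem 4: new menu versus standard menu.
low_m, high_m = diff_of_means_ci(5.70, 0.44, 100, 5.58, 0.40, 150)
print(round(low_m, 2), round(high_m, 2))  # 0.01 0.23
```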
See the formula sheet at the end of this unit packet for: (a) confidence interval formulas for $\mu$
(the mean of a population), $\mu_1 - \mu_2$ (the difference between two population means), p (the
proportion of a population falling in a certain category), and $p_1 - p_2$ (the difference between
two population proportions falling in a certain category); and (b) formulas for estimating how
large a sample to draw—prior to obtaining a confidence interval for a population mean or
proportion—in pursuit of pre-set confidence level and margin of error targets.
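The sample-size calculations can be sketched as below, using the formulas that appear in the answers to practice problems 2 and 3. The z values are passed in as read from a z-table (1.96 and 2.575); a more precise z could change the rounded-up result by one. Function names are assumptions.

```python
import math

def sample_size_for_mean(z, variance_estimate, margin_of_error):
    """n = z^2 (estimate of variance) / E^2, rounded up."""
    return math.ceil(z**2 * variance_estimate / margin_of_error**2)

def sample_size_for_proportion(z, p_estimate, margin_of_error):
    """n = z^2 p(1 - p) / E^2, rounded up."""
    return math.ceil(z**2 * p_estimate * (1 - p_estimate) / margin_of_error**2)

print(sample_size_for_mean(1.96, 25, 1))               # 97   (problem 2)
print(sample_size_for_proportion(2.575, 0.30, 0.05))   # 557  (problem 3)
```

Always round up: rounding down would leave the margin of error slightly larger than the target.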
p-value Approach to Hypothesis Testing
The p-value approach to hypothesis testing comprises the following four steps:
(1) Indicate the two competing hypotheses about the population(s). The two hypotheses should be
complements of one another (i.e., be non-overlapping yet together cover all the possibilities).
The null hypothesis (H0) must be (or contain) a statement about the population(s) which permits
you to specify what the probability distribution of some sample statistic (which we call the test
statistic) would be like should that statement about the population(s) within the null hypothesis
be true.
(2) Decide upon an appropriate test statistic, and—subsequent to drawing sample(s) from the
population(s)—calculate the value of the test statistic for the sample(s) drawn. A test statistic is a
particular sample measure to be computed from your sample data.
(3) Calculate the p-value, which is the (maximum) probability, should H0 be true, of obtaining a test
statistic value as contrary to H0—or more contrary to H0—as the test statistic value you obtained
from the sample(s) drawn. The p-value is in essence telling you how rare (in probability terms) it
would be to obtain a sample(s) such as yours if H0 was true.
(4) State your conclusion. Standard conclusions associated with various p-values are:
p-value               standard conclusion
$p \ge .10$           H0 may be true
$.05 \le p < .10$     marginal evidence that H1 is true
$.01 \le p < .05$     evidence that H1 is true
$.005 \le p < .01$    strong evidence that H1 is true
$p < .005$            very strong evidence that H1 is true
To further clarify this chart, consider the following scenario. Under the assumption that some particular
H0 is true, you get a sample result so extreme (outlandish) that a result that extreme (or even more so)
would have only a .003 chance of happening if H0 was true. You can respond in one of two ways at this
point: (1) you reason that something unusual happened, “that’s all,” and you go along with H0 or (2)
you reason that, hey, if H0 was true, this shouldn’t have happened, so you question H0, and go along
with H1 instead. The standard practice is to respond in the latter fashion.
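Steps (2) through (4) can be sketched for an upper-tailed test about a proportion, using the early-retirement data of practice problem 8 (22 of 150 employees, H0: p ≤ .10). The function names are assumptions, and the test statistic is the large-sample z statistic used in that problem's answer.

```python
import math
from statistics import NormalDist

def upper_tail_test_for_proportion(successes, n, p0):
    """Return (z, p-value) for H0: p <= p0 versus H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, 1 - NormalDist().cdf(z)

def standard_conclusion(p_value):
    """Map a p-value to the standard conclusions in the chart."""
    if p_value >= 0.10:
        return "H0 may be true"
    if p_value >= 0.05:
        return "marginal evidence that H1 is true"
    if p_value >= 0.01:
        return "evidence that H1 is true"
    if p_value >= 0.005:
        return "strong evidence that H1 is true"
    return "very strong evidence that H1 is true"

z, p_value = upper_tail_test_for_proportion(22, 150, 0.10)
print(round(z, 2), round(p_value, 3), standard_conclusion(p_value))
```

The code keeps the sample proportion unrounded, so z and the p-value differ slightly from a hand calculation that rounds 22/150 to .147 and uses a z-table; the conclusion is the same.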
See the formula sheet at the end of this packet for specifications of test statistics for testing
hypotheses about $\mu$ (a single population mean) or p (the proportion of a population falling in a
certain category).
Significance Level Approach to Hypothesis Testing
With the significance level approach to hypothesis testing, one decides (the decision can be made in
advance of obtaining any sample data) how inconsistent with H0 (in probability terms) the sample data
would need to be in order to reject H0. The significance level $\alpha$ is that measure (in probability terms) of
inconsistency with H0. If sample data that inconsistent (or more so) with H0 is obtained, H0 is rejected;
otherwise, H0 is not rejected (we will say accepted). The following chart shows the implications of
various choices of $\alpha$:
Choice of $\alpha$    Implications of that choice
.10                   H0 will be rejected if and only if at least marginal evidence contrary to H0 is found
.05                   H0 will be rejected if and only if at least evidence contrary to H0 is found
.01                   H0 will be rejected if and only if at least strong evidence contrary to H0 is found
.005                  H0 will be rejected if and only if very strong evidence contrary to H0 is found
The significance level is the maximum probability of rejecting H0 should H0 be true.
We will be taking the p-value approach to hypothesis testing. By reporting the p-value, we are
conveying just how inconsistent with H0 the sample data turned out to be. A person desiring a
particular significance level $\alpha$ would reject H0 if and only if the p-value (often referred to as the
observed significance level) is $\le \alpha$.
Type I and Type II Errors
Whenever you test a particular null hypothesis using the significance level approach, only one (but you
don’t know which one) of the following two errors is possible:
Type I error: you reject a true H0 (i.e., in essence, you conclude H1 is true when in reality H0 is true)
Type II error: you accept a false H0 (i.e., in essence, you conclude H0 is true when in reality H1 is true)
In essence, the larger the sample size you use, the less likely you are to make whichever of the two errors is
possible for the hypothesis test at hand.
$\alpha$, the significance level of the test (which as you recall is chosen by the person conducting the test), is
the maximum probability of making a Type I error. Thus one can readily limit--regardless of the sample
size used--the probability of making a Type I error to one's chosen $\alpha$. Placing as well a limit on the
probability ($\beta$) of making a Type II error requires you to draw an adequately large sample; for a variety
of hypothesis testing procedures, procedures (outside the scope of this course) exist for determining
what sample size is adequate.
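A small simulation illustrates both points: the Type I error rate stays near the chosen significance level whatever the sample size, while the Type II error rate falls as the sample grows. Everything here is an assumption made for illustration: a normal population with known standard deviation 1, a two-sided z test of H0: $\mu = 0$, an alternative mean of 0.5, and 2000 simulated tests per setting.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n, alpha=0.05, mu0=0.0, sigma=1.0, trials=2000, seed=0):
    """Fraction of simulated two-sided z tests of H0: mu = mu0 that reject H0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        z = (xbar - mu0) / (sigma / math.sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / trials

for n in (10, 40):
    type_1 = rejection_rate(true_mean=0.0, n=n)      # H0 true: rejecting is a Type I error
    type_2 = 1 - rejection_rate(true_mean=0.5, n=n)  # H0 false: not rejecting is a Type II error
    print(n, round(type_1, 3), round(type_2, 3))
```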
One test procedure is said to be more powerful (or have higher power) than another if the former has--
for any given sample size and any given significance level--a lower chance of making a Type II
error than the latter.
Practice Problems for Test #2
note: Use the p-value approach to hypothesis testing for all hypothesis testing problems.
1. Suggest four reasons for sampling from a population instead of examining the population in its
entirety.
2. The personnel director of a large company wishes to estimate the mean number of hours worked per
week in secondary employment by its full-time employees. How many full-time employees should be
sampled in order to estimate this mean number to within 1 hour with 95% confidence? For a
preliminary sample of full-time employees, the variance in the number of hours worked per week in
secondary employment was 25.
3. A furniture manufacturer wishes to estimate the percentage of consumers who would prefer sofas with
zip-on, zip-off washable fabric covers over the current offerings of the manufacturer. How many
consumers should be sampled so that this percentage can be estimated to within 5 percentage points
(.05) with 99% confidence? The manufacturer estimates the true percentage to be .30.
4. A restaurant manager wishes to assess the effectiveness of a new menu design on the sale of
appetizers. On a particular evening, 100 customers are given the new menu and 150 customers are
given the standard menu. Treat their resulting expenditures on appetizers as independent random
samples. The mean expenditure on appetizers for customers given the new menu was $5.70 with a
standard deviation of $.44. The mean expenditure on appetizers for customers given the standard
menu was $5.58 with a standard deviation of $.40. Determine a 95% confidence interval for the
difference in the mean expenditure on appetizers using the new menu and that using the standard
menu. Interpret the interval.
5. Out of a random sample of 800 regular listeners of a certain radio station, 600 were teenagers.
Construct a 95% confidence interval for the percentage of all regular listeners of the radio station who
are teenagers. Interpret the interval.
6. A manufacturer wished to compare the percentage of defective components made by two suppliers, A
and B. Out of a random sample of 125 components made by supplier A, 7 were defective; out of an
independent random sample of 100 components made by supplier B, 7 were defective. Determine a
95% confidence interval for the difference in the percentage of defective supplier A components and
the percentage of defective supplier B components. Interpret the interval.
7. Imagine that you wish to draw a simple random sample of 10 job descriptions without replacement
from the 200 job descriptions within a company. Assume the 200 job descriptions are maintained in a
file and are numbered from 1 to 200. Use the table of random numbers on page 259 of our text to
select the sample. There are many correct "answers." The particular sample you get will depend on
where you start in the table and how you move through the table. Determine the sample you would
get if you started with the 6th column of numbers, went down that and successive columns, and chose
to look at the 3 leftmost digits in each block of 5 digits.
8. Out of a random sample of 150 employees of a large company, 22 indicated that they planned to take
an early retirement. Does this sample suggest that more than 10% of all the company's employees
plan to take an early retirement? Would a person desiring to use a significance level of .01 reject the
null hypothesis?
9. An investigator for a congressional committee wished to determine a 96% confidence interval for the
mean number of hours commercial truck drivers drive in one day. For a random sample of 150
commercial truck drivers, the mean number of hours spent driving the previous day (as determined
from the truckers' log books) was 11.3 hours with a standard deviation of 2.7 hours. Construct the
desired 96% confidence interval. Interpret the interval.
10. A jewelry manufacturer wants a 95% confidence interval for the difference in the mean lifetime of
two varieties of drill tips. For a random sample of 42 drill tips of variety one, the mean lifetime was
14.3 hours with a standard deviation of .8 hours. For an independent random sample of 45 drill tips
of variety two, the mean lifetime was 16.8 hours with a standard deviation of .7 hours. Determine,
and interpret, the desired confidence interval.
11. Assume that the mean number of meals single adults in metro Atlanta ate out last week was 6.8
meals, with a standard deviation of 1.3 meals. What is the probability of getting, for a simple random
sample of 70 single adults, a mean less than or equal to 6.4 meals?
12. A mail order company’s standard shipping fee is based on the assumption of a mean package weight
of 20 ounces. A random sample of 40 packages sent out recently had a mean weight of 21.3 ounces
with a standard deviation of 3.7 ounces. Does this sample suggest that the mean weight of the
packages sent out by this company is not 20 ounces? Would a person desiring to use a significance
level of .10 reject the null hypothesis?
Answers:
1. save time; save money; statistical theory exists that permits us to estimate features of populations from
samples with considerable precision and confidence; to limit the destruction of entities (because
sometimes to measure the variable of interest, you must destroy the entity being measured)
2. $n = \dfrac{z^2(\text{estimate of } \sigma^2)}{E^2}$ = (1.96)^2(25)/1^2 = 97 (rounded up) employees
3. $n = \dfrac{z^2(\text{estimate of } p)(1 - \text{estimate of } p)}{E^2}$ = (2.575)^2(.30)(.70)/(.05)^2 = 557 consumers
4. $$\bar{x}_1 - \bar{x}_2 \pm z\sqrt{s_1^2/n_1 + s_2^2/n_2}$$
= ($5.70 - $5.58) ± 1.96SQRT[(.44)^2/100 + (.40)^2/150] = ($.01, $.23).
We are 95% confident that the mean expenditure on appetizers by customers given the new menu
would be between 1 and 23 cents higher than the mean expenditure by customers given the standard
menu.
5. $\hat{p} \pm z\sqrt{\hat{p}(1 - \hat{p})/n}$ = .75 ± 1.96SQRT[(.75)(.25)/800] = (.72, .78). We are 95% confident that
between 72% and 78% of all regular listeners are teenagers.
6. $\hat{p}_1 - \hat{p}_2 \pm z\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ =
(.056 - .07) ± 1.96SQRT[(.056)(.944)/125 + (.07)(.93)/100] = (-.078, .05). We are 95% confident
that the percentage of defective supplier A components is between 7.8 percentage points less and 5
percentage points more than the percentage of defective supplier B components. (Thus no
significant difference between the two percentages was found.)
7. The sample will comprise the job descriptions numbered 010, 185, 024, 155, 200, 127, 034, 104, 169,
and 092.
8. H0: $p \le .10$
H1: $p > .10$
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ = (.147 - .10)/SQRT[(.10)(.90)/150] = 1.92
p = $P(Z \ge 1.92)$ = .5 - .4726 = .0274
Evidence that more than 10% of the company’s employees plan to take an early retirement.
At $\alpha$ = .01, do not reject H0 (because p = .0274 is not < .01).
9. $\bar{x} \pm z\,s/\sqrt{n}$ = 11.3 ± 2.05(2.7)/SQRT(150) = (10.8, 11.8). We are 96% confident that the mean
number of hours driven that day by all commercial truck drivers was between 10.8 and 11.8 hours.
10. $\bar{x}_1 - \bar{x}_2 \pm z\sqrt{s_1^2/n_1 + s_2^2/n_2}$ = (14.3 - 16.8) ± 1.96SQRT(.64/42 + .49/45) = -2.5 ± .3 =
(-2.8, -2.2). We are 95% confident that the mean lifetime of the drill tips of variety one is between 2.2
and 2.8 hours less than the mean lifetime of the drill tips of variety two.
11. $n \ge 30 \Rightarrow \bar{X}$ is approximately normally distributed. $\bar{X}$ has mean 6.8 and standard deviation 1.3/SQRT(70) =
.1554. Thus, answering the question will involve converting 6.4 to the z-scale and consulting the
z-table. $P(\bar{X} \le 6.4) = P(Z \le -2.57)$ = .5 - .4949 = .0051.
12. H0: $\mu = 20$ ounces
H1: $\mu \ne 20$ ounces
$z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ = (21.3 - 20)/[3.7/SQRT(40)] = 2.22
p = $2P(Z \ge 2.22)$ = 2(.0132) = .0264
Evidence that the mean weight of the packages being shipped is not 20 ounces.
At $\alpha$ = .10, one would reject H0 (because p = .0264 < .10).
More Practice Problems for Test #2
note: Use the p-value approach to hypothesis testing for all hypothesis testing problems.
1. A marketing consultant studied the buying habits of customers on behalf of a large retail department
store; in so doing, customer gender was one factor of interest. Suppose that for a simple random
sample of 350 female customers, the mean time spent in a recent visit to the store was 53 minutes
with a standard deviation of 32 minutes, and that for an independent simple random sample of 250
male customers, the mean time spent in a recent visit to the store was 24 minutes with a standard
deviation of 11 minutes. Determine, and interpret, a 95% confidence interval for the difference
between the mean time spent by females and the mean time spent by males in a recent visit to the
store. Note: This problem is adapted from Groebner et al (2001).
2. A production manager believes that a new procedure for assembling leather sofas would result in a
mean assembly time per sofa of less than 20 minutes. The procedure is put in place on a trial basis.
16 sofas are made with the new procedure. (Treat these 16 sofas as a simple random sample of
sofas that could be made with the new procedure.) The mean assembly time for this sample was
19.2 minutes, with a standard deviation of 1.2 minutes. Is the manager’s belief supported?
(Assume that the assembly times using the new procedure would be approximately normally
distributed.) At $\alpha$ = .05, would the null hypothesis be rejected?
3. State the Central Limit Theorem.
4. An automobile manufacturer wishes to estimate the mean miles per gallon (mpg) under normal
driving conditions of a new car model. (a) If the manufacturer wants to estimate the mean mpg to
within .1 mpg with 99% confidence, how many cars should be sampled (and then tested)? The
standard deviation in the mpg amounts for this model of car is estimated to be .25 mpg. (b) 40 cars
were driven under normal driving conditions. Their mean mpg was 41.75 mpg with a standard
deviation of .24 mpg. Determine, and interpret, a 99% confidence interval for the mean mpg of all
cars of this model.
5. A marketing manager wants an estimate of the percentage of newspaper subscribers in a
metropolitan area who recall the slogan sent in an advertising supplement to the Monday morning
newspaper. (a) If it is desired to estimate the true percentage to within .04 with 95% confidence,
how many subscribers should be queried? It is believed that the true percentage is no greater than
40%. (b) A random sample of 500 subscribers was surveyed. 185 of these subscribers recalled the
slogan. Determine, and interpret, a 95% confidence interval for the percentage of all subscribers
receiving the advertising supplement who would recall the slogan.
6. Assume that 4% (that is, a proportion of .04) of all fans made by a particular process are defective.
What is the probability that, for a simple random sample of 800 fans made by that process, 5% or
more (that is, a proportion of .05 or more) are defective?
7. Identify four characteristics true of all t-distributions.
8. A manufacturer has designed a new variety of heat-sensing device. A quality control engineer
wishes to assess the mean operating temperature of devices of this kind. (The operating
temperature is here defined as the temperature at which an alarm on the device is activated.) The
operating temperatures are believed to be approximately normally distributed. 20 of these new
devices were tested under similar, tightly controlled conditions. Their mean operating temperature
was 158.7 degrees Fahrenheit, with a standard deviation of 8.4 degrees Fahrenheit. Determine, and
interpret, a 95% confidence interval for the mean operating temperature of devices of this kind.
9. A tax return preparation company wants to compare the quality of work at two of its regional
offices. Independent random samples of 250 returns completed at Office A and 300 returns
completed at Office B were examined. 35 of the 250 Office A returns contained at least one error,
and 27 of the 300 Office B returns contained at least one error. Determine, and interpret, a 95%
confidence interval for the difference between the percentage of all Office A returns with at least
one error and the percentage of all Office B returns with at least one error.
10. A researcher wished to assess whether there are gender differences in tolerance for ambiguity. She
decided to measure tolerance for ambiguity on a 1 to 50 point scale using a particular test
instrument. For a random sample of 35 men, the mean tolerance for ambiguity score was 34.6
points with a standard deviation of 3.6 points. For an independent random sample of 40 women,
the mean tolerance for ambiguity score was 33.2 points with a standard deviation of 4.1 points.
Determine, and interpret, a 90% confidence interval for the difference in the mean tolerance for
ambiguity scores of men and women.
11. For a random sample of 36 customers at a particular branch office of a bank last month, the mean
waiting time for service was 133 seconds with a standard deviation of 26 seconds. Does this
sample data provide evidence to conclude that the mean waiting time for service at the branch
office last month was greater than 120 seconds (2 minutes)? Would H0 be rejected at the .01
significance level?
Answers:
1. (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] = (25, 33). We are 95% confident that the mean time spent in a recent
visit to the store by females was between 25 and 33 minutes greater than the mean time spent by
males (or that females spent, on average, between 25 and 33 minutes longer in a recent visit to the
store than males).
2. H0: μ ≥ 20 minutes
H1: μ < 20 minutes
t = (x̄ - μ0)/(s/SQRT(n)) = (19.2 – 20)/[1.2/SQRT(16)] = -2.67 [use 15 df]
p = P(t ≤ -2.67); .005 < p < .01
Strong evidence that the mean assembly time with the new procedure would be less than 20 minutes.
Because p < .05, H0 would be rejected at α = .05.
3. For a non-normally distributed population of quantitative observations having a finite variance, as the
sample size approaches infinity, the distribution of the sample mean approaches a normal
distribution.
4. (a) n = z²(estimate of σ²)/E² = 42 cars; (b) x̄ ± z·s/SQRT(n) = (41.65, 41.85). We are 99% confident that
the mean mpg under normal driving conditions of cars of this model is between 41.65 and 41.85
mpg.
5. (a) n = z²(estimate of p)(1 - estimate of p)/E² = 577 subscribers;
(b) p̂ ± z·SQRT[p̂(1 - p̂)/n] = (.33, .41). We are 95% confident that between 33% and 41% of all
subscribers receiving the ad would recall the slogan.
6. np = (800)(.04) = 32 ≥ 5 and n(1-p) = 800(.96) = 768 ≥ 5 ⇒ p̂ is approximately normally
distributed. p̂ has mean .04 and standard deviation SQRT[(.04)(.96)/800] = .00693. Thus,
answering the question will involve converting .05 to the z-scale and consulting the z-table. P(p̂ ≥
.05) = P(Z ≥ 1.44) = .5 - .4251 = .0749.
7. Roughly bell-shaped; mean = 0; standard deviation > 1; as the degrees of freedom approaches infinity,
the corresponding t-distribution approaches the standard normal distribution.
8. x̄ ± t·s/SQRT(n) = (154.8, 162.6). We are 95% confident that the mean operating temperature of devices
of this variety is between 154.8 and 162.6 degrees F.
9. (p̂1 - p̂2) ± z·SQRT[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2] = (-.004, .104). We are 95% confident that the percentage of
Office A returns with at least one error is between .4 percentage points less and 10.4 percentage points
more than the percentage of Office B returns with at least one error. (Thus, no significant difference
between the two percentages was found.)
10. (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] = (-.1, 2.9). We are 90% confident that the mean tolerance for
ambiguity score of men is between .1 points less and 2.9 points greater than that of women. (Thus, no
significant difference between the means was found.)
11. H0: μ ≤ 120 seconds
H1: μ > 120 seconds
z = (x̄ - μ0)/(s/SQRT(n)) = (133 – 120)/[26/SQRT(36)] = 3.00
p = P(z ≥ 3.00) = .5 - .4987 = .0013 < .005
Very strong evidence that the mean waiting time last month was greater than 120 seconds.
H0 would be rejected at the .01 significance level (because p < .01).
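The large-sample interval answers above can be checked numerically. The following is a minimal Python sketch, not part of the original answer key: the helper names `z_critical`, `z_interval_mean`, and `z_interval_proportion` are hypothetical, and the sketch uses exact z critical values from the standard library rather than z-table values, so results may differ from the key in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided z critical value, e.g. about 1.96 for confidence=0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def z_interval_mean(xbar, s, n, confidence):
    """Large-sample interval for a mean: xbar +/- z * s / sqrt(n)."""
    margin = z_critical(confidence) * s / sqrt(n)
    return xbar - margin, xbar + margin


def z_interval_proportion(successes, n, confidence):
    """Large-sample interval for a proportion: p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Problem 4(b): 40 cars, mean 41.75 mpg, s = .24 mpg, 99% confidence
mpg_interval = z_interval_mean(41.75, 0.24, 40, 0.99)
# Problem 5(b): 185 of 500 subscribers recalled the slogan, 95% confidence
recall_interval = z_interval_proportion(185, 500, 0.95)
print([round(v, 2) for v in mpg_interval])
print([round(v, 2) for v in recall_interval])
```

Rounded to two decimals, both intervals should agree with answers 4(b) and 5(b).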
Sampling Distribution Homework
1. If the mean price of one gallon of unleaded gasoline across all gas stations in the United States is
currently $1.35, and the standard deviation in price is $.15, what would be the probability of
getting, for a random sample of 100 gas stations (a) a mean price greater than or equal to $1.36; (b)
a mean price greater than or equal to $1.39?
2. If the current mean annual salary of U.S. federal employees (in the United States) is $41,500, with a
standard deviation of $5000, what would be the probability of getting, for a random sample of 64
employees: (a) a mean salary less than or equal to $40,500? (b) mean salary less than or equal to
$39,800?
3. If the lifetimes of batteries of a particular variety are approximately normally distributed with a
mean of 14.5 hours and a standard deviation of 2 hours, what would be the probability of getting,
for a random sample of 16 batteries, a mean lifetime of ≤ 13.3 hours or ≥ 15.7 hours?
4. Assume that 15% of all adults in the U.S. are self-employed. What would be the probability of
getting, in a random sample of 500 adults, 18% or fewer self-employed?
5. If the proportion of overweight children in the U.S. is .70, what would be the probability of getting,
in a random sample of 1000 children, 74% or more that are overweight?
answers:
1. n ≥ 30 ⇒ X̄ ≈ normally distributed. X̄ has mean $1.35 and standard deviation $.15/SQRT(100) =
$.015. (a) P(X̄ ≥ 1.36) = P(Z ≥ .67) = .5 - .2486 = .2514 (b) P(X̄ ≥ 1.39) = P(Z ≥ 2.67) = .5 - .4962
= .0038.
2. n ≥ 30 ⇒ X̄ ≈ normally distributed. X̄ has mean $41,500 and standard deviation $5000/SQRT(64) =
$625. (a) P(X̄ ≤ 40,500) = P(Z ≤ -1.60) = .5 - .4452 = .0548. (b) P(X̄ ≤ 39,800) = P(Z ≤ -2.72) =
.5 - .4967 = .0033.
3. X ≈ normally distributed ⇒ X̄ ≈ normally distributed. X̄ has mean 14.5 hours and standard
deviation 2/SQRT(16) = .5. Thus, P(X̄ ≤ 13.3 or X̄ ≥ 15.7) = P(X̄ ≤ 13.3) + P(X̄ ≥ 15.7) =
P(Z ≤ -2.40) + P(Z ≥ 2.40) = .0082 + .0082 = .0164.
4. np ≥ 5 and n(1-p) ≥ 5 ⇒ p̂ ≈ normally distributed. p̂ has mean .15 and standard deviation
SQRT[(.15)(.85)/500] = .01597. Thus, P(p̂ ≤ .18) = P(z ≤ 1.88) = .5 + .4699 = .9699.
5. np ≥ 5 and n(1-p) ≥ 5 ⇒ p̂ ≈ normally distributed. p̂ has mean .70 and standard deviation
SQRT[(.70)(.30)/1000] = .01449. Thus, P(p̂ ≥ .74) = P(z ≥ 2.76) = .5 - .4971 = .0029.
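The sampling-distribution answers above all follow one pattern: compute the standard error, convert to the z-scale, and read a tail area. Below is a minimal Python sketch of that pattern, offered as an illustration rather than as the handout's method; the function names `upper_tail` and `lower_tail` are hypothetical. Because it does not round z to two decimals before looking up the area, its results can differ slightly from the z-table answers.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def upper_tail(value, mean, std_error):
    """P(statistic >= value) under the normal approximation."""
    return 1 - STANDARD_NORMAL.cdf((value - mean) / std_error)


def lower_tail(value, mean, std_error):
    """P(statistic <= value) under the normal approximation."""
    return STANDARD_NORMAL.cdf((value - mean) / std_error)


# Problem 1(a): gas prices, mean 1.35, sd .15, n = 100
p_gas = upper_tail(1.36, 1.35, 0.15 / sqrt(100))
# Problem 3: batteries, mean 14.5, sd 2, n = 16, both tails
se_battery = 2 / sqrt(16)
p_battery = lower_tail(13.3, 14.5, se_battery) + upper_tail(15.7, 14.5, se_battery)
# Problem 4: self-employed adults, p = .15, n = 500
p_self_employed = lower_tail(0.18, 0.15, sqrt(0.15 * 0.85 / 500))
print(round(p_gas, 4), round(p_battery, 4), round(p_self_employed, 4))
```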
Answers to Homework Problems from Text (arranged by Chapter)
Chapter 7
3. 459, 147, 385, 113, 340, 401, 215, 002, 033, 348
6. 2782, 0493, 0825, 1807, 0289
13. (a) x̄ = 93 units is the point estimate of μ (b) s = 5.4 units is the point estimate of σ
15. (a) x̄ = 7.0 years is the point estimate of μ (b) s = 1.5 years is the point estimate of σ
16. p̂ = 1117/1400 = .80 is the point estimate of p
17. In each case, the sample proportion p̂ is the estimate of the true proportion p. Thus the point
estimates are: (a) 595/1008 = .59 (b) 332/1008 = .33 (c) 81/1008 = .08
Chapter 8
10. x̄ ± z·s/SQRT(n) = 7.75 ± (1.96)(3.45/SQRT(180)) = 7.75 ± .50 = (7.25, 8.25). We are 95% confident
that the mean household TV viewing time from 8-11 pm was (over the time period in question) between
7.25 and 8.25 hours per week.
11. x̄ ± z·s/SQRT(n) = 6.34 ± (1.96)(2.16/SQRT(50)) = 6.34 ± .60 = (5.74, 6.94). We are 95% confident that
the mean rating of all business travelers would be between 5.74 and 6.94.
12. (a) x̄ = 3.8 minutes is the point estimate (b) margin of error = z·s/SQRT(n) = (1.96)(2.3/SQRT(30)) = .8
(c) x̄ ± z·s/SQRT(n) = 3.8 ± (1.96)(2.3/SQRT(30)) = 3.8 ± .8 = (3.0, 4.6). We are 95% confident that the
mean drive-through time when ordering a basic meal is between 3.0 and 4.6 minutes.
17. 90% CI for μ: x̄ ± t·s/SQRT(n) = 80 ± (1.740)(10/SQRT(18)) = 80 ± 4 = (76, 84). We are 90% confident
that the mean production of employees using the new method would be between 76 and 84 parts per hour.
95% CI for μ: x̄ ± t·s/SQRT(n) = 80 ± (2.110)(10/SQRT(18)) = 80 ± 5 = (75, 85). We are 95% confident that
the mean production of employees using the new method would be between 75 and 85 parts per hour.
19. The point estimate is x̄ = 6.5 minutes. x̄ ± t·s/SQRT(n) = 6.5 ± (2.093)(.5/SQRT(20)) = 6.5 ± .2 =
(6.3, 6.7). We are 95% confident that the mean number of non-program minutes on half-hour, prime-time
television shows at 8:30 p.m. is between 6.3 and 6.7 minutes.
21. x̄ ± t·s/SQRT(n) = 108 ± (2.365)(10/SQRT(8)) = 108 ± 8 = (100, 116). We are 95% confident that the
mean prescription cost for Zocor is between $100 and $116.
27. (a) n = (1.96)²(2000)²/(500)² = 62 such college graduates (b) n = (1.96)²(2000)²/(200)² = 385 such
college graduates (c) n = (1.96)²(2000)²/(100)² = 1537 such college graduates
29. (a) n = (1.96)²(6.25)²/(2)² = 38 residents (b) n = (1.96)²(6.25)²/(1)² = 151 residents
36. (a) p̂ = 152/346 = .439 (b) p̂ ± z·SQRT[p̂(1 - p̂)/n] = .439 ± 1.96SQRT[(.439)(.561)/346] = .439 ± .052
= (.387, .491). We are 95% confident that between 39 and 49 percent of job seekers would select “higher
compensation elsewhere” as a reason for changing jobs.
41. p̂ ± z·SQRT[p̂(1 - p̂)/n] = .16 ± 1.96SQRT[(.16)(.84)/1285] = .16 ± .02 = (.14, .18). We are 95%
confident that between 14 and 18 percent of U.S. consumers used the Internet to buy gifts during the 1999
holiday season. (The margin of error is .02.)
51. n = (1.96)²(8)²/(2)² = 62 patient visits; n = (2.576)²(8)²/(2)² = 107 patient visits
54. (a) p̂ = 200/369 = .542 (b) z·SQRT[p̂(1 - p̂)/n] = 1.96SQRT[(.542)(.458)/369] = .051 (c)
p̂ ± z·SQRT[p̂(1 - p̂)/n] = .542 ± .051 = (.491, .593). We are 95% confident that (in 1995) between 49%
and 59% of working parents would say they spend too little time with their children because of work
commitments.
56. (a) n = (2.33)²(.70)(.30)/(.03)² = 1267 credit cardholders (b) n = (2.33)²(.50)(.50)/(.03)² = 1509
credit cardholders
59. (a) n = (1.96)²(.30)(.70)/(.02)² = 2017 people 18 or older
60. (a) p̂ = .31 (b) p̂ ± z·SQRT[p̂(1 - p̂)/n] = .31 ± 1.96SQRT[(.31)(.69)/1993] = .31 ± .02 = (.29, .33). We
are 95% confident that between 29% and 33% of all business travelers would say a frequent flyer program
is the most important factor when choosing an airline carrier. (c) n = (1.96)²(.31)(.69)/(.01)² = 8218
business travelers; no, because it is not worth the extra survey costs
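The sample-size answers in this chapter (27, 29, 51, 56, 59, 60c) all round the computed n up to the next whole unit. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and z is passed in explicitly so that the table values used in the answers (1.96, 2.33, 2.576) can be reproduced.

```python
from math import ceil


def sample_size_for_mean(z, sigma_estimate, margin):
    """n = z^2 * sigma^2 / E^2, rounded up to a whole unit."""
    return ceil(z**2 * sigma_estimate**2 / margin**2)


def sample_size_for_proportion(z, p_estimate, margin):
    """n = z^2 * p(1 - p) / E^2, rounded up to a whole unit."""
    return ceil(z**2 * p_estimate * (1 - p_estimate) / margin**2)


# Problem 27: z = 1.96, sigma estimate 2000, margins of error 500, 200, 100
print([sample_size_for_mean(1.96, 2000, e) for e in (500, 200, 100)])
# Problem 56: z = 2.33, margin of error .03, planning values .70 and .50
print([sample_size_for_proportion(2.33, p, 0.03) for p in (0.70, 0.50)])
```

Rounding up rather than to the nearest integer keeps the achieved margin of error no larger than the target E.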
Chapter 9
5. (a) Type I error: Concluding the mean newspaper reading time for individuals in management
positions is greater than the national average of 8.6 minutes when in fact it is less than or equal to 8.6
minutes. A negative consequence: Management is cast in a more favorable light than it deserves.
(b) Type II error: Concluding the mean daily newspaper reading time for individuals in management
positions is less than or equal to the national average of 8.6 minutes when in fact it is greater than 8.6
minutes. A negative consequence: Management is cast in a worse light than it deserves.
6. (a) H0: μ ≤ 1 g
H1: μ > 1 g
(b) Type I error: Conclude the mean amount of fat in the containers is greater than 1 g when in fact it
is less than or equal to 1 g. A negative consequence: Unfairly damage the manufacturer’s reputation.
(c) Type II error: Conclude the mean amount of fat in the containers is less than or equal to 1 g
when in fact it is greater than 1 g. A negative consequence: Consumers will not be able to accurately
control their fat intake.
7. (a) H0: μ ≤ $8000
H1: μ > $8000
(b) Type I error: Conclude the compensation plan will increase mean sales above the current level of
$8000 when in fact it won’t. A negative consequence: Waste time and money implementing the
compensation plan.
(c) Type II error: Conclude the compensation plan will not increase mean sales above the current
level of $8000 (i.e., conclude μ ≤ $8000) when in fact it will. A negative consequence: Lose out on
the opportunity to implement a plan that will increase mean sales.
8. (a) H0: μ ≥ $220
H1: μ < $220
(b) Type I error: Conclude the mean operating cost will be below the current mean of $220 when it
won’t. A negative consequence: Waste time and money implementing a new production method that
won’t reduce costs (and could even increase costs).
(c) Type II error: Conclude the mean operating cost will be at least as great as the current mean of
$220 when in reality it will be lower. A negative consequence: Lose the opportunity to reduce
operating costs.
Problem | H0 and H1 | Calculated value of test statistic | p-value | Conclusion
--------|-----------|------------------------------------|---------|-----------
16 | H0: μ ≥ 13 hpm; H1: μ < 13 hpm | z = -2.88 | .002 | very strong evidence that μ < 13 hpm [at α = .01, reject H0]
17 | H0: μ ≤ 15 min.; H1: μ > 15 min. | z = 2.96 | .0015 | very strong evidence that μ > 15 min. [at α = .01, reject H0]
19 | H0: μ ≥ $181,900; H1: μ < $181,900 | z = -2.93 | .0017 | very strong evidence that μ < $181,900 [at α = .01, reject H0]
25 | H0: μ = 39.2 hr.; H1: μ ≠ 39.2 hr. | z = -1.54 | 2(.0618) = .1236 | conclude that μ may be 39.2 hr. [at α = .05, don’t reject H0]
30 | H0: μ = $26,133; H1: μ ≠ $26,133 | z = -2.09 | 2(.0183) = .0366 | evidence that μ ≠ $26,133 [at α = .05, reject H0]
38 | H0: μ = $90; H1: μ ≠ $90 | t = -1.90 [24 df] | 2(.025) < p < 2(.05), i.e., .05 < p < .10 | marginal evidence that μ ≠ $90 [at α = .05, don’t reject H0]
41 | H0: μ ≤ 280 yd; H1: μ > 280 yd | t = 2.07 [8 df] | .025 < p < .05 | evidence that μ > 280 yd [at α = .05, reject H0]
42 | H0: μ ≤ 2 hours; H1: μ > 2 hours | t = 2.53 [9 df] | .01 < p < .025 | evidence that μ > 2 hours [at α = .05, reject H0]
48 | H0: p ≤ .50; H1: p > .50 | z = 3.13 | .0009 | very strong evidence that p > .50 [at α = .01, reject H0]
50 | H0: p = .50; H1: p ≠ .50 | z = 2.83 | 2(.0023) = .0046 | very strong evidence that p ≠ .50 [at α = .01, reject H0]
59 | H0: μ = 120 bars; H1: μ ≠ 120 bars | t = -.71 [9 df] | p > .20 | conclude that μ may be 120 bars [at α = .05, don’t reject H0]
66 | H0: p = .78; H1: p ≠ .78 | z = 2.17 | 2(.015) = .03 | evidence that p ≠ .78 [at α = .05, reject H0]
68 | H0: p ≥ .47; H1: p < .47 | z = -2.64 | .0041 | very strong evidence that p < .47 [at α = .01, reject H0]
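The p-values for the z-based problems above come from one-tailed or doubled tail areas of the standard normal distribution. A minimal Python sketch of that lookup is given below; the function name `z_p_value` and the labels "less", "greater", and "two-sided" are assumptions made for illustration, and exact areas can differ from z-table values in the fourth decimal place.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_p_value(z, alternative):
    """p-value for a z statistic.

    alternative is "less" (H1: <), "greater" (H1: >), or "two-sided" (H1: not equal).
    """
    if alternative == "less":
        return STANDARD_NORMAL.cdf(z)
    if alternative == "greater":
        return 1 - STANDARD_NORMAL.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


print(round(z_p_value(-2.88, "less"), 4))       # problem 16
print(round(z_p_value(-1.54, "two-sided"), 4))  # problem 25
print(round(z_p_value(3.13, "greater"), 4))     # problem 48
```

The t-based problems (38, 41, 42, 59) are not covered here, since Python's standard library provides no t-distribution; those p-values are bracketed from a t-table as shown in the answers.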
Chapter 10
5. (a) x̄1 - x̄2 = 22.5 - 18.6 = 3.9 miles is the point estimate of μ1 - μ2
(b) (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] = 22.5 - 18.6 ± 1.96SQRT[(8.4)²/50 + (7.4)²/100] = 3.9 ± 2.7 = (1.2, 6.6).
We are 95% confident that the mean number of miles traveled per day in a car by Buffalo residents is
between 1.2 and 6.6 miles greater than the mean traveled by Boston residents.
6. (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] = 6.34 - 6.72 ± 1.96SQRT[(2.16)²/50 + (2.37)²/50] =
-.38 ± .89 = (-1.27, .51). We are 95% confident that the mean rating business travelers would give the
Miami airport is between 1.27 points less and .51 points more than the mean rating they would give the
Los Angeles airport. (So we do not have evidence of a difference in the mean ratings.)
7. (a) 14.9 - 10.3 = 4.6 yr
(c) (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] = 14.9 - 10.3 ± 1.96SQRT[(5.2)²/100 + (3.8)²/85] = 4.6 ± 1.3 = (3.3,
5.9). We are 95% confident that the mean number of years of work experience of men is between 3.3 and
5.9 years greater than that of women.
Chapter 11
3. (p̂1 - p̂2) ± z·SQRT[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2] =
.55 - .48 ± 1.96SQRT[(.55)(.45)/400 + (.48)(.52)/400] = .07 ± .07 = (.00, .14). We are 95% confident
that the percentage of senior executives at large corporations thinking the number of full-time employees
would increase at their companies over the next 12 months was between 0 and 14 percentage points
greater in May 1997 than it was in December 1996.
8b. (p̂1 - p̂2) ± z·SQRT[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2] =
.42 - .30 ± 1.96SQRT[(.42)(.58)/150 + (.30)(.70)/200] = .12 ± .10 = (.02, .22). We are 95% confident
that the percentage of individuals seeing commercial A who would remember the primary message is
between 2 and 22 percentage points greater than the percentage of individuals seeing commercial B who
would remember it.
31b. (p̂1 - p̂2) ± z·SQRT[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2] =
.19 - .10 ± 1.96SQRT[(.19)(.81)/400 + (.10)(.90)/900] = .09 ± .04 = (.05, .13). We are 95% confident
that the percentage of single policyholders who made an insurance claim over the past three years is
between 5 and 13 percentage points greater than the percentage of married policyholders who did so.
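The two-sample intervals in Chapters 10 and 11 share the same shape: a difference of point estimates plus or minus z times the square root of a sum of two variance terms. The sketch below is a minimal Python illustration under that reading; the function names are hypothetical, and z = 1.96 is passed in to match the table value used in the answers.

```python
from math import sqrt


def diff_means_interval(x1, s1, n1, x2, s2, n2, z):
    """(x1 - x2) +/- z * sqrt(s1^2/n1 + s2^2/n2), for large independent samples."""
    margin = z * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


def diff_proportions_interval(p1, n1, p2, n2, z):
    """(p1 - p2) +/- z * sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), using sample proportions."""
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin


# Chapter 10, problem 5(b): Buffalo vs. Boston miles traveled per day
miles = diff_means_interval(22.5, 8.4, 50, 18.6, 7.4, 100, 1.96)
# Chapter 11, problem 8b: commercial A vs. commercial B recall
recall = diff_proportions_interval(0.42, 150, 0.30, 200, 1.96)
print([round(v, 1) for v in miles])
print([round(v, 2) for v in recall])
```

An interval that contains 0, as in Chapter 10 problem 6, indicates no evidence of a difference at the corresponding confidence level.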
Unit Two Formula Sheet
note: This formula sheet assumes simple random sampling and that every sample size employed is less
than 5% of the population size.
(1) X has mean μ and standard deviation σ and (2) X is normally distributed or n ≥ 30 ⇒ X̄ is
normally distributed (or approximately so) with mean μ and standard deviation (standard error)
σ/SQRT(n)
np ≥ 5 and n(1-p) ≥ 5 ⇒ p̂ ≈ normally distributed with mean p and standard deviation (standard
error) SQRT[p(1 - p)/n]
Parameter | Situation | Confidence Interval for Parameter | Comment
----------|-----------|-----------------------------------|--------
μ | n ≥ 30. σ unknown. | x̄ ± z·s/SQRT(n) | s/SQRT(n) is serving as an estimate of the true standard error σ/SQRT(n). This is justified by the large sample size.
μ | X normally distributed. σ unknown. | x̄ ± t·s/SQRT(n) | Reference the t-distribution with n-1 degrees of freedom.
μ | σ known. X normally distributed or n ≥ 30. | x̄ ± z·σ/SQRT(n) | It is rare to know σ when you don’t even know μ.
μ1-μ2 | n1 ≥ 30. n2 ≥ 30. Independent samples drawn. σ1 and σ2 unknown. | (x̄1 - x̄2) ± z·SQRT[s1²/n1 + s2²/n2] | Within each confidence interval, the square root is serving as an estimate of the true standard error. This is justified by the large sample size(s).
p | np ≥ 5. n(1-p) ≥ 5. | p̂ ± z·SQRT[p̂(1 - p̂)/n] | Same comment as for μ1-μ2.
p1-p2 | n1p1 ≥ 5. n1(1-p1) ≥ 5. n2p2 ≥ 5. n2(1-p2) ≥ 5. Independent samples drawn. | (p̂1 - p̂2) ± z·SQRT[p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2] | Same comment as for μ1-μ2.
Parameter for which confidence interval sought | Sample size needed to achieve pre-set confidence level and margin of error (E) targets | Comment
-----------------------------------------------|------------------------------------------------------------------------------------------|--------
μ | n = z²(estimate of σ²)/E² | If calculated n < 30, actually sample 30.
p | n = z²(estimate of p)(1 - estimate of p)/E² | If no estimate of p is available, substitute .50 for estimate of p.
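For the one-sample rows of the confidence interval table, the z and t cases differ only in which critical value multiplies the estimated standard error. The sketch below is a minimal Python illustration with a hypothetical function name; the critical value is supplied by the caller from a z- or t-table (as the handout does), because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt


def interval_from_critical_value(xbar, s, n, critical_value):
    """xbar +/- critical_value * s / sqrt(n); the caller supplies z or t from a table."""
    margin = critical_value * s / sqrt(n)
    return xbar - margin, xbar + margin


# Heat-sensing devices (practice problem 8): n = 20, so 19 df and t = 2.093 for 95% confidence
low, high = interval_from_critical_value(158.7, 8.4, 20, 2.093)
print(round(low, 1), round(high, 1))
```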
p-value approach to testing a hypothesis about
a population mean  or a population proportion p
Step 1. State the two competing hypotheses. [note: μ0 denotes a constant; p0 denotes a constant]
Parameter | set-up A | set-up B | set-up C
----------|----------|----------|---------
μ | H0: μ ≥ μ0; H1: μ < μ0 | H0: μ ≤ μ0; H1: μ > μ0 | H0: μ = μ0; H1: μ ≠ μ0
p | H0: p ≥ p0; H1: p < p0 | H0: p ≤ p0; H1: p > p0 | H0: p = p0; H1: p ≠ p0
Step 2. Choose an appropriate test statistic, and calculate the value of the test statistic for the
sample obtained.
Parameter | Situation | Test statistic
----------|-----------|---------------
μ | n ≥ 30. σ unknown. | z = (x̄ - μ0)/(s/SQRT(n))
μ | X normally distributed. σ unknown. | t = (x̄ - μ0)/(s/SQRT(n)) [n - 1 d.f.]
μ | σ known. X normally distributed or n ≥ 30. | z = (x̄ - μ0)/(σ/SQRT(n))
p | np ≥ 5 and n(1-p) ≥ 5. | z = (p̂ - p0)/SQRT[p0(1 - p0)/n]
Step 3. Determine the p-value.
For set-up A in step 1: p = P(test statistic ≤ calculated value of test statistic GIVEN H0 is true)
For set-up B in step 1: p = P(test statistic ≥ calculated value of test statistic GIVEN H0 is true)
For set-up C in step 1: p = 2P(test statistic ≥ |calculated value of test statistic| GIVEN H0 is true)
Step 4. Draw your conclusion. [Standard interpretations of p-values are given below.]
p-value | Standard interpretation
--------|------------------------
p < .005 | very strong evidence that H1 is true
.005 ≤ p < .01 | strong evidence that H1 is true
.01 ≤ p < .05 | evidence that H1 is true
.05 ≤ p < .10 | marginal evidence that H1 is true
p ≥ .10 | H0 may be true
_________
note: In any hypothesis testing situation, reject H0 at significance level α if and only if the p-value is < α.
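The four steps can be sketched in code for a set-up B test on a mean with n ≥ 30. This is a minimal Python illustration, not the handout's own material: the function names `interpret_p_value` and `reject_h0` are hypothetical, and the data are the bank waiting times from practice problem 11 (n = 36, mean 133 seconds, s = 26 seconds, H0: μ ≤ 120, H1: μ > 120).

```python
from math import sqrt
from statistics import NormalDist


def interpret_p_value(p):
    """Map a p-value to the standard interpretation listed in Step 4."""
    if p < 0.005:
        return "very strong evidence that H1 is true"
    if p < 0.01:
        return "strong evidence that H1 is true"
    if p < 0.05:
        return "evidence that H1 is true"
    if p < 0.10:
        return "marginal evidence that H1 is true"
    return "H0 may be true"


def reject_h0(p, alpha):
    """Reject H0 at significance level alpha if and only if p < alpha."""
    return p < alpha


# Step 2: z = (xbar - mu0) / (s / sqrt(n))
z = (133 - 120) / (26 / sqrt(36))
# Step 3 (set-up B): p = P(Z >= z) given H0 is true
p = 1 - NormalDist().cdf(z)
# Step 4: interpret and decide at alpha = .01
print(round(z, 2), round(p, 4), interpret_p_value(p), reject_h0(p, 0.01))
```

The boundaries are half-open, matching the table in Step 4: a p-value of exactly .005 falls in the "strong evidence" band, not the "very strong evidence" band.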