Statistics 2014, Fall 2001

Chapter 7 – Inferences Concerning a Mean
Definition: Inferential statistics is the branch of statistics concerned
with inferring the characteristics of populations (i.e., parameter
values) based on the information contained in sample data sets.
Inferential statistics includes estimation of parameters and
hypothesis testing.
There are two branches of statistical inference: 1) estimation of
parameters and 2) testing hypotheses about the values of parameters.
We will consider estimation first.
Defn: A point estimate of a parameter θ is a specific numerical
value, θ̂, of a statistic, Θ̂, based on the data obtained from a sample.
A particular statistic, such as Θ̂, used to provide a point estimate of a
parameter is called an estimator.
Note: Θ̂ is a function of X1, X2, ..., Xn, the elements of a random
sample, and is thus a random variable, with its own sampling
distribution.
Note: Nothing is said in the definition about the “goodness” of the
estimate. Any statistic is an estimate of a parameter. For a
particular parameter θ, some statistics provide better estimates than
others. We need to examine the statistical properties of each
estimator to decide which one gives the best estimate for a
parameter.
Defn: The point estimator Θ̂ is said to be an unbiased estimator of
the parameter θ if E(Θ̂) = θ. If the estimator is not unbiased, then
the difference B = E(Θ̂) − θ is called the bias of the estimator Θ̂.
Example: Assume that we have a r.s. X1, X2, ..., Xn from a
distribution with unknown mean μ. We want to find an unbiased
estimator of μ. From the linearity property of expectation, we have
E(X̄) = E[(1/n) ∑_{i=1}^{n} Xi] = (1/n) ∑_{i=1}^{n} E(Xi) = (1/n) ∑_{i=1}^{n} μ = μ.
Hence for any distribution, the sample mean is an unbiased estimator of the
distribution mean.
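
As an optional illustration (not part of the original notes), the following short Python sketch approximates E(X̄) by simulation: many samples of size n are drawn from a distribution whose mean is known, and the average of the resulting sample means settles near that mean.

import numpy as np

# Minimal simulation sketch: the average of many sample means is close to mu,
# illustrating that X-bar is an unbiased estimator of the distribution mean.
rng = np.random.default_rng(seed=1)
mu, n, reps = 10.0, 12, 100_000                      # values chosen for illustration only
samples = rng.exponential(scale=mu, size=(reps, n))  # any distribution with mean mu works
sample_means = samples.mean(axis=1)
print(sample_means.mean())                           # approximately 10 = mu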
Sometimes there are several unbiased estimators of a given
parameter. For example, each member of the sample is also an
unbiased estimator of the distribution mean. We want to decide
which of them provides the best estimate, based on the
characteristics of the sampling distributions of the estimators.
Suppose that we have selected a r.s. X1, X2, ..., Xn from a
distribution that is characterized by an unknown parameter θ.
Suppose that we have two statistics Θ̂1 and Θ̂2, both of which are
unbiased estimators of θ. Which estimator should we use?
Since we want our particular estimate to be close to the true value of
θ, we want to use an estimator whose sampling distribution has
small variance.
Defn: In the class of all unbiased estimators of a parameter θ, that
estimator whose sampling distribution has the smallest variance is
called the minimum variance unbiased estimator (MVUE) for θ.
Defn: The standard error of an estimator Θ̂ of a parameter θ is just
the square root of the variance of the sampling distribution of Θ̂:
S.E.(Θ̂) = √V(Θ̂).
Example: Estimating the population mean from sample data.
Parameter: Population mean, µ
Data: A random sample, 𝑋1 , 𝑋2 , … , 𝑋𝑛
Unbiased and Efficient Estimator: 𝑋̅
Estimator of Standard Error: S/√n
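
A minimal Python sketch of these two estimates (the data values below are placeholders, not taken from the notes):

import math
import statistics

data = [8.23, 8.31, 8.42, 8.29]                       # placeholder sample
xbar = statistics.mean(data)                          # point estimate of mu
se = statistics.stdev(data) / math.sqrt(len(data))    # estimated standard error, S/sqrt(n)
print(xbar, se)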
In any given estimation situation, we have no way of knowing how
close the point estimate is to the true value of the parameter, since
the true value of the parameter is unknown. We can be virtually
certain, however, that the estimated value is not exactly equal to the true value.
We want to be able to say how good our estimate is. Hence, we
want to extend the idea of a point estimate to the following type of
estimation.
Defn: A confidence interval estimate of a parameter is an interval
obtained based on a point estimate, together with a percentage that
specifies how confident we are that the true value of the parameter
lies in the interval. This percentage is called the confidence level, or
confidence coefficient.
The general procedure for obtaining a confidence interval estimate
for a parameter  is as follows:
1) We choose our level of confidence, 1 − α (usually 90%, 95%,
or 99%).
2) We find statistics L and U such that P(L ≤ θ ≤ U) = 1 − α.
Interpreting a Confidence Interval: We say that we are (1 − α)·100%
confident that the true value of the parameter lies in the interval.
This means that the interval was obtained by a method such that
(1 − α)·100% of all intervals so obtained actually contain the true
parameter value.
Confidence interval for μ:
We choose our confidence level to be 1 − α. Then we can write the
statement
P( −t_{n−1, α/2} ≤ (X̄ − μ)/(S/√n) ≤ t_{n−1, α/2} ) = 1 − α.
Rearranging, we obtain
P( X̄ − t_{n−1, α/2}·(S/√n) ≤ μ ≤ X̄ + t_{n−1, α/2}·(S/√n) ) = 1 − α.
Then X̄ ± t_{n−1, α/2}·(S/√n) is a (1 − α)·100% confidence interval for μ.
Example 1: A machine produces metal rods used in an automobile
suspension system. A random sample of 12 rods is selected, and the
diameter of each is measured, resulting in the following data
(measurements are in millimeters):
8.23  8.31  8.42  8.29  8.19  8.24
8.19  8.29  8.30  8.14  8.32  8.40
We want a 95% confidence interval estimate for µ, the mean
diameter of all metal rods produced by the machine.
We must first calculate the sample mean and sample standard
deviation using the Descriptive Statistics function of Excel. Then
we will find the confidence interval using the CONFIDENCE.T
function of Excel. Caution: This Excel function gives the margin of
error only, given the confidence level, sample standard deviation,
and sample size. We must then find the confidence interval using
the previously calculated sample mean.
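
For readers working outside Excel, a sketch of the same computation in Python (assuming the SciPy library is available) is shown below; the quantity called margin here plays the same role as the margin of error returned by CONFIDENCE.T.

import numpy as np
from scipy import stats

rods = np.array([8.23, 8.31, 8.42, 8.29, 8.19, 8.24,
                 8.19, 8.29, 8.30, 8.14, 8.32, 8.40])
n = rods.size
xbar = rods.mean()
s = rods.std(ddof=1)                          # sample standard deviation
t_crit = stats.t.ppf(0.975, df=n - 1)         # t_{n-1, alpha/2} for 95% confidence
margin = t_crit * s / np.sqrt(n)              # margin of error
print(xbar - margin, xbar + margin)           # 95% confidence interval for mu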
Example 2: Corrosion of reinforcing steel is a serious problem in
concrete structures located in environments affected by severe
weather conditions. For this reason, researchers have been
investigating the use of reinforcing bars made of composite material.
One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design
recommendations for bond of GFRP rebars to concrete,” Journal of
Structural Engineering, 1996: 247-254). Consider the following 48
observations on measured bond strength:
11.5  12.1   9.9   9.3   7.8   6.2   6.6   7.0
13.4  17.1   9.3   5.6   5.7   5.4   5.2   5.1
 4.9  10.7  15.2   8.5   4.2   4.0   3.9   3.8
 3.6   3.4  20.6  25.5  13.8  12.6  13.1   8.9
 8.2  10.7  14.2   7.6   5.2   5.5   5.1   5.0
 5.2   4.8   4.1   3.8   3.7   3.6   3.6   3.6
We want to use this data to obtain a 99% confidence interval
estimate for µ, the overall mean bond strength.
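
A Python sketch of this computation (again assuming SciPy), using the built-in t interval routine rather than assembling the interval by hand:

import numpy as np
from scipy import stats

bond = np.array([11.5, 12.1,  9.9,  9.3,  7.8,  6.2,  6.6,  7.0,
                 13.4, 17.1,  9.3,  5.6,  5.7,  5.4,  5.2,  5.1,
                  4.9, 10.7, 15.2,  8.5,  4.2,  4.0,  3.9,  3.8,
                  3.6,  3.4, 20.6, 25.5, 13.8, 12.6, 13.1,  8.9,
                  8.2, 10.7, 14.2,  7.6,  5.2,  5.5,  5.1,  5.0,
                  5.2,  4.8,  4.1,  3.8,  3.7,  3.6,  3.6,  3.6])
n = bond.size
se = bond.std(ddof=1) / np.sqrt(n)            # estimated standard error S/sqrt(n)
low, high = stats.t.interval(0.99, df=n - 1, loc=bond.mean(), scale=se)
print(low, high)                              # 99% confidence interval for mu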
Sample Size for a Specified Margin of Error:
As part of our experimental design, we want to specify the margin of
error, E, that is acceptable for our estimate of , and choose a
sample size to ensure that we achieve this margin of error. We let
E = z_{α/2} · σ/√n.
Solving for n, we obtain
n = ( z_{α/2} · σ / E )².
Now, we know E and α, but we need to find a usable value for σ²
before we can find the sample size. Since we don’t know the value
of the sample variance until we collect the data, we have to go to
another source for a usable value of σ². Often, we do a literature
search for previous published research on the same topic. We then
use the sample variance from the previous research. Then the above
equation will give us a sample size for achieving the desired margin
of error with the desired level of confidence.
Example: Suppose that we are interested in the burning rate of a
solid propellant used to power aircrew escape systems; burning rate
is a random variable that can be described by a probability
distribution. Our interest focuses on the mean burning rate. We
want a 95% confidence interval estimate, and we want the margin of
error of our estimate to be no more than E = 1.5 cm./sec. Previous
studies have shown that the best estimate of the standard deviation
of the burning rate is σ = 2.0 cm./sec. We then need to find the
value of z_{α/2} = z_{0.025} = NORM.INV(0.975, 0, 1) = 1.96. Substituting
into the above formula, we find that the required sample size for
estimating µ with 95% confidence that our margin of error will be
no more than 1.5 cm./sec. is
n = (1.96/1.5)² × (2.0)² = 6.8295 ≅ 7.
Note that we always round up to obtain the required sample size.
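
A small Python helper (a sketch, assuming SciPy for the normal quantile) that packages this sample-size formula, including the round-up step:

import math
from scipy import stats

def required_sample_size(E, sigma, confidence=0.95):
    """Smallest n with margin of error at most E, using n = (z_{alpha/2}*sigma/E)^2."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return math.ceil((z * sigma / E) ** 2)         # always round up

print(required_sample_size(E=1.5, sigma=2.0, confidence=0.95))   # 7, as in the worked example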
Hypothesis Testing
Often, instead of estimating the value of a parameter based on
sample data, we simply want to decide whether we believe a specific
assertion about the value of the parameter.
Definition: An hypothesis is a statement about the value of a
population parameter.
Examples:
1) Nothing outlasts the Energizer.
2) More doctors recommend Tylenol for the relief of headache pain
than any other pain reliever.
Hypotheses are tested in pairs, to decide which of the two statements
is more believable.
The Null Hypothesis, H0
This hypothesis usually represents the state of no change or no
difference, from the researcher’s point of view. Often the null
hypothesis is a statement of current belief about the value of the
parameter; the researcher doubts the null hypothesis and wants to
disprove it. Symbolically, this hypothesis is never a strict
inequality.
The Alternative Hypothesis, Ha
This statement is what the researcher is attempting to prove. It
usually is the negation of the null hypothesis. Symbolically, this
hypothesis is always a strict inequality. This hypothesis can take
one of three forms for a parameter θ and a given number θ₀:
1) Ha: θ > θ₀
2) Ha: θ < θ₀
3) Ha: θ ≠ θ₀
Examples: Let pT be the proportion of all doctors who recommend
Tylenol, and let pA be the proportion of all doctors who recommend
alternatives to Tylenol. We want to test the following two
hypotheses against each other:
H0: pT ≤ pA
vs.
Ha: pT > pA
Whenever incomplete information, such as that from a sample, is
used to make an inference about the value of a population parameter,
there is the risk of making a mistake. In a hypothesis testing
situation, there are two possible mistakes that could be made.
Type I Error
This type of error occurs when our test leads us to reject the null
hypothesis when, in fact, the null hypothesis is true. The probability
of making a Type I error is denoted by the Greek letter α, and is
called the significance level of the test. In scientific research, a
Type I error is usually considered to be more serious. This error is
made when the researcher concludes that she has proved what she
wanted to prove, but this conclusion is mistaken.
Type II Error
This type of error occurs when our test leads us to fail to reject the
null hypothesis when, in fact, the null hypothesis is false. The
probability of making a Type II error is denoted by the Greek letter
β. This error is made when the researcher concludes that the data do
not give sufficient evidence to support the researcher’s original
conjecture, but in fact the conjecture is true. Later research may
then provide sufficient evidence to validate the researcher’s
conjecture.
Possible Results of a Hypothesis Test
                 Reject H0            Fail to Reject H0
H0 True          Type I Error         Correct decision
Ha True          Correct decision     Type II Error
Before conducting a hypothesis test, the researcher decides on an
acceptable level of risk for committing a Type I error (i.e., chooses a
value for α). Most commonly, α = 0.05. If a Type I error is deemed
to have more serious consequences, then the researcher may choose
a smaller value for α, such as 0.01 or 0.001. The researcher also
chooses the amount of evidence to collect (sample size or sizes).
Test Statistic
The researcher summarizes the information contained in the simple
random sample(s) in the form of a test statistic, a random variable
whose value is calculated from the sample data. The value of this
statistic will tell the researcher whether to reject H0 or to fail to
reject H0. The test statistic must be chosen so that its probability
distribution under the null hypothesis is known.
Rejection Region (or Critical Region)
The rejection region is that range of possible values of the test
statistic such that, if the actual value falls in this region, the
researcher will reject H0 and conclude that Ha is true. The boundary
point(s) of this region is(are) called the critical value(s) of the test.
The form of the rejection region depends on the form of the
alternative hypothesis:
1) If the alternative hypothesis has the form Ha: θ > θ₀, then the
rejection region is a right-hand tail of the distribution of the test
statistic, with area α.
2) If the alternative hypothesis has the form Ha: θ < θ₀, then the
rejection region is a left-hand tail of the distribution of the test
statistic, with area α.
3) If the alternative hypothesis has the form Ha: θ ≠ θ₀, then the
rejection region is the union of a right-hand tail and a left-hand tail
of the distribution of the test statistic, each having area α/2.
Note: You may be wondering why we don’t simply always choose
a very small value for the significance level of the test, so that we
will have a very slim chance of making a Type I error. The reason
is that, for a given level of evidence (sample size(s)), if we make the
probability of a Type I error smaller, we will automatically increase
the probability of making a Type II error. Our goal is to make both
probabilities as small as possible. Usually, the consequences of
making a Type I error are more serious than the consequences of
making a Type II error. Hence, we want to control α. It is
important, however, to make both α and β as small as possible. We
do this by choosing an appropriate sample size(s). For a given
chosen value of α, if we make our sample size(s) larger, we will
decrease the probability of making a Type II error.
Steps in Statistical Hypothesis Testing
The following steps must appear in each statistical hypothesis test.
The first four steps are the set-up of the test. These steps are
completed before the researcher chooses a sample(s) and collects
data.
Step 1: State the null hypothesis, H0, and the alternative hypothesis,
Ha. The alternative hypothesis represents what the researcher is
trying to prove. The null hypothesis represents the negation of what
the researcher is trying to prove. (In a criminal trial in the American
justice system, the null hypothesis is that the defendant is innocent;
the alternative is that the defendant is guilty; either the jury rejects
the null hypothesis if they find that the prosecution has presented
convincing evidence, or the jury fails to reject the null hypothesis if
they find that the prosecution has not presented convincing
evidence).
Step 2: State the size(s) of the sample(s). This represents the
amount of evidence that is being used to make a decision. State the
significance level, , for the test. The significance level is the
probability of making a Type I error. A Type I error is a decision in
favor of the alternative hypothesis when, in fact, the null hypothesis
is true. A Type II error is a decision to fail to reject the null
hypothesis when, in fact, the null hypothesis is false.
Step 3: State the test statistic that will be used to conduct the
hypothesis test. The following statement should appear in this step:
“The test statistic is _________ , which under H0 has a
_____________ probability distribution (with ____ degrees of
freedom).”
Step 4: Find the critical value for the test. This is found using the
T.INV function of Excel. This value represents the cutoff point for
the test statistic. If the value of the test statistic computed from the
sample data is beyond the critical value, the decision will be made to
reject the null hypothesis in favor of the alternative hypothesis.
Step 5: Calculate the value of the test statistic, using the sample
data. We find the sample mean and standard deviation using
Excel’s Descriptive Statistics, and then calculate t.
Step 6: Decide, based on a comparison of the calculated value of
the test statistic and the critical value of the test, whether to reject
the null hypothesis in favor of the alternative.
If the decision is to reject H0, the statement of the conclusion should
read as follows: “We reject H0 at the (value of α) level of
significance. There is sufficient evidence to conclude that
(statement of the alternative hypothesis).”
If the decision is to fail to reject H0, the statement of the conclusion
should read as follows: “We fail to reject H0 at the (value of α)
level of significance. There is not sufficient evidence to conclude
that (statement of the alternative hypothesis).”
Example 1: A machine produces metal rods used in an automobile
suspension system. The manufacturing specifications say that the
mean diameter of the rods should be 8.20 mm. The quality control
engineer wants to test for conformity to specifications. He will
select a random sample of 12 of the rods from a large production run
and use the data from the sample to test whether the mean diameter
of the rods differs from the specified value. He will use α = 0.05 as
the significance level of the test.
Testing Hypotheses Concerning a Population Mean, μ:
We want to test hypotheses of the following possible forms:
1) H0: μ = μ₀ vs. Ha: μ ≠ μ₀
2) H0: μ ≥ μ₀ vs. Ha: μ < μ₀
3) H0: μ ≤ μ₀ vs. Ha: μ > μ₀
X  0
 S  . Under the null
The test statistic to be used is


 n
hypothesis, the this statistic has an approximate t distribution with
d.f. = n-1.
T
For the three types of alternative hypotheses, the rejection regions
are:
1) Ha: μ ≠ μ₀   Reject H0 if |T| ≥ t_{n−1, α/2}
2) Ha: μ < μ₀   Reject H0 if T ≤ −t_{n−1, α}
3) Ha: μ > μ₀   Reject H0 if T ≥ t_{n−1, α}
Example 1: The data for the automobile suspension rods are given
again below. We will use this data set to perform the hypothesis
test.
8.23  8.31  8.42  8.29  8.19  8.24
8.19  8.29  8.30  8.14  8.32  8.40
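
A Python sketch of this test (assuming SciPy), computing both the test statistic and the critical value, with SciPy's own one-sample t routine as a cross-check:

import numpy as np
from scipy import stats

rods = np.array([8.23, 8.31, 8.42, 8.29, 8.19, 8.24,
                 8.19, 8.29, 8.30, 8.14, 8.32, 8.40])
n = rods.size
mu0, alpha = 8.20, 0.05
t_stat = (rods.mean() - mu0) / (rods.std(ddof=1) / np.sqrt(n))   # test statistic T
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)                    # critical value t_{n-1, alpha/2}
print(t_stat, t_crit, abs(t_stat) >= t_crit)                     # reject H0 if |T| >= t_crit
print(stats.ttest_1samp(rods, popmean=mu0))                      # same two-sided test via SciPy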
Example 2: Corrosion of reinforcing steel is a serious problem in
concrete structures located in environments affected by severe
weather conditions. For this reason, researchers have been
investigating the use of reinforcing bars made of composite material.
One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design
recommendations for bond of GFRP rebars to concrete,” Journal of
Structural Engineering, 1996: 247-254). Consider the following 48
observations on measured bond strength (all measurements in MPa):
11.5  12.1   9.9   9.3   7.8   6.2   6.6   7.0
13.4  17.1   9.3   5.6   5.7   5.4   5.2   5.1
 4.9  10.7  15.2   8.5   4.2   4.0   3.9   3.8
 3.6   3.4  20.6  25.5  13.8  12.6  13.1   8.9
 8.2  10.7  14.2   7.6   5.2   5.5   5.1   5.0
 5.2   4.8   4.1   3.8   3.7   3.6   3.6   3.6
It is desirable that the mean bond strength exceed 6.5 MPa. We
want to use the sample data to test whether μ > 6.5 MPa.
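
A Python sketch of this one-sided test (assuming SciPy; the significance level α = 0.05 below is an assumed choice, since the notes do not specify one for this example):

import numpy as np
from scipy import stats

bond = np.array([11.5, 12.1,  9.9,  9.3,  7.8,  6.2,  6.6,  7.0,
                 13.4, 17.1,  9.3,  5.6,  5.7,  5.4,  5.2,  5.1,
                  4.9, 10.7, 15.2,  8.5,  4.2,  4.0,  3.9,  3.8,
                  3.6,  3.4, 20.6, 25.5, 13.8, 12.6, 13.1,  8.9,
                  8.2, 10.7, 14.2,  7.6,  5.2,  5.5,  5.1,  5.0,
                  5.2,  4.8,  4.1,  3.8,  3.7,  3.6,  3.6,  3.6])
n = bond.size
mu0, alpha = 6.5, 0.05                                            # alpha assumed for illustration
t_stat = (bond.mean() - mu0) / (bond.std(ddof=1) / np.sqrt(n))    # test statistic T
t_crit = stats.t.ppf(1 - alpha, df=n - 1)                         # right-tail critical value t_{n-1, alpha}
print(t_stat, t_crit, t_stat >= t_crit)                           # reject H0 if T >= t_crit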