
Unit 7a Hypothesis Testing Single Sample

STAT 210: Probability and Statistics
Unit 7a: Tests of hypotheses for a single sample
Outline
◼ Introduction
◼ Testing of one mean
◼ Testing and Errors
◼ Testing of one proportion
STAT210: Probability and Statistics
An Example: Example 5.2 from the text
◼ Microdrills example in Chapter 5: we studied the average lifetimes.
◼ A sample of 50 microdrills had a mean of 12.68 and a standard deviation of 6.83.
◼ The population mean lifetime (μ) of all microdrills is unknown.
◼ Let us assume that the main question is whether or not the population mean lifetime μ is greater than 11.
◼ We can address this by conducting a hypothesis test.
Example
◼ We see that our sample mean is larger than 11, but because the sample mean varies from sample to sample, this does not guarantee that μ > 11.
◼ We would like to know just how certain we can be that μ > 11.
◼ The statement “μ > 11” is a hypothesis about the population mean μ.
◼ To determine just how certain we can be that a hypothesis is true, we must perform a hypothesis test.
Introduction
There are two common types of formal statistical inference:
1) Confidence intervals (interval estimation)
   ◼ They are appropriate when our goal is to estimate a population parameter.
2) Hypothesis testing
   ◼ It assesses the evidence (support) provided by the sample data in favor of some claim about the population.
   ◼ We perform a test of hypothesis only when we are making a decision about a population parameter based on the value of the sample statistic.
More Examples
◼ A computer system currently has 10 terminals and uses a single printer. The average turnaround time for the system is 15 minutes. Ten new terminals and a second printer are added to the system. Has the mean turnaround time been improved?
◼ Suppose a manufacturer claims their VHS tapes can hold 120 minutes of programming in SP mode. You believe they are shorter.
◼ The manufacturer of the ColorSmart-5000 television set claims that 95% of its sets last at least five years without needing a single repair.
Hypotheses
◼ A statistical hypothesis is a statement about the parameters of one or more populations.
◼ A null hypothesis H0 is a statement about a population parameter that is assumed to be true until the data provide sufficient evidence against it.
   ◼ The test is designed to assess the strength of the evidence against the null hypothesis.
◼ The alternative hypothesis Ha (or H1) is a claim about a population parameter that will be true if the null hypothesis is false.
◼ In performing a hypothesis test, we essentially put the null hypothesis on trial.
More on Alternative Hypothesis

The alternative hypothesis H1 will contain either a greater than sign (one-tailed test), a less than sign (one-tailed test), or a not equal to sign (two-tailed test).
◼ Greater than (>): results if the problem says increases, improves, better, result is higher, etc.
◼ Less than (<): results if the problem says decreases, reduces, worse than, result is lower, etc.
◼ Not equal to (≠): results if the problem says different from, no longer the same, changes, etc.
Procedure
◼ We begin by assuming that H0 is true.
◼ The random sample provides the evidence.
◼ The hypothesis test involves measuring the strength of the disagreement between the sample and H0 to produce a number between 0 and 1, called a P-value.
Test Statistic and P-value
◼ The test statistic is a quantity calculated from the sample data that we have collected. It is used to determine the strength of the evidence against H0.
◼ Based on the distribution of the test statistic, we compute the P-value.
◼ The smaller the P-value, the stronger the evidence is against H0.
◼ If the P-value is sufficiently small, we reject the assumption that H0 is true and believe H1 instead.
◼ This is referred to as rejecting the null hypothesis.
P-value
We take one final step to assess the evidence against H0. We compare the P-value with a fixed value, called the significance level (α). Typical values of α used are 0.05 and 0.01.
◼ If, for a test, the P-value is less than α, we reject the null hypothesis H0 and conclude that there is enough evidence to believe in H1.
Decision Summary

When we carry out the test we assume H0 is true. Hence the test will result in one of two decisions.
◼ Reject H0: we have sufficient evidence to conclude that the alternative hypothesis is true. Such a test is said to be significant.
◼ Fail to reject H0: we do not have sufficient evidence to conclude that the alternative hypothesis is true. Such a test is said to be not significant.
◼ We reject H0 if P-value < α.
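The decision rule above can be sketched in a few lines of Python. The function name `decide` and its return strings are illustrative choices, not part of the course material; the sample call uses the P-value and significance level from the battery example later in this unit.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a given P-value and significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Reject H0 only when the P-value is strictly smaller than alpha.
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Battery example from this unit: P-value = 0.1197, alpha = 0.05
print(decide(0.1197, alpha=0.05))
```

Note that "fail to reject H0" is deliberately not worded as "accept H0": a large P-value only means the sample did not provide enough evidence against H0.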
General Testing Procedure
1) State the null and alternative hypotheses.
2) Carry out the experiment, collect the data, verify the
assumptions, and compute the value of the test
statistic.
3) Calculate the p-value or the rejection region.
4) Make a decision on the significance of the test (reject
or fail to reject H0). Make a conclusion statement in the
words of the original problem.
Tests about Population Mean


◼ Goal: We hypothesize that the population mean (μ) equals some value μ0, and state the alternative hypothesis that we wish to prove is true.
◼ Case I: Normal population with known σ
   ◼ We use the one-sample z-test.
   ◼ $z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
   ◼ Not a realistic case.
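A minimal sketch of the Case I z-test, assuming Python 3.8+ so that `statistics.NormalDist` is available. The function name `one_sample_z_test` and the numbers in the sample call are hypothetical and chosen only to exercise the formula; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alternative="greater"):
    """Return (z, p_value) for a one-sample z-test with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value


# Hypothetical inputs (not from the slides): xbar = 52, mu0 = 50, sigma = 5, n = 25
z, p = one_sample_z_test(xbar=52.0, mu0=50.0, sigma=5.0, n=25, alternative="greater")
print(f"z = {z:.2f}, P-value = {p:.4f}")
```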
Tests about Population Mean
Case II: Used when σ is unknown
◼ For large sample sizes (n ≥ 30),
◼ or for small sample sizes (n < 30) from a Normal population.
◼ If the value of the standard deviation σ is unknown, then instead of the z test statistic we should use the t test statistic.
◼ For a small sample we must verify Normality before applying this procedure. We use the normal probability plot to check for normality.
◼ The test statistic follows a t distribution with degrees of freedom df = n − 1.
What do we do in practice?
In real applications σ is not known, and we use the one-sample t-test.
◼ If the sample size is small (n < 30), we need to check for normality.
◼ If the sample size is large (n ≥ 30), there is no need for a normality check.
One-sample t-test
1. Hypotheses
   ◼ H0: μ = μ0 vs. H1: μ > μ0, H1: μ < μ0, or H1: μ ≠ μ0
2. Test statistic
   $t_0 = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
3. P-value / critical region

H1         P-value             Critical region
μ > μ0     P(Tn-1 > t0)        t0 > tα, n-1
μ < μ0     P(Tn-1 < t0)        t0 < -tα, n-1
μ ≠ μ0     2P(Tn-1 > |t0|)     |t0| > tα/2, n-1
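As a sketch, the microdrill sample from the opening example (n = 50, mean 12.68, standard deviation 6.83, μ0 = 11, H1: μ > 11) can be pushed through this procedure. The Python standard library has no t distribution, so the upper-tail probability below is approximated by integrating the t density with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are assumptions of this sketch, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T_df > t) with Simpson's rule on [0, |t|] plus symmetry."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area


# Microdrill example: H0: mu = 11 vs. H1: mu > 11
n, xbar, s, mu0 = 50, 12.68, 6.83, 11.0
t0 = (xbar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t0, df=n - 1)
print(f"t0 = {t0:.3f}, df = {n - 1}, P-value = {p_value:.4f}")
```

For H1: μ < μ0 the P-value would be `1 - t_upper_tail(t0, df)`, and for H1: μ ≠ μ0 it would be `2 * t_upper_tail(abs(t0), df)`, matching the three rows of the table.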
In Minitab
Example
The life in hours of a battery is known to be approximately normally distributed. A random sample of 10 batteries has a mean life of 40.5 hours and a standard deviation of 1.25 hours. Is there evidence to support the claim that battery life exceeds 40 hours? Use α = 0.05.
1. Hypotheses: H0: μ ≤ 40 vs. H1: μ > 40
2. Test statistic (df = 10 − 1 = 9):
   $t_0 = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}} = \dfrac{40.5 - 40}{1.25 / \sqrt{10}} = 1.26$
3. P-value = P(T9 > 1.26) = 1 − P(T9 < 1.26) = 0.1197.
4. Since P-value = 0.1197 > 0.05, we do not reject H0. There is insufficient evidence to support the claim that battery life exceeds 40 hours.
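The same conclusion can be reached through the critical-region form of the test. The sketch below recomputes t0 and compares it with the critical value t0.05, 9 = 1.833, which is taken from a standard t table rather than from the slides.

```python
import math

# Battery example: H0: mu <= 40 vs. H1: mu > 40, alpha = 0.05
n, xbar, s, mu0 = 10, 40.5, 1.25, 40.0

standard_error = s / math.sqrt(n)
t0 = (xbar - mu0) / standard_error

# Assumption: critical value t_{0.05, 9} read from a standard t table.
t_crit = 1.833

reject_h0 = t0 > t_crit  # upper-tailed critical region
print(f"SE = {standard_error:.4f}, t0 = {t0:.2f}, reject H0: {reject_h0}")
```

Because t0 falls below the critical value, the critical-region approach agrees with the P-value approach: H0 is not rejected.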
Minitab output using one-sample t-test:
One-Sample T
Test of mu = 40 vs > 40
                                95% Lower
 N     Mean   StDev  SE Mean        Bound     T      P
10  40.5000  1.2500   0.3953      39.7754  1.26  0.119
Exercises
1) Before a substance can be deemed safe for
landfilling, its chemical properties must be
characterized.
An article reports that in a
sample of six sludge specimens from a New
Hampshire wastewater treatment plant, the
mean pH was 6.68 with a standard deviation of
0.20. Can we conclude that the mean pH is less
than 7.0?
Exercises
2) Ford Motor Company wants to test a new type of
engine to determine whether it meets new air-pollution standards. The mean emission of all engines
of this type must be less than 20 parts per million
(ppm) of carbon. Ten engines are manufactured for
testing purposes and the emission level of each is
determined. The data in ppm is listed below.
15.6 16.2 22.5 20.5 16.4 19.4 16.6 17.9 12.7 13.9
At α = 0.01, do the data supply sufficient evidence to
allow Ford to conclude that this type of engine meets
the pollution standard?
Confidence Intervals and Hypothesis Testing


◼ In hypothesis testing, another concept of interest is the relationship between hypothesis testing and confidence intervals.
◼ Assume that the same significance level α is used for a two-sided test and for the corresponding (1 − α) confidence interval:
   ◼ When the confidence interval contains the hypothesized mean, do not reject the null hypothesis.
   ◼ When the confidence interval does not contain the hypothesized mean, reject the null hypothesis.
Example

Sugar is packed in 5-pound bags. An inspector
suspects the bags may not contain 5 pounds. A sample
of 50 bags produces a mean of 4.6 pounds and a
standard deviation of 0.7 pound. Is there enough
evidence to conclude that the bags do not contain 5
pounds as stated, at α = 0.05? Also, find the 95%
confidence interval of the true mean.
Example
1. H0: μ = 5 vs. H1: μ ≠ 5
2. The test value is
   $t_0 = \dfrac{4.6 - 5}{0.7 / \sqrt{50}} = -4.04$
3. P-value ≈ 0
4. The null hypothesis is rejected. There is enough evidence to support the claim that the bags do not weigh 5 pounds.

One-Sample T
Test of mu = 5 vs not = 5
 N    Mean   StDev  SE Mean            95% CI      T      P
50  4.6000  0.7000   0.0990  (4.4011, 4.7989)  -4.04  0.000
Example
The 95% confidence interval for the mean is given by
$\bar{x} \pm t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}} \;\Rightarrow\; 4.6 \pm (2.01)\dfrac{0.7}{\sqrt{50}} \;\Rightarrow\; (4.401, 4.799)$

One-Sample T
 N    Mean   StDev  SE Mean            95% CI
50  4.6000  0.7000   0.0990  (4.4011, 4.7989)

Notice that the 95% confidence interval of μ does not contain the hypothesized value μ0 = 5. Hence, there is agreement between the hypothesis test and the confidence interval.
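The agreement between the two-sided test and the confidence interval can be checked numerically for the sugar example. This is a small sketch using only the numbers on the slides, including the critical value 2.01; the variable names are illustrative.

```python
import math

# Sugar example: H0: mu = 5 vs. H1: mu != 5, alpha = 0.05
n, xbar, s, mu0 = 50, 4.6, 0.7, 5.0
t_crit = 2.01  # t_{0.025, 49}, the value used on the slide

standard_error = s / math.sqrt(n)
t0 = (xbar - mu0) / standard_error

# 95% confidence interval for mu
lower = xbar - t_crit * standard_error
upper = xbar + t_crit * standard_error

reject_by_test = abs(t0) > t_crit                  # two-sided critical region
reject_by_interval = not (lower <= mu0 <= upper)   # mu0 outside the interval

print(f"t0 = {t0:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print(f"reject by test: {reject_by_test}, reject by interval: {reject_by_interval}")
```

The two flags always match because |t0| > tα/2, n-1 is algebraically the same condition as μ0 lying more than tα/2, n-1 standard errors away from the sample mean.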
Errors
Four scenarios when making a decision based on a sample

                     H0 true                     H0 false
Do not reject H0     Great!                      Type II Error
                                                 (H0 accepted given H0 false)
                                                 P(Type II Error) = β
Reject H0            Type I Error                Great!
                     (false rejection of H0)
                     P(Type I Error) = α
                     (significance level)
Error Types
◼ A type I error occurs when the null hypothesis (H0) is rejected when in fact H0 is true.
   P(Type I error) = P(Reject H0 | H0 is true) = α
◼ A type II error occurs when we fail to reject the null hypothesis (H0) when in fact H0 is false.
   P(Type II error) = P(Fail to reject H0 | H0 is false) = β
◼ Note that, for a fixed sample size, when α decreases β increases. If we want to control β as well, we need to increase the sample size.
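The trade-off between α and β can be illustrated with the upper-tailed z-test of Case I, where β has a closed form: β = Φ(zα − (μ1 − μ0)√n / σ) when the true mean is μ1 > μ0. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the function name `type_ii_error` and all numeric inputs are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, mu0, mu1, sigma):
    """beta for the upper-tailed z-test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)   # critical value of the test
    shift = (mu1 - mu0) * sqrt(n) / sigma       # mean of the test statistic under mu1
    return std_normal.cdf(z_alpha - shift)      # P(fail to reject H0 | mu = mu1)


# Hypothetical setting (not from the slides): mu0 = 50, true mean 52, sigma = 5
for alpha in (0.05, 0.01):
    for n in (25, 100):
        beta = type_ii_error(alpha, n, mu0=50.0, mu1=52.0, sigma=5.0)
        print(f"alpha = {alpha:.2f}, n = {n:3d}, beta = {beta:.4f}")
```

With n held fixed, the smaller α gives the larger β; with α held fixed, the larger n gives the smaller β.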
Tests for a Population Proportion
◼ Our hypothesis test is similar to the one we saw before. But now we have a sample that consists of successes and failures, e.g. a “success” may be a defective wafer.
◼ The population proportion of defective wafers is p.
◼ A supplier claims that the proportion of defective wafers in his supply is less than 0.1 (or 10%), i.e., p < 0.1.
◼ Since our hypothesis concerns a population proportion, it is natural to base the test on the sample proportion.
Tests on a population proportion (p)
◼ Now, we consider testing hypotheses about a population proportion p when the sample size n is large.
◼ We hypothesize that the population proportion p equals some specified value p0 and we want to use the data in a sample to test whether this null hypothesis is appropriate or whether we should reject the null hypothesis in favor of some alternative hypothesis.
◼ To use the one-proportion z-test, we must have both np0 ≥ 5 and n(1 − p0) ≥ 5.
One-proportion z-test
1. Hypotheses
   ◼ H0: p = p0 vs. H1: p > p0, H1: p < p0, or H1: p ≠ p0
2. Test statistic
   $z_0 = \dfrac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$, where $\hat{p} = \dfrac{x}{n}$
3. P-value and critical region:

H1         P-value            Critical region
p > p0     P(Z > z0)          z0 > zα
p < p0     P(Z < z0)          z0 < -zα
p ≠ p0     2[P(Z > |z0|)]     |z0| > zα/2
Example

Scientists think that robots will play an increasingly
crucial role in factories over the next 20 years.
Suppose that in an experiment to determine
whether the use of robots to weave computer
cables is feasible, a robot was used to assemble
500 cables. The cables were examined and 14
defectives were found. If human assemblers have
a defect rate of 3%, does this data support the
hypothesis that the rate of defectives is lower for
robots than for humans? Use α = 0.01.
Example
1. Hypotheses: H0: p = 0.03 vs. H1: p < 0.03
   ◼ np0 = (500)(0.03) = 15 > 5 and n(1 − p0) = 485 > 5.
2. Test statistic
   $\hat{p} = \dfrac{14}{500} = 0.028$ and $z_0 = \dfrac{0.028 - 0.03}{\sqrt{(0.03)(0.97)/500}} = -0.26$
3. P-value = P(Z < −0.26) = 0.3974
4. Since P-value = 0.3974 > 0.01, we do not reject H0. There is insufficient evidence to conclude that the rate of defectives is lower for robots than for humans.
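The robot calculation can be reproduced with a short sketch, assuming Python 3.8+ for `statistics.NormalDist`. The function name `one_proportion_z_test` is illustrative. Because the code uses the unrounded z0, its P-value differs slightly in the third decimal from the table-based 0.3974 above.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, alternative="less"):
    """Return (p_hat, z0, p_value) for the large-sample one-proportion z-test."""
    # Large-sample condition stated in this unit.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")
    p_hat = x / n
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z0)
    elif alternative == "less":
        p_value = std_normal.cdf(z0)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z0)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return p_hat, z0, p_value


# Robot example: 14 defectives in 500 cables, H0: p = 0.03 vs. H1: p < 0.03
p_hat, z0, p_value = one_proportion_z_test(x=14, n=500, p0=0.03, alternative="less")
alpha = 0.01
print(f"p_hat = {p_hat:.3f}, z0 = {z0:.2f}, P-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```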
In Minitab
Example
Minitab output using one-proportion z-test:
Test and CI for One Proportion
Test of p = 0.03 vs p < 0.03
                               95% Upper
Sample   X    N  Sample p         Bound  Z-Value  P-Value
1       14  500  0.028000      0.040135    -0.26    0.397
Using the normal approximation.
Exercise 1

A telephone company is trying to decide whether some
new lines in a large community should be installed
underground. Because a small surcharge will be added
to telephone bills to pay for the extra installation
costs, the company has decided to survey customers
and proceed only if the survey strongly indicates that
more than 60% of all customers favor underground
installation. If 118 of 160 customers surveyed favor
underground installation despite the surcharge, what
should the company do? Test the relevant hypothesis
using α = 0.05.
Exercise 2
An article presents a method for measuring
orthometric heights above sea level. For a sample
of 1225 baselines, 926 gave results that were
within the class C spirit leveling tolerance limits.
Can we conclude that this method produces results within
the tolerance limits more than 75% of the time?