Statistics Chapter 9 Introduction to Statistical Tests

Chapter 9 Hypothesis Testing
Note:
• In Chapter 8 we used methods of estimating
the value of a parameter.
• In this chapter, we are drawing inferences
about the parameter by making decisions
concerning the value of the parameter
Two Hypotheses:
• Null hypothesis $H_0$: This is the statement that is
under investigation or being tested. Usually the
null hypothesis represents a statement of “no
effect,” “no difference,” or, put another way,
“things haven’t changed.”
• Alternate hypothesis $H_1$: This is the statement
you will adopt in the situation in which the
evidence (data) is so strong that you reject $H_0$. A
statistical test is designed to assess the strength
of the evidence (data) against the null hypothesis.
Real Life Example:
• You are deciding whether to break up with your significant other. Let $\mu$
be the mean of your feelings for him/her.
• Null hypothesis $H_0: \mu = 0$ (meaning you have no negative feelings for
your significant other)
• Alternate hypothesis $H_1: \mu < 0$ (meaning you have some negative
feelings for your significant other)
• If you go through the “data” and do not reject the
null hypothesis, then you should stay together.
• If you go through the “data” and reject the null
hypothesis (accept the alternate hypothesis), then you should break
up.
Actual Statistics Example:
• Ford advertises that its new Fusion gets 47 miles
per gallon. Let $\mu$ be the mean of the mileage
distribution for these cars. You assume that the
manufacturer will not underrate the car, but you
suspect that the mileage might be overrated.
• A) What is the null hypothesis?
• B) What is the alternate hypothesis?
Answer
• A) Null hypothesis:
– $H_0: \mu = 47$ mpg
• B) Alternate hypothesis:
– $H_1: \mu < 47$ mpg
– We have reason to believe that the
advertised mileage is too high. If the mean is not
47 mpg, then it is less than 47 mpg.
Group Work
• A company manufactures ball bearings for
precision machines. The average diameter of a
certain type of ball bearing should be 6.0 mm. To
check that the average diameter is correct, the
company formulates a statistical test.
• A) What is the null hypothesis? (Hint: what is the
company trying to test?)
• B) What is the alternate hypothesis? (Hint: if the
diameter is off in either direction, the company is in trouble)
Answer
• A) $H_0: \mu = 6.0$ mm
• B) $H_1: \mu \neq 6.0$ mm
Group Work
• A computer manufacturer averages 7%
defective parts. To check whether this average is
correct, the company formulates a statistical
test.
• A) What is the null hypothesis?
• B) What is the alternate hypothesis?
Answer
• A) $H_0: \mu = 7\%$
• B) $H_1: \mu \neq 7\%$
Note:
• Whether to use $<$, $>$, or $\neq$ in the
alternate hypothesis depends on the problem.
Read it and interpret it! The choice should be logical.
Types of statistical tests
• Assume that $H_0: \mu = k$
• A statistical test is:
– Left-tailed if $H_1$ states that the parameter is less
than the value claimed in $H_0$ ($H_1: \mu < k$)
– Right-tailed if $H_1$ states that the parameter is
greater than the value claimed in $H_0$ ($H_1: \mu > k$)
– Two-tailed if $H_1$ states that the parameter is
different from (or not equal to) the value claimed
in $H_0$ ($H_1: \mu \neq k$)
Hypothesis tests of $\mu$, given x is
normal and $\sigma$ is known
• Given that x has a normal distribution with
known standard deviation $\sigma$, then
• test statistic $= z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• where $\bar{x}$ = mean of a simple random sample
• $\mu$ = value stated in $H_0$
• n = sample size
Example:
• Rosie is a dog. Let x be a random variable that represents Rosie’s heart rate. From
past experience, the vet knows that x has a normal distribution with $\sigma = 12$. The
vet checked the manual and found that for dogs, $\mu = 115$ beats per min.
• Over the past six weeks, Rosie’s heart rates are:
93 109 110 89 112 117
• The vet is concerned that Rosie’s heart rate may be slowing. Do the data indicate
that this is the case?
• A) What are the null and alternate hypotheses?
• B) Compute the probability. (You have to find $\bar{x}$, then use the formula and charts to
find the probability.)
• C) What’s the conclusion?
Answer
• A) $H_0: \mu = 115$
• $H_1: \mu < 115$
• B) We find that $\bar{x} = 105.0$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{105.0 - 115}{12/\sqrt{6}} \approx -2.04$
• Using the z table, $P(\bar{x} < 105.0) = P(z < -2.04) = .0207$
• C) If $H_0: \mu = 115$ is in fact true, the probability of getting a sample mean
of $\bar{x} \le 105.0$ is about 2%. Because this probability is small, we reject the null
hypothesis and conclude that the alternate hypothesis $\mu < 115$ holds. The average
heart rate seems to be slowing.
• Even though the probability is small, this does not necessarily prove the null to be
false and the alternate to be true.
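The arithmetic in this answer can be checked with a short script. This is only a sketch: it assumes Python's standard-library `statistics.NormalDist` as a stand-in for the z table, and the variable names are illustrative rather than taken from the slides.

```python
from math import sqrt
from statistics import NormalDist, mean

rates = [93, 109, 110, 89, 112, 117]  # Rosie's six heart-rate readings
mu0 = 115    # value of mu stated in H0
sigma = 12   # known population standard deviation

x_bar = mean(rates)                              # sample mean
z = (x_bar - mu0) / (sigma / sqrt(len(rates)))   # z test statistic
p_left = NormalDist().cdf(z)                     # left-tailed area P(Z < z)

print(f"x_bar = {x_bar:.1f}, z = {z:.2f}, P-value = {p_left:.4f}")
```

The script uses the unrounded z, so the probability it prints can differ from the table value .0207 in the last digit.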
P-Value
• Assuming $H_0$ is true, the probability that the
test statistic will take on values as extreme as
or more extreme than the observed test
statistic (computed from sample data) is called
the P-value of the test. The smaller the P-value
computed from sample data, the
stronger the evidence against $H_0$
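What "as extreme as or more extreme than" means depends on the tail of the test. The sketch below shows one way to compute the P-value of a z statistic for each tail type, assuming the standard-library `NormalDist`; the function name `p_value` and the tail labels are illustrative choices, not notation from the text.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value of a z test statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if tail == "left":
        return std_normal.cdf(z)                  # area to the left of z
    if tail == "right":
        return 1 - std_normal.cdf(z)              # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))   # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")
```

For example, `p_value(-2.04, "left")` reproduces the left-tail area used in the heart-rate example.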
Look in the book and copy the graphs on Pg 405 and
406
Types of error
• There are two types of error: Type I and Type II.

Truth of $H_0$ | Our decision: do not reject $H_0$ | Our decision: reject $H_0$
If $H_0$ is true | Correct decision; no error | Type I error
If $H_0$ is false | Type II error | Correct decision; no error
Level of significance $\alpha$
• This is the probability of rejecting $H_0$ when it is
true. This is the probability of a type I error.
• Important!!!
• If P-value $\le \alpha$, we reject the null hypothesis
and say the data are statistically significant at the
level $\alpha$.
• If P-value $> \alpha$, we do not reject the null
hypothesis.
Probabilities associated with a
statistical test

Truth of $H_0$ | Our decision: do not reject $H_0$ | Our decision: reject $H_0$
If $H_0$ is true | Correct decision, with corresponding probability $1 - \alpha$ | Type I error, with corresponding probability $\alpha$, called the level of significance of the test
If $H_0$ is false | Type II error, with corresponding probability $\beta$ | Correct decision, with corresponding probability $1 - \beta$, called the power of the test
Power of the test $1 - \beta$
• This is the probability of rejecting $H_0$ when it
is, in fact, false.
Note:
• We usually choose $\alpha$ first.
• 1) Increasing $\alpha$ (level of significance) increases
$1 - \beta$ (power of the test), meaning we are more likely to
reject the null hypothesis when it is false.
• 2) Increasing $\alpha$ (level of significance) also increases the
probability of a type I error. Usually we want to use a
small $\alpha$. It means that we are usually more willing to
make an error by failing to reject a claim ($H_0$) than to
make an error by accepting another claim ($H_1$) that is
false.
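The trade-off between $\alpha$ and power can be made concrete for a left-tailed z test. The sketch below reuses the heart-rate setup ($\mu = 115$ under the null, $\sigma = 12$, n = 6) and assumes, purely for illustration, that the true mean is 105. That true mean is a hypothetical value, not something stated in the notes, and the function name is made up.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_power(alpha, mu0, mu_true, sigma, n):
    """Probability of rejecting H0: mu = mu0 in a left-tailed z test
    when the true mean is mu_true (sigma known)."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                  # standard error of the sample mean
    z0 = std_normal.inv_cdf(alpha)        # critical value (negative)
    x_bar_cutoff = mu0 + z0 * se          # reject H0 when x_bar <= cutoff
    return std_normal.cdf((x_bar_cutoff - mu_true) / se)

# mu_true = 105 is a hypothetical "true" mean chosen for illustration.
for alpha in (0.01, 0.05, 0.10):
    power = left_tailed_power(alpha, mu0=115, mu_true=105, sigma=12, n=6)
    print(f"alpha = {alpha:.2f}: power = {power:.3f}, beta = {1 - power:.3f}")
```

As $\alpha$ grows, the printed power rises and $\beta$ falls, which is the pattern described in points 1) and 2).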
Example:
• Consider the ball bearing problem.
• $H_0: \mu = 6.0$ mm
• $H_1: \mu \neq 6.0$ mm
• A) Suppose the manufacturer requires a 1% level
of significance. Describe a type I error, its
consequence, and its probability.
• B) Discuss a type II error and its consequences.
Answer
• A) A type I error occurs when we reject the null
when, in fact, the average diameter of the ball
bearings being produced is 6.0 mm. A type I error
will cause a needless adjustment and delay of the
manufacturing process. The probability is 1%
because $\alpha = .01$.
• B) A type II error occurs when we fail to reject the null when it
is, in fact, false. It means that the bearings are
either too large or too small to meet
specifications. It could lead to product recalls.
Example:
• Let x be a random variable representing dividend yield
of Australian bank stocks. We may assume that x has a
normal distribution with $\sigma = 2.4\%$. A random sample
of 10 Australian bank stocks gave the following yields.
• 5.7 4.8 6.0 4.9 4.0 3.4 6.5 7.1 5.3 6.1
• For the entire Australian stock market, the mean
dividend yield is $\mu = 4.7\%$. Do these data indicate that
the dividend yield of all Australian bank stocks is higher
than 4.7%? Use $\alpha = .01$
Answer
• You first identify your variables
• $\alpha = .01$; $H_0: \mu = 4.7\%$; $H_1: \mu > 4.7\%$; right-tailed
• Then you find $\bar{x}$
• $\bar{x} = 5.38\%$, so $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{5.38 - 4.7}{2.4/\sqrt{10}} \approx .90$
• For z = .90 the table gives an area of .8159 to the left. Since the test is right-tailed, we
subtract from 1, so the P-value is .1841.
• Because .1841 > .01, we fail to reject the null hypothesis
• It means that there is insufficient evidence at the 0.01 level to
reject the claim that the average yield for bank stocks equals the average yield
for all stocks
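The steps in this answer (find the sample mean, convert it to z, find the P-value for the right tail, compare with $\alpha$) can be wrapped in one small helper. This is a sketch under the same assumptions as the example (normal x, known $\sigma$). The function name `z_test` and its arguments are illustrative, and `NormalDist` again stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma, tail, alpha):
    """One-sample z test with known sigma.
    Returns (z, p_value, reject_h0); tail is 'left', 'right', or 'two'."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    left_area = NormalDist().cdf(z)
    if tail == "left":
        p = left_area
    elif tail == "right":
        p = 1 - left_area
    else:
        p = 2 * min(left_area, 1 - left_area)
    return z, p, p <= alpha

yields = [5.7, 4.8, 6.0, 4.9, 4.0, 3.4, 6.5, 7.1, 5.3, 6.1]
z, p, reject = z_test(yields, mu0=4.7, sigma=2.4, tail="right", alpha=0.01)
print(f"z = {z:.2f}, P-value = {p:.4f}, reject H0: {reject}")
```

Because the script does not round z to .90 before looking up the area, its P-value is slightly different from .1841, but the decision (fail to reject) is the same.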
Group Work
• You are investigating the weight of a cereal box. A
random sample of six cereal boxes reveals the following
weights in grams:
• 3.7 2.9 3.8 4.2 4.8 3.1
• Let x be a random variable representing the weights of
all the cereal boxes. We assume that x has a normal
distribution and $\sigma = 0.70$ gram. It is known that the
mean weight of the cereal box is $\mu = 4.55$. Do the
data indicate that the mean weight of these cereal boxes
is less than 4.55 grams? Use $\alpha = 0.01$
Answer
• You first identify your variables
• $\alpha = .01$; $H_0: \mu = 4.55$; $H_1: \mu < 4.55$; left-tailed
• Then you find $\bar{x}$
• $\bar{x} = 3.75$, so $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{3.75 - 4.55}{.70/\sqrt{6}} \approx -2.80$
• For z = −2.80 the table gives an area of .0026 to the left. Since the test is left-tailed, we keep the
number, so the P-value is .0026.
• Because .0026 < .01, we reject the null hypothesis
• It means that there is sufficient evidence at the 0.01 level to reject
the null of 4.55 grams and accept the alternate hypothesis that the
cereal boxes have a lower average weight.
Group Work
• Water usually contains ammonia nitrogen. For many years,
the concentration has been 2.1 mg/l. Due to acid rain,
residents are worried that the level of ammonia nitrogen
has increased. Let x be a random variable
representing ammonia nitrogen concentration. Based on
recent studies of the water, we can assume that x has a
normal distribution with $\sigma = .27$. Recently, a random
sample of eight water tests gave the following:
• 2.5 2.7 3.1 2.8 3.0 2.2 2.9 2.5
• Do the data indicate that the mean concentration is greater
than 2.1 mg/l? Use $\alpha = .01$
Group Work
• Nationally, about 43% of all car accidents are caused
by teenagers. An insurance company is studying
damage claims (in %) in California. A random
sample of size 12 gave the following data:
• 50 64 34 26 53 27 24 79 42 43 13 54
• Assume that x has a normal distribution and $\sigma
= 8\%$
• Do these data indicate that the percentage of car
accidents in California is different from the
national mean? Use $\alpha = .05$
Homework Practice
• Pg 411 #1-14 even
TESTING THE MEAN $\mu$
Summary so far…
• 1) We first state the proposed value for a population parameter in the null
hypothesis $H_0$. The alternate hypothesis $H_A$ states alternative values of
the parameter, either $<$, $>$, or $\neq$ the value proposed in $H_0$. We also set the
level of significance $\alpha$. This is the risk we are willing to take of committing
a type I error. That is, $\alpha$ is the probability of rejecting $H_0$ when it is, in fact,
true.
• 2) We then use the corresponding sample statistic to challenge the statement
in $H_0$. We convert the sample statistic to a test statistic, which is the corresponding
value of the appropriate sampling distribution.
• 3) We compute the P-value of the statistic. The P-value is the probability of
getting a sample statistic as extreme as or more extreme than the
observed statistic from our random sample, assuming $H_0$ is true.
• 4) Conclusion. If P-value $\le \alpha$, then we say we have evidence to reject $H_0$
and adopt $H_A$. Otherwise, we say that the sample evidence is insufficient
to reject $H_0$.
• 5) Interpret the result
Example: Testing $\mu$, $\sigma$ known (you
should know how to do this)
• Let x be a random variable representing the number of sunspots
observed in a four-week period. A random sample of 40 such
periods from Spanish colonial times gave the following data:
• 12.5 14.1 27.4 53.5 65 134.7 45.3 61.0
37.6 73.9 114.0 39.0 48.3 67.3 70.0 43.8
56.5 59.7 24.0 12.0 104.0 54.6 4.4 177.3
70.1 54.0 28.0 13.0 72.7 81.2 24.1 20.4
13.3 9.4 25.7 50.0 12.0 7.25 11.3
• $\bar{x} = 47.0$. Previous studies of sunspot activity during this period
indicate that $\sigma = 35$. It is thought that for thousands of years, the
mean number of sunspots per four-week period was about $\mu = 41$.
Do the data indicate that the mean sunspot activity during the Spanish
colonial period was higher than 41? Use $\alpha = 0.05$
Answer
• $H_0: \mu = 41$
• $H_A: \mu > 41$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{47 - 41}{35/\sqrt{40}} \approx 1.08$
• P-value $= P(z > 1.08) \approx 0.1401$
• Since .1401 > .05, we do not reject $H_0$
• At the 5% level of significance, the evidence is not sufficient to
reject $H_0$. Based on the sample data, we cannot conclude that the average
sunspot activity during the Spanish colonial period was higher than
the long-term mean.
Testing $\mu$ when $\sigma$ is unknown
• 1) In the context of the application, state the null and alternate
hypotheses and set the level of significance $\alpha$
• 2) If you can assume that x has a normal distribution or simply has a
mound-shaped symmetric distribution, then any sample size n will
work. If you cannot assume this, then use a sample size $n \ge 30$.
• $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with degrees of freedom d.f. = n − 1
• 3) Use Student’s t distribution and the type of test, one-tailed or
two-tailed, to find (or estimate) the P-value corresponding to the
test statistic.
• 4) Conclude the test. If P-value $\le \alpha$, then we say we have evidence
to reject $H_0$ and adopt $H_A$. Otherwise, we say that the sample
evidence is insufficient to reject $H_0$
• 5) Interpret your conclusion
Key:
• Use one-tail area to estimate the P-value for left-tailed tests
• Use one-tail area to estimate the P-value for right-tailed tests
• Use two-tail area to estimate the P-value for two-tailed tests.
Example:
• The drug 6-mP is used to treat leukemia. The following
data represent the remission times (in weeks) for a random
sample of 21 patients using 6-mP.
• 10 7 32 23 22 6 16 34 32 25 11 20 19 6 17
35 6 13 9 6 10
• Assume the x distribution is mound-shaped and symmetric.
A previously used drug treatment had a mean remission
time of $\mu = 12.5$ weeks.
• Do the data indicate that the mean remission time using
the drug 6-mP is different from 12.5 weeks? Use $\alpha = 0.01$
Answer
• $H_0: \mu = 12.5$
• $H_A: \mu \neq 12.5$
• $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \approx \dfrac{17.1 - 12.5}{10.0/\sqrt{21}} \approx 2.108$
• d.f. = 21 − 1 = 20
• The sample statistic t = 2.108 falls between 2.086 and 2.528
• 0.02 < P-value < 0.05. Since the P-value is greater than 0.01, we do not reject the null.
• Even though the table does not give us the specific P-value, it does give a range that contains the specific
P-value. As the diagram shows, the entire range is greater than $\alpha$. This means we cannot reject $H_0$
• So at the 1% level of significance, the evidence is not sufficient to reject the null. We cannot say
that the drug 6-mP provides a different average remission time than the previous drug.
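The sample mean, sample standard deviation, and t statistic for the remission data can be reproduced as follows. This is a sketch using only the standard library, which has no Student's t distribution, so the P-value is bracketed with the two table entries quoted in the answer (2.086 and 2.528 for d.f. = 20) instead of being computed exactly. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weeks = [10, 7, 32, 23, 22, 6, 16, 34, 32, 25, 11,
         20, 19, 6, 17, 35, 6, 13, 9, 6, 10]   # remission times, n = 21
mu0 = 12.5                                      # value of mu stated in H0

n = len(weeks)
x_bar = mean(weeks)
s = stdev(weeks)                    # sample standard deviation (divides by n - 1)
t = (x_bar - mu0) / (s / sqrt(n))   # t test statistic
df = n - 1

# Table entries quoted in the answer for d.f. = 20:
# two-tail area 0.05 -> 2.086, two-tail area 0.02 -> 2.528
lower_t, upper_t = 2.086, 2.528
p_range = (0.02, 0.05) if lower_t < abs(t) < upper_t else None

print(f"x_bar = {x_bar:.1f}, s = {s:.1f}, t = {t:.3f}, d.f. = {df}")
print(f"P-value range: {p_range}")
```

The unrounded data give a t value a few thousandths away from 2.108, which was computed from the rounded values 17.1 and 10.0. It still falls in the same table interval.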
Group Work
• Suppose projectile points at a certain
archaeological site have mean length $\mu = 2.6$ cm. A
random sample of 11 recently discovered projectile
points in an adjacent cliff dwelling gave the following
lengths:
• 3.1 4.1 1.8 2.1 2.2 1.3 1.7 3.0 3.7 2.3 2.6
• Do these data indicate that the mean length of
projectile points in the adjacent cliff dwelling is longer
than 2.6 cm? Use $\alpha = .01$
Group Work
• USA Today reported that the state with the
longest mean life span is Hawaii, where the
population mean life span is 81 years. A random
sample of 15 obituaries gave the following
information about life span of the residents: 72
68 91 85 80 68 56 93 47 86 97 77 69
87 47
• Does the information indicate that the population
mean life span is less than 81 years? Use a 5% level of
significance
Group Work
• A good way to measure the value of a company is
the P/E, or price-to-earnings ratio. A high P/E
may indicate a stock is overpriced. For the
S&P Stock Index of all major stocks, the mean
P/E ratio is $\mu = 19.4$. A random sample of
36 pharmaceutical stocks gave a P/E ratio of
$\bar{x} = 17.5$ with $s = 6.1$. Does this indicate that
the mean P/E ratio of all pharmaceutical
stocks is different from the mean of S&P
stocks? Use $\alpha = .05$
Testing $\mu$ Using Critical Regions
(Traditional Method)
• It is very similar to what we have
been doing, but instead of comparing the P-value to $\alpha$, we compare the
test statistic to critical values.
Hypothesis Testing,
Critical Values $z_0$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Level of significance | $\alpha = 0.05$ | $\alpha = 0.01$
Critical value $z_0$ for a left-tailed test | −1.645 | −2.33
Critical value $z_0$ for a right-tailed test | 1.645 | 2.33
Critical values $\pm z_0$ for a two-tailed test | ±1.96 | ±2.58
Continue
• A) For a left-tailed test,
– i. If sample test statistic ≤ critical value, reject $H_0$
– ii. If sample test statistic > critical value, fail to reject $H_0$
• B) For a right-tailed test,
– i. If sample test statistic ≥ critical value, reject $H_0$
– ii. If sample test statistic < critical value, fail to reject $H_0$
• C) For a two-tailed test,
– i. If sample test statistic lies beyond critical values, reject
$H_0$
– ii. If sample test statistic lies between critical values, fail to
reject $H_0$
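The critical values in the table and the decision rules A) to C) can be sketched in code. The example assumes the standard-library `NormalDist.inv_cdf` in place of the printed table, so its values carry more decimals than the table (for example 2.326 rather than 2.33). The function names are illustrative, and treating a statistic exactly equal to a two-tailed critical value as a rejection is an assumption, since the rules above do not cover that case.

```python
from statistics import NormalDist

def critical_value(alpha, tail):
    """Critical value z0 for a z test; for 'two' the critical values are +/- z0."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def reject_h0(z, alpha, tail):
    """Apply the critical-region rule to a sample test statistic z."""
    z0 = critical_value(alpha, tail)
    if tail == "left":
        return z <= z0
    if tail == "right":
        return z >= z0
    return abs(z) >= z0   # assumption: equality counts as "beyond"

for tail in ("left", "right", "two"):
    print(tail, [round(critical_value(a, tail), 3) for a in (0.05, 0.01)])
```

With this helper, a right-tailed statistic of z = 1.08 at $\alpha = 0.05$ gives `reject_h0(1.08, 0.05, "right") == False`, since 1.08 is below 1.645.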
Example: (from previous example)
• Recall the sunspot example: a random sample of n = 40 four-week periods
from Spanish colonial times gave $\bar{x} = 47.0$, and previous studies indicate that $\sigma = 35$.
We test $H_0: \mu = 41$ against $H_A: \mu > 41$ using $\alpha = 0.05$.
Answer
• Because it is a right-tailed test with $\alpha = 0.05$, the critical value is $z_0 = 1.645$, and we found
$z \approx 1.08$. Since 1.08 < 1.645, we fail to reject
$H_0$.
TI 83/TI 84 Calculator
• In your calculator, press STAT, select TESTS, and
use option 1:Z-Test when $\sigma$ is known
and 2:T-Test when $\sigma$ is
unknown
Homework Practices
• Pg 426 #1-21 odd
TESTING A PROPORTION $\rho$
Intro
• Many situations arise that call for tests of
proportions or percentages rather than
means. For example, a college registrar may
want to determine if the proportion of
students wanting 3-week intensive courses
has increased.
Note:
• In this section, we will assume that the
situations we are dealing with satisfy the
conditions underlying the binomial
distribution. r is the number of successes out
of n trials, $\hat{p} = r/n$ is the sample proportion of successes, and q represents
the population probability
of failure.
We also assume $n\rho > 5$ and $nq > 5$, where $\rho$ is the population probability of success (also written p in the hypotheses).
Proportion $\rho$
• $z = \dfrac{\hat{p} - \rho}{\sqrt{\rho q / n}}$
• $\hat{p} = r/n$ is the sample test statistic
• n = number of trials
• $\rho$ = proportion specified in $H_0$
• $q = 1 - \rho$
The different tails
• Left-tailed test
• $H_0: p = k$
• $H_A: p < k$
• Right-tailed test
• $H_0: p = k$
• $H_A: p > k$
• Two-tailed test
• $H_0: p = k$
• $H_A: p \neq k$
Example:
• A team of eye surgeons has developed a new
technique for a risky eye operation to restore the sight
of people blinded from a certain disease. Under the
old method, it is known that only 30% of the patients
who undergo this operation recover their eyesight.
• Suppose that surgeons in various hospitals have
performed a total of 225 operations using the new
methods and that 88 have been successful. Can we
justify the claim that the new method is better than
the old one? Use 1% level of significance
Answer
• $H_0: p = .30$
• $H_A: p > .30$
• $\alpha = .01$
• $\hat{p} = \dfrac{88}{225} \approx .39$
• p = .30
• q = .70
• n = 225
• $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}} = \dfrac{.39 - .30}{\sqrt{.30(.70)/225}} \approx 2.95$
• $P(z > 2.95) \approx 0.0016$
• Since 0.0016 < .01, we reject the null and accept the alternate.
• At the 1% level of significance, the evidence shows that the population probability of success for
the new surgery technique is higher than that of the old technique.
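The proportion test above can be checked numerically. This sketch assumes the standard-library `NormalDist` for the right-tail area, and the variable names are illustrative. It keeps $\hat{p} = 88/225$ unrounded, so its z comes out slightly larger than the 2.95 obtained from the rounded value .39.

```python
from math import sqrt
from statistics import NormalDist

r, n = 88, 225        # successes and trials with the new method
p0 = 0.30             # proportion stated in H0
q0 = 1 - p0
alpha = 0.01

# Conditions for the normal approximation to the binomial
assert n * p0 > 5 and n * q0 > 5

p_hat = r / n                              # sample proportion
z = (p_hat - p0) / sqrt(p0 * q0 / n)       # z test statistic
p_value = 1 - NormalDist().cdf(z)          # right-tailed test

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

Either way the P-value is well below .01, so the conclusion does not depend on the rounding.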
Group Work
• A botanist has produced a new variety of hybrid
wheat that is better able to withstand drought
than other varieties. The botanist knows that for
the parent plants, the proportion of seeds
germinating is 80%. The proportion of seeds
germinating for the hybrid variety is unknown,
but the botanist claims it is 80%. To test this
claim, 400 seeds from the hybrid plant are tested,
and it is found that 312 germinate. Use a 5%
level of significance to test the claim that
proportion germinating for the hybrid is 80%
Group Work
• A recent study claims that 60% of SEC
investigations will be dropped. Suppose 500
cases have been reported and 384 cases were
dropped. Can we justify that these cases were
dropped more often than previously claimed?
Use a 5% level of significance.
Group Work
• If you watched Dragon Ball Z, you would know
that Super Saiyans are rare. It is
claimed that 2% of the population on Planet
Vegeta are Super Saiyans. In a recent study of
600 families, 30 claimed that their
son/daughter is a Super Saiyan. Do these
claims signal that families these days
produce more Super Saiyans than the
population proportion states? Use a 5% level of
significance.
Note:
• The central question in hypothesis testing is
whether or not you think the value of the sample
test statistic is too far away from the value of the
population parameter proposed in $H_0$ to occur by
chance alone.
• When you reject the null, are you absolutely
certain that you are making a correct decision?
The answer is NO! You are simply willing to take a
chance that you are making a type I error.
Note cont.
• 1) What if the P-value is so close to $\alpha$ that we
“barely” reject or fail to reject the null? In
such cases, researchers might attempt to
clarify the results by
• Increasing the sample size
• Controlling the experiment to reduce the
standard deviation
Note cont. 2
• 2) How reliable is the study and the measurements in
the sample?
• When reading results of statistical study, be aware of
the source of the data and the reliability of the
organization doing the study
• Is the study sponsored by an organization that might
profit or benefit from the stated conclusions? If so,
look at the study carefully to ensure that the
measurements, sampling technique, and handling of
data are proper and meet professional standards.
Homework Practice
• Pg 437 #1-22 eoe
TESTS INVOLVING PAIRED DIFFERENCES
(DEPENDENT SAMPLES)
Note:
• Many statistical applications use paired data
samples to draw conclusions about the difference
between two population means. Data pairs occur
very naturally in “before and after” situations,
where the same object or item is measured both
before and after a treatment.
• Example: Psychological studies of identical twins;
biological studies of plant growth on plots of land
matched for soil type, moisture, and sun, etc
Example:
• A shoe manufacturer claims that among the general
population of adults in the United States, the average
length of the left foot is longer than that of the right. To
compare the average length of the left foot with that of the
right, we can take a random sample of 15 U.S. adults and
measure the length of the left foot and then the length of
the right foot for each person in the sample. Is there a
natural way of pairing the measurements? How many pairs
will we have?
• Answer: We can pair each left foot measurement with the
same person’s right foot measurement. The person serves
as the “matching link” between the two distributions. We
will have 15 pairs of measurements.
How to Test Paired Differences Using
the Student’s t distribution
• Obtain a simple random sample of n matched data pairs A, B. Let d be a random variable
representing the difference between the values in a matched data pair. Compute the sample mean
$\bar{d}$ and sample standard deviation $s_d$
• 1) Use the null hypothesis of no difference, $H_0: \mu_d = 0$. In the context of the application, choose
the alternate hypothesis to be $H_A: \mu_d > 0$, $\mu_d < 0$, or $\mu_d \neq 0$. Set the level of significance $\alpha$.
• 2) If you can assume that d has a normal distribution or simply has a mound-shaped symmetric
distribution, then any sample size n will work. If you cannot assume this, then use a sample size $n
\ge 30$. Use $\bar{d}$, $s_d$, n, and $\mu_d = 0$ from the null hypothesis to compute the sample test statistic.
• $t = \dfrac{\bar{d} - 0}{s_d/\sqrt{n}} = \dfrac{\bar{d}\sqrt{n}}{s_d}$
• with d.f. = n − 1
• 3) Determine the tail
• 4) Conclude the test by comparing the P-value to $\alpha$
• 5) Interpret
Example:
• A team of heart surgeons at Saint Ann’s Hospital knows that many
patients who undergo corrective heart surgery have a dangerous
buildup of anxiety before their scheduled operations. The staff
psychiatrist at the hospital has started a new counseling program
intended to reduce this anxiety. A test of anxiety is given to
patients who know they must undergo heart surgery. Then each
patient participates in a series of counseling sessions with the staff
psychiatrist. At the end of the counseling sessions, each patient is
retested to determine anxiety level. Higher scores mean higher
levels of anxiety. From the given data, can we conclude that the
counseling sessions reduce anxiety? Use 1% level of significance.
• The data are shown in the table below.

Patient | B: score before counseling | A: score after counseling | d = B − A (difference)
1 | 121 | 76 | 45
2 | 93 | 93 | 0
3 | 105 | 64 | 41
4 | 115 | 117 | −2
5 | 130 | 82 | 48
6 | 98 | 80 | 18
7 | 142 | 79 | 63
8 | 118 | 67 | 51
9 | 125 | 89 | 36
Answer
• $H_0: \mu_d = 0$
• $H_A: \mu_d > 0$ (remember, a positive difference means reduced anxiety)
• $\alpha = 0.01$
• $\bar{d} \approx 33.33$
• $s_d \approx 22.92$
• $t = \dfrac{\bar{d} - 0}{s_d/\sqrt{n}} \approx \dfrac{33.33}{22.92/\sqrt{9}} \approx 4.363$
• 0.0005 < P-value < 0.005
• Since the entire range is less than .01, we reject the null hypothesis
and accept the alternate hypothesis.
• At the 1% level of significance, we conclude that the counseling sessions reduce anxiety:
we reject the claim that the mean difference between before and after is 0 and, taking the chance of a
type I error, accept the alternate hypothesis that it is greater than 0.
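The paired-difference computation can be reproduced from the table of before and after scores. This is a sketch using only the standard library. It computes $\bar{d}$, $s_d$, and t, and leaves the P-value to the t table (d.f. = 8), since the standard library has no Student's t distribution. The list names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

before = [121, 93, 105, 115, 130, 98, 142, 118, 125]   # B: score before counseling
after = [76, 93, 64, 117, 82, 80, 79, 67, 89]          # A: score after counseling

d = [b - a for b, a in zip(before, after)]   # paired differences d = B - A
n = len(d)
d_bar = mean(d)                  # sample mean of the differences
s_d = stdev(d)                   # sample standard deviation of the differences
t = (d_bar - 0) / (s_d / sqrt(n))
df = n - 1

print(f"d = {d}")
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.3f}, d.f. = {df}")
```

The small gap between the script's t and the 4.363 above comes from rounding $\bar{d}$ and $s_d$ before dividing.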
Group Work
• Do educational toys make a difference in the age
at which a child learns to read? To study this
question, a researcher designed an experiment in
which one group of preschool children spent 2
hours each day in a room well supplied with
educational toys. A control group of children
spent 2 hours a day in a noneducational toy
room. It was anticipated that IQ differences and
home environment might be uncontrollable
factors unless identical twins could be used. Here
is the chart. Use a 1% level of significance.
Ages are in months

Twin pair | Experimental group, B = reading age | Control group, A = reading age | Difference d = B − A
1 | 58 | 60 |
2 | 61 | 64 |
3 | 53 | 52 |
4 | 60 | 65 |
5 | 71 | 75 |
6 | 62 | 63 |
Group Work
• Athersys is a company that uses Multistem to
treat patients. Does Multistem really reduce
inflammation in a disease (IBD or UC)? The
company designed an experiment in which
one group receives the Multistem treatment
and a control group receives a placebo. Here are
the results of 6 patients. Use a 5% level of
significance.
The Mayo score ranges from 0 to 5; 0 is the best
and 5 is the worst

Trial | Experimental group, B = Multistem | Control group, A = placebo | Difference d = B − A
1 | 2 | 3 |
2 | 1 | 4 |
3 | 2 | 2 |
4 | 3 | 4 |
5 | 3 | 3 |
6 | 0 | 2 |
Group Work
• Are America’s top CEOs really worth all that
money? One way to answer this question is to
look at the annual company percentage
increase in revenue (B) versus the CEO’s annual
percentage salary increase (A). Do these data
indicate that the population mean percentage
increase in corporate revenue is different from
the population mean percentage increase in
CEO salary? Use a 5% level of significance
B: 24 23 25 18 6 4 21 37
A: 21 25 20 14 −4 19 15 30
Homework Practice
• Pg 449# 1-19 eoe
TESTING $\mu_1 - \mu_2$ AND $\rho_1 - \rho_2$
(INDEPENDENT SAMPLES)
Note:
In the last section we talked about how to test two
DEPENDENT samples.
In this section, we will turn our attention to tests
of differences of means from INDEPENDENT
samples. We will see new techniques for testing
the difference of means from independent
samples.
Note:
• There will be three situations
– Testing $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are known
– Testing $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown
– Testing $\rho_1 - \rho_2$
Group Work
• What is the difference between Independent
samples and dependent samples?
Group Work
• Determine if the situation is dependent or
independent.
• A teacher wishes to compare the effectiveness of two
teaching methods. Students are randomly divided into
two groups. The first group is taught by direct
instruction. The second group is taught by student-directed
learning. At the end of the course, a
comprehensive exam is given to all students, and the
mean score $\bar{x}_1$ is compared with $\bar{x}_2$. Are the samples
independent or dependent? Why?
Answer
• Independent, because the students were randomly
divided into two separate groups, so scores in one group are not paired with scores in the other.
Group Work
• Determine if the situation is dependent or
independent.
• A shoe manufacturer claimed that for the general
population of adult US citizens, the average length of
the left foot is longer than the average length of the
right foot. To study this claim, the manufacturer
gathers data in this fashion: sixty adult US citizens are
drawn at random, and for these 60 people, both left
and right feet are measured. Let $\bar{x}_1$ be the mean
length of the left feet and $\bar{x}_2$ be the mean length of the
right feet. Are the samples independent or dependent?
Answer
• Dependent. A person’s left foot is usually
related to the right foot, and the measurements are paired by person.
How to test $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are known
• Let $\sigma_1$ and $\sigma_2$ be the population standard deviations of populations 1 and 2. Obtain two
independent random samples from populations 1 and 2, where
– $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
– $n_1$ and $n_2$ are the sample sizes from populations 1 and 2
• 1. In the context of the application, state the null and alternate hypotheses and set the level of
significance. It is customary to use
– $H_0: \mu_1 - \mu_2 = 0$
• 2. If you can assume that both population distributions 1 and 2 are normal, any sample sizes $n_1$ and
$n_2$ will work. If you cannot assume this, then use sample sizes greater than 30 for both samples.
– $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
• 3. Use the standard normal distribution and the type of test, one-tailed or two-tailed, to find the P-value
• 4. Conclude
• 5. Interpret
How to test $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown
• Obtain two independent random samples from populations 1 and 2, where
– $\bar{x}_1$ and $\bar{x}_2$ are the sample means from populations 1 and 2
– $s_1$ and $s_2$ are the sample standard deviations from populations 1 and 2
– $n_1$ and $n_2$ are the sample sizes from populations 1 and 2
• 1. In the context of the application, state the null and alternate hypotheses and set the level of
significance. It is customary to use
– $H_0: \mu_1 - \mu_2 = 0$
• 2. If you can assume that both population distributions 1 and 2 are normal, any sample sizes $n_1$ and
$n_2$ will work. If you cannot assume this, then use sample sizes greater than 30 for both samples.
The test statistic is
– $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, with d.f. = the smaller of $n_1 - 1$ and $n_2 - 1$
• 3. Use the Student's t distribution with this d.f. and the type of test, one-tailed or two-tailed, to find the P-value
• 4. Conclude
• 5. Interpret
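The same procedure can be sketched in Python. The standard library has no Student's t distribution, so this illustration approximates the tail area by integrating the t density with Simpson's rule. The names `t_upper_tail` and `two_sample_t` are assumptions made for this sketch, and the d.f. follows the slide's rule of taking the smaller of $n_1 - 1$ and $n_2 - 1$.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even); intended for the small d.f. used in class.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3          # P(0 < T < |t|)
    tail = 0.5 - area             # P(T > |t|), by symmetry
    return tail if t >= 0 else 1.0 - tail


def two_sample_t(x1_bar, x2_bar, s1, s2, n1, n2, tail="two"):
    """Return (t, df, p_value) for H0: mu1 - mu2 = 0 with unknown sigmas."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (x1_bar - x2_bar) / se
    df = min(n1, n2) - 1          # conservative d.f. from the slide
    upper = t_upper_tail(t, df)
    if tail == "right":
        p_value = upper
    elif tail == "left":
        p_value = 1.0 - upper
    elif tail == "two":
        p_value = 2.0 * min(upper, 1.0 - upper)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return t, df, p_value
```

A calculator's 2-SampTTest typically uses a different (non-integer) d.f. approximation, so its P-value can differ slightly from the one this conservative rule gives.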
Note:
• Null hypotheses
• $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$
Note:
• Alternate hypotheses and the type of test
• $H_A: \mu_1 - \mu_2 < 0$ or $H_A: \mu_1 < \mu_2$ (left-tailed test)
• $H_A: \mu_1 - \mu_2 > 0$ or $H_A: \mu_1 > \mu_2$ (right-tailed test)
• $H_A: \mu_1 - \mu_2 \neq 0$ or $H_A: \mu_1 \neq \mu_2$ (two-tailed test)
Special situation: pooled 2-sample procedure
• In this situation, the samples are independent, but you
have reason to believe that $\sigma_1 = \sigma_2$
• An example is the weight of Asians in Asia vs Asians in U.S.
• $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with d.f. $= n_1 + n_2 - 2$
• Pooled standard deviation $s$ is
• $s = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
How to test a difference of proportions $\rho_1 - \rho_2$
• Consider two independent binomial experiments
• Binomial experiment (for both experiment 1 and experiment 2)
– $n_{1,2}$ = number of trials
– $r_{1,2}$ = number of successes
– $\hat{p}_{1,2} = \dfrac{r_{1,2}}{n_{1,2}}$
– $\rho_{1,2}$ = population probability of success on a single trial
• 1. Use the null hypothesis of no difference, $H_0: \rho_1 - \rho_2 = 0$, and set the level of significance
• 2. The pooled best estimates for the population probabilities of success and failure are
– $\bar{p} = \dfrac{r_1 + r_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$
– $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$
– Remember: $n_1\bar{p}$, $n_1\bar{q}$, $n_2\bar{p}$, and $n_2\bar{q}$ all have to be greater than 5
• 3. Determine the tail
• 4. Conclude
• 5. Interpret
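A minimal Python sketch of this test follows. The function name `two_prop_z` and its `tail` argument are assumptions made for this illustration; the check on $n\bar{p}$ and $n\bar{q}$ mirrors the reminder on the slide.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z(r1, n1, r2, n2, tail="two"):
    """Return (z, p_value) for H0: rho1 - rho2 = 0 using pooled estimates."""
    p1_hat = r1 / n1
    p2_hat = r2 / n2
    # Pooled estimates of the probability of success and of failure
    p_bar = (r1 + r2) / (n1 + n2)
    q_bar = 1.0 - p_bar
    # Normal approximation requirement from the slide
    if min(n1 * p_bar, n1 * q_bar, n2 * p_bar, n2 * q_bar) <= 5:
        raise ValueError("n*p_bar and n*q_bar must all be greater than 5")
    se = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat) / se
    cdf = NormalDist().cdf(z)  # P(Z <= z)
    if tail == "left":
        p_value = cdf
    elif tail == "right":
        p_value = 1.0 - cdf
    elif tail == "two":
        p_value = 2.0 * min(cdf, 1.0 - cdf)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value
```

Because the statistic is $\hat{p}_1 - \hat{p}_2$, a claim that group 2 has the larger proportion corresponds to `tail="left"`.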
Example: (when $\sigma$ is known)
• A consumer group is testing camp stoves. To test the heating
capacity of a stove, it measures the time required to bring 2 quarts
of water from 50°F to boiling. Two competing models are under
consideration. Ten stoves of the first model and 12 stoves of the
second model are tested. The following results are obtained:
• Model 1: $\bar{x}_1 = 11.4$ min; $\sigma_1 = 2.5$ min; $n_1 = 10$
• Model 2: $\bar{x}_2 = 9.9$ min; $\sigma_2 = 3.0$ min; $n_2 = 12$
• Assume that the time required to bring water to a boil is normally
distributed for each stove. Is there any difference between the
performances of these two models? Use a 5% level of significance.
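One way to work this example with the known-sigma steps is sketched below. The question asks about "any difference", so a two-tailed test is assumed; the table value is rounded.

```latex
\begin{align*}
H_0 &: \mu_1 - \mu_2 = 0, \qquad H_1 : \mu_1 - \mu_2 \neq 0, \qquad \alpha = 0.05 \\
z &= \frac{(11.4 - 9.9) - 0}{\sqrt{\dfrac{2.5^2}{10} + \dfrac{3.0^2}{12}}}
   = \frac{1.5}{\sqrt{0.625 + 0.75}} \approx 1.28 \\
P\text{-value} &= 2\,P(z > 1.28) \approx 2(0.1003) \approx 0.20
\end{align*}
```

Since the P-value of about 0.20 is greater than 0.05, do not reject $H_0$. At the 5% level, the data do not show a difference in mean boiling time between the two models.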
Group Work
• A teacher wishes to compare two teaching
methods. The first group consists of 49
students with a mean score of 74.8 points.
The second group has 50 students with a
mean score of 81.3 points. The teacher claims
that the second method will increase the
mean score on the exam. Is this claim justified
at the 5% level of significance? Earlier
research for the two methods indicates that
$\sigma_1 = 14$ points and $\sigma_2 = 15$ points.
Example: (when $\sigma$ is unknown)
• Two competing headache remedies claim to give fast-acting relief. An
experiment was performed to compare the mean lengths of time required
for bodily absorption of brand A and brand B headache remedies.
• 12 people were randomly selected and given brand A; another 12 were
randomly selected and given brand B. The lengths of time in minutes for
the drugs to reach a specified level in the blood were recorded.
• Brand A: $\bar{x}_1 = 21.8$ min; $s_1 = 8.7$ min; $n_1 = 12$
• Brand B: $\bar{x}_2 = 18.9$ min; $s_2 = 7.5$ min; $n_2 = 12$
• Past experience with the drug composition of the two remedies
permits researchers to assume that both distributions are normal. Let us
use a 5% level of significance to test the claim that there is no difference in
the mean time for bodily absorption. Find the P-value and evaluate the
two drugs.
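A sketch of the computation for this example, using the unknown-sigma steps with a two-tailed test and the conservative d.f. (the smaller of $n_1 - 1$ and $n_2 - 1$); the P-value interval is read from a standard t table.

```latex
\begin{align*}
H_0 &: \mu_1 - \mu_2 = 0, \qquad H_1 : \mu_1 - \mu_2 \neq 0, \qquad \alpha = 0.05 \\
t &= \frac{(21.8 - 18.9) - 0}{\sqrt{\dfrac{8.7^2}{12} + \dfrac{7.5^2}{12}}}
   = \frac{2.9}{\sqrt{6.3075 + 4.6875}} \approx 0.875 \\
\text{d.f.} &= \min(12 - 1,\ 12 - 1) = 11 \\
0.250 &< P\text{-value} < 0.500
\end{align*}
```

Since the P-value is greater than 0.05, do not reject $H_0$. At the 5% level, the data do not show a difference in mean absorption time between brand A and brand B.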
Group Work
• Suppose the experiment to measure the times in
minutes for the headache remedies to enter the
bloodstream yielded sample means, sample
standard deviations, and sample sizes as follows:
• Brand A: $\bar{x}_1 = 20.1$ min; $s_1 = 8.7$ min; $n_1 = 12$
• Brand B: $\bar{x}_2 = 13.4$ min; $s_2 = 7.6$ min; $n_2 = 8$
• Brand B claims to be faster. Is this claim justified
at the 1% level of significance?
Example: Difference of proportions
• CCHS wants to improve student involvement. One method under consideration is
to send reminders through texts to all students in the school to participate in
school events. As part of the pilot study to determine if this method will actually
improve student involvement, a random sample of 1250 students is taken and
divided into two groups:
• Group 1: 625 students. No reminder of school events. The number of participants
is 125.
• Group 2: 625 students. Reminders were sent through texts. The number of
participants is 501.
• ABS claims that participation was significantly greater in group 2. Use a 5% level
of significance to test the claim that the proportion of students who participate
is greater in group 2, the group that received texts.
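A sketch of the computation for this example, following the difference-of-proportions steps. The claim is that the proportion is greater in group 2, so a left-tailed test on $\rho_1 - \rho_2$ is assumed.

```latex
\begin{align*}
H_0 &: \rho_1 - \rho_2 = 0, \qquad H_1 : \rho_1 - \rho_2 < 0, \qquad \alpha = 0.05 \\
\hat{p}_1 &= \frac{125}{625} = 0.2, \qquad \hat{p}_2 = \frac{501}{625} = 0.8016 \\
\bar{p} &= \frac{125 + 501}{625 + 625} = 0.5008, \qquad \bar{q} = 1 - \bar{p} = 0.4992 \\
z &= \frac{0.2 - 0.8016}{\sqrt{\dfrac{(0.5008)(0.4992)}{625} + \dfrac{(0.5008)(0.4992)}{625}}}
   \approx \frac{-0.6016}{0.0283} \approx -21.27
\end{align*}
```

Here $n_1\bar{p}$, $n_1\bar{q}$, $n_2\bar{p}$, and $n_2\bar{q}$ are all far greater than 5, so the normal approximation applies. The P-value $P(z < -21.27)$ is essentially 0, which is less than 0.05, so reject $H_0$. The data support the claim that the proportion of students who participate is greater in the group that received texts.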
Group Work
• A sample of 1100 voters was randomly divided into two
groups.
• Group 1: 500 voters; no reminders sent; 200 voted
• Group 2: 600 voters; reminders sent; 330 voted
• Do the data support the claim that the proportion of
voters who voted was greater in the group that
received reminders than in the group that did not? Use
a 1% level of significance.
TI 83/TI 84
• Press STAT and select TESTS. Then use
2-SampZTest, 2-SampTTest, or 2-PropZTest, depending on the test.
Homework Practice
• Pg 470 #1-20 odd