S2 - Chapter 7 Hypothesis Testing

advertisement
S2 Chapter 7: Hypothesis Testing
Dr J Frost (jfrost@tiffin.kingston.sch.uk)
Last modified: 3rd October 2014
What is Hypothesis Testing?
Note: This first lesson will be mostly
note-taking, so pay attention!
To get a flavour of hypothesis testing, discuss how you would approach the following problem:
In 2013 in Richmond park, whenever I went on an hour long stroll, I saw on average
10 squirrels. I want to establish whether now in 2014, the rate of squirrels I see has
increased. I need to ensure any result I get is statistically significant.
1. Go on a stroll one day in 2014 and
count the number of squirrels I see.
Suppose I saw 15 squirrels.
2. If I were to assume that the rate at
which I see squirrels hasn’t
changed, I would calculate the
probability that I would see 15
squirrels or more (using a Poisson
Distribution).
3. If this probability of seeing at least
15 squirrels by chance is sufficiently
low (say less than 5%), I conclude
that the rate at which squirrels
appear has increased.
! A hypothesis is a statement made about the value of
a population parameter that we wish to test by
collecting evidence in the form of a sample.
(In this case, the population parameter is the rate 𝜆 I see squirrels in an
hour. There are two possible hypotheses:)
! The null hypothesis, 𝐻0 is the default position, i.e.
that nothing has changed, unless proven otherwise.
(In this case, it’s that the rate of seeing squirrels hasn’t changed,
and seeing 15 was just a fluke.)
! The alternative hypothesis, 𝐻1 , is that there has
been some change in the population parameter.
(i.e. Seeing 15 squirrels wasn’t a fluke – it was because the
average rate of squirrels had actually increased.)
! In a hypothesis test, the evidence from the sample
is a test statistic.
(In this case, we’ve taken a sample by counting squirrels, and found the
test statistic of “rate of squirrels across 1 hour was 15”)
! The level of significance 𝜶 is the maximum probability where we would reject the null hypothesis.
This is usually 5% or 1%.
What is Hypothesis Testing?
Hypothesis testing in a nutshell* then is:
1. We have some hypothesis we wish to see if true (average rate of
squirrels seen has increased), so…
2. We collect some sample data (giving us our test statistic) and…
3. If that data is sufficiently unlikely to have emerged ‘just by chance’,
then we conclude that our (alternate) hypothesis is correct.
* Squirrel pun intended.
!
Null Hypothesis and Alternative Hypothesis
John wants to see whether a coin is unbiased or whether it is biased towards coming
down heads. He tosses the coin 8 times and counts the number of times 𝑋, it lands
head uppermost. What values would lead to John’s hypothesis being rejected?
We said that our two hypotheses are about the population parameter.
Suppose 𝑝 is the probability of a coin landing heads.
Null hypothesis:
Alternative hypothesis:
𝐻0 : 𝑝 = 0.5 ?
𝐻1 : 𝑝 > 0.5
?
Critical Regions and Values
John wants to see whether a coin is unbiased or whether it is biased towards coming
down heads. He tosses the coin 8 times and counts the number of times 𝑋, it lands
head uppermost. What values would lead to John’s hypothesis being rejected?
As before, we’re interested how likely a given outcome is likely to happen ‘just by chance’
under the null hypothesis (i.e. when the coin is not biased).
The probability of getting exactly 5 heads is only 22%,
which is more likely to not happen than to happen. If
we saw this number of heads, why would it not be
sensible to think the coin is biased?
The probability is only low because there’s lots of
possible outcomes. But 5 heads forms part of a range
of possible number of heads that collectively would
be consistent with a coin not biased towards heads.
Prob under 𝐻0
However, there are values which
collectively form a range of
‘extreme values’ where it would
be unlikely that the coin would
be unbiased. Their combined
probability is limited by the level
of significance 𝛼
0
1
2
3 4
?
5
Num heads
6
7
8
Critical Regions and Values
John wants to see whether a coin is unbiased or whether it is biased towards coming
down heads. He tosses the coin 8 times and counts the number of times 𝑋, it lands
head uppermost. What values would lead to John’s hypothesis being rejected?
What’s the probability that we would see 6
heads, or an even more extreme value? Is this
sufficiently unlikely to support John’s claim that
the coin is biased?
𝑷 𝑿≥𝟔 =𝟏−𝑷 𝑿≤𝟓
= 𝟎. 𝟏𝟒𝟒𝟓
Insufficient evidence to reject null hypothesis.
?
What’s the probability that we would see 7
heads, or an even more extreme value?
𝑷 𝑿≥𝟕 =𝟏−𝑷 𝑿≤𝟔
= 𝟎. 𝟎𝟑𝟓𝟐
Since 0.0352 < 0.05, this is very unlikely, so we
reject the null hypothesis and accept the
alternative hypothesis that the coin is biased.
?
C.D.F. Binomial table:
𝑝 = 0.5, 𝑛 = 8
𝑥
𝑃(𝑋 ≤ 𝑥)
0
0.0039
1
0.0352
2
0.1445
3
0.3633
4
0.6367
5
0.8555
6
0.9648
7
0.9961
Critical Regions and Values
John wants to see whether a coin is unbiased or whether it is biased towards coming
down heads. He tosses the coin 8 times and counts the number of times 𝑋, it lands
head uppermost. What values would lead to John’s hypothesis being rejected?
! The critical region is the range of values
of the test statistic that would lead to you
rejecting 𝐻0
C.D.F. Binomial table:
𝑝 = 0.5, 𝑛 = 8
𝑥
If 𝛼 = 0.05, critical region?
We saw that 95% is exceeded when 𝑿 = 𝟔. This
means 𝑷 𝑿 ≥ 𝟕 < 𝟓% Bro Tip: Use the first value
𝑿≥𝟕
AFTER the one in the table
?
that exceeds 1 − 𝛼
! The value(s) on the boundary of the
critical region are called critical value(s).
Critical value:
𝑃(𝑋 ≤ 𝑥)
0
0.0039
1
0.0352
2
0.1445
3
0.3633
4
0.6367
5
0.8555
6
0.9648
7
0.9961
7
?
We’ll explore more fully critical values and regions later on…
Quickfire Critical Regions
Determine the critical region when we throw a coin where we’re trying to establish if
there’s the specified bias, given the specified number of throws, when 𝛼 = 0.05
Coin thrown 5 times.
Trying to establish if
biased towards
heads.
Coin thrown 10
times. Trying to
establish if biased
towards heads.
Coin thrown 10
times. Trying to
establish if biased
towards tails.
𝑝 = 0.5, 𝑛 = 10
𝑝 = 0.5, 𝑛 = 10
𝑝 = 0.5, 𝑛 = 5
𝑥
𝑃(𝑋 ≤ 𝑥)
𝑥
𝑃(𝑋 ≤ 𝑥)
𝑥
𝑃(𝑋 ≤ 𝑥)
0
0.0010
0
0.0010
1
0.0107
1
0.0107
0
0.0312
1
0.1875
2
0.0547
2
0.0547
2
0.5000
…
…
…
…
3
0.8125
7
0.9453
7
0.9453
8
0.9893
8
0.9893
9
0.9990
9
0.9990
4
0.9688
Critical region:
𝑋≥5
?
Critical region:
𝑋≥9
?
Critical region:
𝑋≤1
?
Note that if the
‘tail’ is to the left
we just use the
first value that
goes under 𝛼
One and Two-Tailed Tests
Our alternative hypothesis is stating something about the population parameter.
In the example of the coin, it’s that the coin is biased towards heads, i.e. that 𝑝 > 0.5.
What would our alternative hypothesis be however if we were interested in the coin just
being biased either way?
𝒑 ≠ 𝟎. 𝟓 as the coin could be biased to either heads (𝒑 > 𝟎. 𝟓) or to tails (𝒑 < 𝟎. 𝟓).
?
! One tailed test: Looking for an increase in
parameter 𝜃 or decrease in it. If null
hypothesis is 𝜃 = 𝑚 (for some value 𝑚),
either 𝐻1 : 𝜃 > 𝑚 or 𝐻1 : 𝜃 < 𝑚
! Single critical region and one critical value.
! Two tailed test: 𝐇𝟏 : 𝜽 ≠ 𝒎
i.e. could be change in either direction.
! Two parts to critical region and two critical values.
𝟏
! If 5% significance, convention is to split so 𝟐 %
𝟐
either tail.
Our critical
region
Prob under 𝐻0
Prob under 𝐻0
This is the distribution of 𝑋 when we
assume the null hypothesis (coin is fair)
0 1 2 3 4 5 6 7 8
0 1 2 3 4 5 6 7 8
Num heads
Num heads
Structure of Hypothesis Tests
! Summary of Hypothesis Test:
1. Identify population parameter 𝜃 (use proportion 𝑝 in Binomial or mean rate 𝜆 for
a Poisson) you are going to test.
2. Write 𝐻0 and 𝐻1 . Consider whether test is one or two tailed.
3. Specify significance level 𝛼 (usually 0.05 or 0.01).
4. Find out whether observed value 𝑥 of your test statistic falls in critical region.
One tail:
𝐻0 : 𝜃 = 𝑚 𝐻1 : 𝜃 > 𝑚 Reject 𝐻0 if 𝑃 𝑋 ≥ 𝑥 ≤ 𝛼 or…
𝐻0 : 𝜃 = 𝑚 𝐻1 : 𝜃 < 𝑚 Reject 𝐻0 if 𝑃 𝑋 ≤ 𝑥 ≤ 𝛼
Two tail:
1
1
𝐻0 : 𝜃 = 𝑚 𝐻1 : 𝜃 ≠ 𝑚 Reject if 𝑃 𝑋 ≥ 𝑥 ≤ 2 𝛼 or 𝑃 𝑋 ≤ 𝑥 ≤ 2 𝛼
5. State conclusion. Address:
a) Is result significant or not?
b) What are implications in terms of context of original problem?
Example
Accidents used to occur at a road junction at a rate of 6 per month. After a speed limit
is placed on the road the number of accidents in the following month is 2. The
planners wish to test, at the 5% level of significance, whether or not there has been a
decrease in the rate of accidents.
a) Suggest a suitable test statistic.
The number of accidents in a month.?
b) Write down the null and alternative hypothesis.
𝑯𝟎 : 𝝀 = 𝟔
𝑯𝟏 : 𝝀 < 𝟔
?
(Test your understanding:
𝜆 is the population parameter)
c) Explain the conditions under which the null hypothesis is rejected.
If the probability of getting 2 or fewer accidents ≤ 𝟓%, the null
?
hypothesis is rejected.
?
Exercise 7A
3
Shell wants to test to see whether a coin
is biased. She tosses the coin 100 times
and counts the number of times she gets
a head.
a) Describe the test statistic.
Counts of heads.
?
b) Write down null and alternative
hypotheses.
𝑯𝟎 : 𝒑 = 𝟎. 𝟓
𝑯𝟏 :?𝒑 ≠ 𝟎. 𝟓
5
In a survey it was found that 4 out of 10
people supported a certain particular
political party. Chang wishes to test
whether or not there has been a change
in the proportion (𝑝) of people
supporting the party. Suggest suitable
hypotheses.
𝑯𝟎 : 𝒑 = 𝟎. 𝟒
𝑯𝟏 : 𝒑?≠ 𝟎. 𝟒
7
A spinner has 4 sides numbered
1, 2, 3, 4. Hajdra thinks it is
biased to give a one when spun.
She spins 5 times and counts the
number of times, 𝑀, that she
gets a 1.
a) Describe the test statistic 𝑀.
The number of times she
?
gets a 1.
b) She decides to do a test with
a level of significance of 5%.
What values of 𝑀 would
cause the null hypothesis to
be rejected?
𝑷 𝑴 ≥ 𝟒 = 𝟏. 𝟔% < 𝟓%,
so 𝑴 is 4 or 5. ?
Carrying out Hypothesis Tests for Poisson/Binomial
Q The standard treatment for a particular disease has a 2 probability of success. A
5
certain doctor has undertaken research in this area and has produced a new drug
which has been successful with 11 out of 20 patients. The doctor claims that the
new drug represents an improvement on the standard treatment. With a
significance level of 5%, determine whether the doctor’s claim is correct.
From earlier:
1. Identify population parameter 𝜃 (use proportion
𝑝 in Binomial or mean rate 𝜆 for a Poisson) you
are going to test.
2. Write 𝐻0 and 𝐻1 . Consider whether test is one
or two tailed.
3. Find out whether observed value 𝑥 of your test
statistic falls in critical region.
4. State conclusion. Address:
a) Is result significant or not?
b) What are implications in terms of context
of original problem?
𝟐
𝑯𝟎 : 𝒑 =
𝟓
𝟐
𝑯𝟏 : 𝒑 >
𝟓
?
𝟐
𝑿~𝑩 𝟐𝟎,
𝟓
?
𝑷 𝑿 ≥ 𝟏𝟏 = 𝟏 − 𝑷 𝑿 ≤ 𝟏𝟎
= 𝟏𝟐. 𝟕𝟓%
𝟏𝟐. 𝟕𝟓% > 𝟓% so insufficient
? 𝑯𝟎.
evidence to reject
New drug is no better than old.
Test Your Understanding
Q1 Over a long period of time it has been found that in Enrico’s restaurant the ratio of
non-veg to veg meals is 2 to 1. In Manuel’s restaurant in a random sample of 10
people ordering meals, 1 ordered a vegetarian meal. Using a 5% level of
significance, test whether or not the proportion of people eating veg meals in
Manuel’s restaurant is different to that in Enrico’s restaurant.
𝟏
Proportion eating veg meals at Enrico’s is 𝟑
Let 𝒑 be the proportion of people at Manuel’s that order veg.
Let 𝑿 be number of people eating veg meals.
𝟏
𝑯𝟎 : 𝒑 =
𝟑
𝟏
If 𝑯𝟎 true then 𝑿~𝑩 𝟏𝟎, 𝟑
𝟏
𝑯𝟏 : 𝒑 ≠
𝟑
?
𝑷 𝑿≤𝟏 =𝑷 𝑿=𝟎 +𝑷 𝑿 =𝟏
= 𝟎. 𝟏𝟎𝟒 (𝟑𝒔𝒇)
𝟎. 𝟏𝟎𝟒 > 𝟎. 𝟎𝟐𝟓 therefore insufficient evidence to reject 𝑯𝟎 .
There is no evidence that proportion of veg meals at Manuel’s
restaurant is different to Enrico’s.
Half significance
as 2 tailed.
Conclusion and
what it means in
context.
Test Your Understanding
Q2 Accidents used to occur at a certain road
junction at a rate of 6 per month. The
residents petitioned for traffic lights. In
the month after the lights were installed
there was only 1 accident. Does this give
sufficient evidence that the lights have
reduced the number of accidents? Use a
5% level of significance.
Let 𝑿 be number of accidents this
month
𝑯𝟎 : 𝝀 = 𝟔,
𝑯𝟏 : 𝝀 < 𝟔
𝑿~𝑷𝒐 𝟔 under 𝑯𝟎 .
𝑷 𝑿 ≤ 𝟏 𝝀 =?
𝟔 = 𝟎. 𝟎𝟏𝟕𝟒
𝟎. 𝟎𝟏𝟕𝟒 < 𝟎. 𝟎𝟓
There is sufficient evidence to reject
𝑯𝟎 . Lights may have reduced number
of accidents.
Q3 Over a long period of time, Jessie found
that the bus taking her to school was
late at the rate of 6.7 times per month.
In the month following the start of the
new summer bus schedules, Jessie
finds that her bus is late twice.
Assuming that the number of times the
bus is late has a Poisson distribution,
test at the 1% level of significance,
whether or not the new schedules have
in fact decreased the number of times
the bus is late.
Let 𝑿 be the num times late. 𝑿~𝑷𝒐 𝝀
𝑯𝟎 : 𝝀 = 𝟔. 𝟕 𝑯𝟏 : 𝝀 < 𝟔. 𝟕
𝑷 𝑿 ≤ 𝟐 𝝀 = 𝟔. 𝟕 =
𝑷 𝑿= 𝟎 +𝑷 𝑿=𝟏 +𝑷 𝑿=𝟐
= 𝟎. 𝟎𝟑𝟕𝟏 𝟑𝒔𝒇
𝟎. 𝟎𝟑𝟕𝟏 > 𝟎. 𝟎𝟏 so insufficient evidence to
reject 𝑯𝟎 . New schedule has not decreased
number of times bus is late.
?
(Note: Poisson tables can be used for Q2 but not for Q3)
Exercise 7B
1
Carry out the test where 𝑋 represents the number of
successes.
𝐻0 : 𝑝 = 0.25, 𝐻1 : 𝑝 > 0.25; 𝑛 = 10, 𝑥 = 5, 𝛼 = 5%
0.0781 > 0.05. Insufficient evidence to reject 𝑯𝟎 .
?
9
Carry out the test using Poisson distribution.
𝐻0 : 𝜆 = 6.5, 𝐻1 : 𝜆 < 6.5; 𝑥 = 2, 𝛼 = 1%
0.0430 > 0.01. Insufficient evidence to reject 𝑯𝟎 .
?
11
The manufacturer of ‘Supergold’ margarine claims that
people prefer this to butter. As part of an advertising
campaign he asked 5 people to taste a sample of ‘Supergold’
and a sample of butter and say which they prefer. Four people
chose ‘Supergold’. Assess the manufacturer’s claim in the light
of this evidence. Use a 5% level of significance.
0.1875 > 0.05. Insufficient evidence to reject 𝑯𝟎 . Insufficient
evidence to suggest that people prefer Supergold to butter.
?
13
A die used in playing a board game is suspected of not giving
the number 6 often enough. During a particular game it was
rolled 12 times and only one 6 appeared. Does this represent
significant evidence, at the 5% leel of significance that the
1
probability of a 6 on this die is less than ?
6
0.3815 > 0.05. Insufficient evidence to reject 𝑯𝟎 . No
evidence that die is biased.
?
15
Every year a stats teacher takes his
class out to observe the traffic
passing the school gates during a
Tuesday lunch hour. Over the years
she has established that the average
number of lorries passing the gates in
a lunch hour. Over the years she has
established that the average number
of lorries passing the gates in a lunch
hour is 7.5. During the last 12 months
a new bypass has been built and the
number of lorries passing the school
gates in this year’s experiment was 4.
Test, at the 5% level of significance,
whether or not the mean number of
lorries passing the gates during a
Tuesday lunch hour has been
reduced.
0.1321 > 0.05
Insufficient evidence to reject 𝑯𝟎 .
No evidence of a decrease in the
number of lorries passing the gates
in a lunch hour.
?
Critical Regions with One and Two Tailed Tests
There are two options for carrying out hypothesis tests:
• Calculate the probability of the outcome “or worse” and see if this is lower than the
significance level (as we just did).
• Calculate the critical regions and see if the test statistic lies within it.
We earlier touched upon finding critical value(s), i.e. those at which (and beyond which) we
reject the null hypothesis.
John wants to see whether a coin is biased. He tosses the coin 8 times and counts the
number of times 𝑋, it lands head uppermost. Determine, to a 5% level of significance,
the critical region.
C.D.F. Binomial table:
𝑝 = 0.5, 𝑛 = 8
𝑃(𝑋 ≤ 𝑥)
0
0.0039
1
0.0352
2
0.1445
3
0.3633
4
0.6367
5
0.8555
6
0.9648
7
0.9961
Prob under 𝐻0
𝑥
But recall the earlier tip: Use
the value AFTER the first one
that exceed 1 − 𝛼
𝑿~𝑩 𝟖, 𝟎. 𝟓
𝑷 𝑿 ≤ 𝟎 = 𝟎. 𝟎𝟎𝟑𝟗 ≤ 𝟎. 𝟎𝟐𝟓
For right-tail, if in doubt, try a few
values around 0.975.
𝑷 𝑿≥𝟕 =𝟏−𝑷 𝑿≤𝟔
= 𝟎. 𝟎𝟑𝟓𝟐 > 𝟎. 𝟎𝟐𝟓
𝑷 𝑿≥𝟖 =𝟏−𝑷 𝑿≤𝟕
= 𝟎. 𝟎𝟎𝟑𝟗 < 𝟎. 𝟎𝟐𝟓
So critical region is:
8
𝑿 = 𝟎 𝒐𝒓 𝑿 = 𝟖
?
0 1 2 3 4 5 6 7
Num heads
Quickfire Critical Regions
Determine the critical region when we throw a coin where we’re trying to establish if
there’s the specified bias, given the specified number of throws, when 𝛼 = 0.05
Coin thrown 5 times.
Trying to establish if
biased towards
heads.
Coin thrown 10
times. Trying to
establish if biased
towards heads.
Coin thrown 10
times. Trying to
establish if biased
towards tails.
𝑝 = 0.5, 𝑛 = 10
𝑝 = 0.5, 𝑛 = 10
𝑝 = 0.5, 𝑛 = 5
𝑥
𝑃(𝑋 ≤ 𝑥)
𝑥
𝑃(𝑋 ≤ 𝑥)
𝑥
𝑃(𝑋 ≤ 𝑥)
0
0.0010
0
0.0010
1
0.0107
1
0.0107
0
0.0312
1
0.1875
2
0.0547
2
0.0547
2
0.5000
…
…
…
…
3
0.8125
7
0.9453
7
0.9453
4
0.9688
8
0.9893
8
0.9893
9
0.9990
9
0.9990
Critical region:
𝑋≥5
?
Critical region:
𝑋≥9
?
Critical region:
𝑋≤1
?
Actual Level of Significance
We have seen a level of significance 𝛼 that the probability of being a particular value
“or more extreme” needs to be below.
𝛼 = 0.025
0.015
0.013
𝑋=0
𝑋=1
Now suppose we instead opted on a policy of
choosing the closest to 2.5% rather than the
first under it…
𝛼 = 0.975
0.3
0.4
0.252
0.02
𝑋=2
𝑋=3
𝑋=4
𝑋=5
Although we exceeded 2.5% at the left tail,
because there was some ‘spare’ here, the actual
level of significance is 1.5% + 1.3% + 2% = 4.8%,
which is within the level of significance.
This is because our values are discrete and
therefore are unlikely to be able to get ‘exactly’
5%. At each tail we had to be within 2.5%.
In an exam, they will say “find the value as close to
0.025 as possible.”
! The actual level of significance of the test is the probability of rejecting 𝐻0 .
So here, the level of significance is 5%, but the actual level of significance is:
1.5% + 2% = 𝟑. 𝟓%
?
Quickfire Critical Regions (Closest Value)
Only if specifically instructed to find the closest to 2.5% for each tail:
• For the left tail, just find the closest value in the table.
• Find the right tail, use the value just AFTER the closest one.
𝑝 = 0.3, 𝑛 = 9
𝑝 = 0.35, 𝑛 = 10
𝑥
𝑃(𝑋 ≤ 𝑥)
𝑥
𝑃(𝑋 ≤ 𝑥)
0
0.0135
0
0.0404
1
0.0860
1
0.1960
2
0.2616
2
0.4628
…
…
…
…
6
0.9740
5
0.9747
7
0.9952
6
0.9957
8
0.9995
7
0.9996
9
1.0000
8
1.0000
Critical region:
𝑋 = 0 𝑜𝑟 𝑋 ≥ 7
?
(whereas before it would
have been 𝑋 ≥ 8)
Critical region:
𝑋 ≤ 1 𝑜𝑟 𝑋 ≥ 6
?
(whereas before it would
have been 𝑋 ≥ 7)
Test Your Understanding
May 2013 Q6
𝑿~𝑩 𝟐𝟎, 𝟎. 𝟐𝟓
𝑷 𝑿 ≤ 𝟏 = 𝟎. 𝟎𝟐𝟒𝟑 ≤ 𝟎. 𝟎𝟐𝟓
𝑷 𝑿 ≥ 𝟏𝟎 = 𝟏 − 𝑷 𝑿 ≤ 𝟗 = 𝟎. 𝟎𝟏𝟑𝟗
Critical region is 𝟎 ≤ 𝑿 ≤ 𝟏
𝒂𝒏𝒅 𝟏𝟎 ≤ 𝑿 ≤ 𝟐𝟎
M1
A1
A1
A1 A1
?
As per tip, use the value in
your table AFTER the first to
exceed 0.975.
The 0 ≤ and ≤ 20 are optional because
they are implied by the question.
One More Example
Q
An office finds that over a long time incoming telephone calls from customers occur
at a rate of 0.325 per minute. They believe that the number of calls has increased
recently. To test this, the number of incoming calls during a random 20-minute
interval is recorded.
Find the critical region for a two-tailed test of the hypothesis that the number of
incoming calls occur at the rate of 0.325 per minute. The probability in both tails
should be as close to 2.5% as possible.
𝜆 = 6.5
𝑿~𝑷𝒐 𝟔. 𝟓
𝑯𝟎 : 𝝀 = 𝟔. 𝟓 𝑯𝟏 : 𝝀 ≠ 𝟔. 𝟓
𝑷 𝑿 ≤ 𝟏 = 𝟎. 𝟎𝟏𝟏𝟑?(close than 0.0430)
𝑷 𝑿 ≥ 𝟏𝟐 = 𝟏 − 𝑷 𝑿 ≤ 𝟏𝟏 = 𝟎. 𝟎𝟑𝟑𝟗
So critical region 𝑿 ≤ 𝟏 and 𝑿 ≥ 𝟏𝟐
𝑥
𝑃(𝑋 ≤ 𝑥)
0
0.0015
1
0.0113
2
0.0430
…
…
11
0.9661
12
0.9840
13
0.9929
14
0.9970
Using Approximating Distributions
Recall that we often approximate Poisson and Binomial Distributions as Normal
Distributions (particularly as we can’t use tables for large 𝑛 or 𝜆!)
We can also approximate a Binomial as a Poisson if 𝑛𝑝 < 10 (refer to your flow chart)
Q1
A shop sells grass mowers at the rate of 10 per week. In an attempt to increase
sales, the price was reduced for a six-week period. During this period a total of 75
mowers were sold.
Using a 5% level of significance, test whether or not there is evidence that the
average number of sales per week has increased during this six-week period.
𝑯𝟎 : 𝝀 = 𝟏𝟎,
𝑯𝟏 : 𝝀 > 𝟏𝟎
𝒀~𝑷𝒐 𝟔𝟎
𝑷 𝒀 ≥ 𝟕𝟓 ≈ 𝑷 𝑾 > 𝟕𝟒. 𝟓 where 𝑾~𝑵 𝟔𝟎, 𝟔𝟎
𝟕𝟒. 𝟓 − 𝟔𝟎
? = 𝒁 > 𝟏. 𝟖𝟕 = 𝟎. 𝟎𝟑𝟎𝟕
≈𝑷 𝒁>
𝟔𝟎
𝟎. 𝟎𝟑𝟎𝟕 < 𝟎. 𝟎𝟓 therefore reject 𝑯𝟎 .
There is evidence that the sales per week have increased.
Bro Tip: Don’t forget
your continuity
correction! (recall:
you make the range
0.5 wider)
Identifying Critical Regions
Q2
During an influenza epidemic, 4% of the population of a large city was affected on a
given day. The manager of a factory that employs 100 people found that 12 of his
employees were absent, claiming to have influenza.
Using a 5% level of significance, find the critical region that would enable the
manage to test whether there is evidence that percentage of people having
influenza at his factory was greater than that of the city, and state your conclusion.
𝑯𝟎 : 𝒑 = 𝟎. 𝟎𝟒
𝑯𝟏 : 𝒑 > 𝟎. 𝟎𝟒
𝑿~𝑩 𝟏𝟎𝟎, 𝟎. 𝟎𝟒
𝒏𝒑 = 𝟒 < 𝟏𝟎 (and 𝒑 is not close to 0.5) so use
Poisson approximation.
So 𝑿 ≈ 𝑷𝒐 𝟒
Using Poisson tables:
𝑷 𝑿 ≥ 𝟗 = 𝟏 − 𝑷 𝑿 ≤ ?𝟖 = 𝟎. 𝟎𝟐𝟏𝟒
So critical region 𝑿 ≥ 𝟗
Since 𝟏𝟐 > 𝟗 the manager concludes the
percentage of people having influenza is larger
than the city.
𝜆=4
𝑥
𝑃(𝑋 ≤ 𝑥)
7
0.9489
8
0.9786
9
0.9919
Test Your Understanding
May 2013 (R) Q6
Notice this is the usual
stricter “less than 𝛼”
rather than “closest to 𝛼”.
?
?
?
Exercise 7C
Assuming a binomial distribution, find critical
region for the test statistic 𝑋.
Q1
Q3
Q5
Q7
𝐻0 : 𝑝 = 0.20, 𝐻1 : 𝑝 > 0.20; 𝑛 = 10, 𝛼 = 5%
𝑿≥𝟓
𝐻0 : 𝑝 = 0.40, 𝐻1 : 𝑝 ≠ 0.40; 𝑛 = 20, 𝛼 = 5%
𝑿 ≤ 𝟑 𝒂𝒏𝒅 𝑿 ≥ 𝟏𝟑
𝐻0 : 𝑝 = 0.63, 𝐻1 : 𝑝 > 0.63; 𝑛 = 10, 𝛼 = 5%
𝑿 = 𝟏𝟎
?
?
?
Find critical region given 𝑋~𝑃𝑜 𝜆
𝐻0 : 𝜆 = 4, 𝐻1 : 𝜆 > 4; 𝛼 = 5%
𝑿≥𝟗
?
Q10
Q13
A seed merchant usually kept her stock in
carefully monitored conditions. After the
Christmas holidays one year she discovered that
the monitoring system had broken down and
there was a danger that the seed might have
been damaged by frost. She decided to check a
sample of 10 seeds to see if the proportion 𝑝 that
germinates had been reduced from the usual
value of 0.85. Find the critical region for a onetailed test using a 5% level of significance.
𝑿≤𝟔
?
An estate agent usually sells properties at
the rate of 10 per week. During a recession,
when money was less available, over an
eight-week period he sold 55 properties.
Using a suitable approximation test, at the
5% level of significance, whether or not
there is evidence that the weekly rate of
sales decreased.
There is evidence that the rate of weekly
sales has decreased.
?
Q14
A manager thinks that 20% of his workforce
are absent for at least one day each month.
He chooses 100 workers at random and finds
that in the last month 2 had been absent for
at least one day.
Using a suitable approximation test, at the
5% level of significance, whether or not
there is evidence the weekly rate of sales
decreased.
Reject 𝑯𝟎 . There is evidence that the
percentage of workers who are absent for at
least 1 day per month is less than 20%.
?
If you’re done, continue onto
mixed exercises (7D).
Download