Part VIII Tests of Significance

Chapter 26
Tests of Significance
Context
A test of significance deals with the question of whether an
observation can be reasonably explained by chance or not.
These tests are widely used, so it is important to understand
how they work.
New terminology: null hypothesis, alternative hypothesis,
test statistic, P-value
Examples from the news
Men get greater satisfaction than women from seeing someone
they dislike suffer pain, shows a study of how people react when
witnessing revenge. Scientists found highly significant
differences between the genders in how male and female
brains respond.

The director of British Swimming says: 'Our objective is to
achieve significantly more in London [in 2012] than Beijing'.
Further, he says that money could 'make a very significant
difference' to performance.

Exposure to asbestos fibres in homes and other buildings where
asbestos materials are present and in good condition is not
normally significantly different to that from background
exposure and is therefore not a cause for concern.
Example 1
A senator introduces a bill that simplifies the tax code. He claims
that the bill is 'revenue neutral'. This means that the total tax
revenues will stay the same. To check the claim, the Treasury
Department uses a computer file of 100,000 representative tax
returns. For a simple random sample of 100 returns, it
calculates
change = tax under new rules − tax under old rules
The sample average comes out as a change of -$219, with an
SD of $725.
There are two different explanations for the average change of
-$219.
(i) The senator is right, and the bill is indeed revenue neutral.
In other words, the true average change among the
100,000 returns is zero, and the observed average change
of -$219 (from the sample of 100) is due to chance error.
(ii) The senator is wrong, and the bill is not revenue neutral.
The difference is real, and tax revenues will go down under
the new bill.
The idea of a test of significance is the following. Assume that
(i) is true. Then the observed difference of -$219 is due to
chance variability. We can set up a box model:
Box: 100,000 tickets, each showing the change for that taxpayer.
Average of box: $0
SD of box: about $725 (estimated by the sample SD)
Under assumption (i), the average change is like the average of
100 draws from the box.
Expected value for the average: $0
SE for the sum: \( \sqrt{100} \times \$725 = \$7{,}250 \)
SE for the average: \( \$7{,}250 / 100 = \$72.50 \)
The observed average is -$219, or
\[
\frac{(-\$219) - (\$0)}{\$72.50} \approx -3
\]
in standard units. The probability histogram for the average of
the draws is very close to the normal curve.
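The box-model arithmetic for Example 1 can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the sample SD of $725 is used as an estimate of the SD of the box.

```python
import math

# Figures from Example 1; the variable names are illustrative assumptions.
n_draws = 100            # size of the simple random sample
observed_avg = -219.0    # sample average change, in dollars
expected_avg = 0.0       # average of the box under explanation (i)
box_sd = 725.0           # SD of the box, estimated by the sample SD

se_sum = math.sqrt(n_draws) * box_sd   # SE for the sum of the draws
se_avg = se_sum / n_draws              # SE for the average of the draws

# Observed average in standard units
z = (observed_avg - expected_avg) / se_avg

print(f"SE for the sum:     ${se_sum:,.2f}")
print(f"SE for the average: ${se_avg:,.2f}")
print(f"Standard units:     {z:.2f}")
```

The unrounded value is slightly beyond -3, which the slides round to -3.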
The area under the normal curve between -3 and 3 is about
99.7%. The chance in our example is equal to the area under
the curve to the left of -3. That area is approximately 0.15%.
So if explanation (i) is true, there is only about one chance in
1,000 of getting an average of -$219 or below. This is strong
evidence against explanation (i). If we stick to it, we need a small
miracle to explain the data.
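The tail area quoted above can be looked up numerically instead of in a normal table. The sketch below uses `NormalDist` from Python's standard library; taking z as exactly -3 is a rounding assumption carried over from the text.

```python
from statistics import NormalDist

z = -3.0  # observed average in standard units, rounded as in the text

standard_normal = NormalDist(mu=0.0, sigma=1.0)
left_tail = standard_normal.cdf(z)  # area to the left of z
middle = standard_normal.cdf(3.0) - standard_normal.cdf(-3.0)  # area between -3 and 3

print(f"Area between -3 and 3:  {middle:.2%}")
print(f"Area to the left of -3: {left_tail:.3%}")
```

The computed tail is a little smaller than 0.15% because 99.7% is itself a rounded figure; either way, the chance is roughly 1 in 1,000.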
The Null and the Alternative
Definition
In a test of significance, the null hypothesis expresses the idea
that an observed difference is due to chance. To make a test,
the null hypothesis has to be set up as a box model for the data.
The alternative hypothesis is another statement about the box;
it says that the difference is real.
In Example 1,
• null hypothesis: explanation (i); the average of the box
equals $0
• alternative hypothesis: explanation (ii); the average of the
box is less than $0
Test statistics
In the example, we computed how many SEs away the observed
value of the sample average was from its expected value:
\[
\frac{(-\$219) - (\$0)}{\$72.50} \approx -3
\]
This is an example of a test statistic.
Definition
A test statistic is used to measure the difference between the
data and what is expected under the null hypothesis.
The test statistic used in our example is usually called z:
\[
z = \frac{\text{observed} - \text{expected}}{\text{SE}}
\]
and tests using the z-statistic are called z-tests.
(There are many other tests.)
Definition
z says how many SEs away an observed value is from the
expected value, where the expected value is calculated based on
the null hypothesis.
P-values
The z-statistic was -3, and the area to the left of -3 under the
normal curve is about 0.15%, or roughly 1/1000. This is called an
observed significance level and is often denoted by P. We also
refer to this number as a P-value.
Definition
The observed significance level is the chance of getting a test
statistic as extreme as, or more extreme than, the observed one.
The chance is calculated on the basis that the null hypothesis is
true. The smaller this chance is, the stronger the evidence
against the null hypothesis.
How small should the observed significance level be before we
reject the null hypothesis? Many scientists draw the line at 5%
or 1%:
• If P is less than 5%, the result is called statistically
significant
• If P is less than 1%, the result is called highly significant
This formalizes our basic idea. When the data are too far from
the predictions of a theory, that means trouble. We should
reject the null hypothesis if the observed value is too many SEs
away from the expected value.
Important
A P-value larger than 5% does not prove that the null
hypothesis is true. It just means that there is not enough
evidence in the data to reject the null hypothesis.
It suggests that the null hypothesis might be true, but this
cannot be known for sure.
Setting up a test
(1) Define the null hypothesis and the alternative hypothesis
(2) Pick a test statistic
(3) Compute the observed significance level (= P-value)
(4) Make a decision
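The four steps can be sketched as a single function. This is a minimal illustration rather than a general-purpose routine: the function name `z_test_lower_tail` is made up here, it assumes a z-test for an average with an alternative that says the box average is below the null value, and it uses the conventional 5% and 1% lines for the decision.

```python
import math
from statistics import NormalDist


def z_test_lower_tail(observed_avg, null_avg, sd, n):
    """z-test for an average; the alternative says the box average is below null_avg."""
    # Steps 1 and 2: the null hypothesis fixes the expected value; z is the test statistic
    se_avg = sd / math.sqrt(n)
    z = (observed_avg - null_avg) / se_avg
    # Step 3: P-value = area under the normal curve to the left of z
    p_value = NormalDist().cdf(z)
    # Step 4: decision, using the conventional 5% and 1% lines
    if p_value < 0.01:
        verdict = "highly significant"
    elif p_value < 0.05:
        verdict = "statistically significant"
    else:
        verdict = "not significant"
    return z, p_value, verdict


# Numbers from the tax example: average -$219, SD $725, 100 returns
z, p_value, verdict = z_test_lower_tail(observed_avg=-219, null_avg=0, sd=725, n=100)
print(f"z = {z:.2f}, P = {p_value:.4f}, result: {verdict}")
```

For other alternatives (box average above the null value, or simply different from it), step 3 would use the right tail or both tails instead.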
Example 2
A biologist tells you that offspring of a new plant have a
75% chance of having blue flowers and a 25% chance of
having white flowers.
200 seeds were raised; 142 had blue flowers and 58 had
white flowers.
Do you believe the biologist?
Example 3: Interpreting the P-value
Which interpretation(s) is/are correct?
1 P-value = chance that the null hypothesis is true
2 P-value = chance that the null hypothesis is true, given that
we saw the value of the test statistic
3 P-value = chance of getting a test statistic as extreme as,
or more extreme than, the one we saw, given that the null
hypothesis is true
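For Example 2 (the flower counts), one way to set up the test is with a 0-1 box in which 75% of the tickets are 1 (blue) and 25% are 0 (white). The sketch below follows that assumption. The slides do not say whether the alternative is one-sided or two-sided, so both P-values are computed.

```python
import math
from statistics import NormalDist

n_seeds = 200
blue_observed = 142
p_blue_null = 0.75  # null hypothesis: each seed has a 75% chance of blue flowers

# 0-1 box: 75% of the tickets are 1 (blue), 25% are 0 (white)
box_sd = math.sqrt(p_blue_null * (1 - p_blue_null))
expected_blue = n_seeds * p_blue_null       # expected number of blue-flowered plants
se_blue = math.sqrt(n_seeds) * box_sd       # SE for the number of blue-flowered plants

z = (blue_observed - expected_blue) / se_blue
p_one_sided = NormalDist().cdf(z)             # chance of this few blue plants or fewer
p_two_sided = 2 * NormalDist().cdf(-abs(z))   # chance of a count this far off in either direction

print(f"expected = {expected_blue:.0f}, SE = {se_blue:.2f}, z = {z:.2f}")
print(f"one-sided P = {p_one_sided:.3f}, two-sided P = {p_two_sided:.3f}")
```

Under either choice the P-value is above 5%, so these data do not give significant evidence against the biologist's claim.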
Example 4
Two investigators are conducting a test on a box X:
Null hypothesis: avg of box = 50
Alt. hypothesis: avg of box < 50
Both use a z-test. The first investigator takes 100 tickets at
random from the box, the other does the same for 900 tickets.
Both get an SD of 10.
True or false: the investigator whose average is further away
from 50 will get the smaller P-value.
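One way to explore Example 4 is to try concrete numbers. The sample averages below (48 and 49) are hypothetical, chosen only for illustration, and the helper name `lower_tail_p` is likewise made up; the null average of 50 and the SD of 10 come from the example.

```python
import math
from statistics import NormalDist


def lower_tail_p(sample_avg, n, null_avg=50.0, sd=10.0):
    """Return (z, P) for a z-test whose alternative says the box average is below null_avg."""
    se_avg = sd / math.sqrt(n)
    z = (sample_avg - null_avg) / se_avg
    return z, NormalDist().cdf(z)


# Hypothetical sample averages, for illustration only
z_small, p_small = lower_tail_p(sample_avg=48.0, n=100)  # 2 below 50, SE = 1
z_large, p_large = lower_tail_p(sample_avg=49.0, n=900)  # 1 below 50, SE = 1/3

print(f"n = 100: z = {z_small:.2f}, P = {p_small:.4f}")
print(f"n = 900: z = {z_large:.2f}, P = {p_large:.4f}")
```

With these numbers the investigator whose average is further from 50 gets the larger P-value, because the SE for the average shrinks as the sample grows. What matters is the distance in SEs, not the distance in raw units.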
Example 5
A large course has 900 students broken into sections of 30
students. On the final, the average is 63, and the SD is 20. In
one section, however, the average is 55. The TA for that section
claims this is due to chance error. Is this a good defence?
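The TA's defence in Example 5 can be put to a z-test. The sketch below assumes the section behaves like 30 draws made at random from a box holding all 900 final scores, and it ignores the small correction for drawing without replacement; both are modelling assumptions, not statements from the slides.

```python
import math
from statistics import NormalDist

course_avg = 63.0    # average of the box (all 900 final scores)
course_sd = 20.0     # SD of the box
section_size = 30    # number of draws
section_avg = 55.0   # observed average for the section

se_avg = course_sd / math.sqrt(section_size)  # SE for the average of 30 draws
z = (section_avg - course_avg) / se_avg
p_value = NormalDist().cdf(z)                 # chance of an average this low or lower

print(f"SE = {se_avg:.2f}, z = {z:.2f}, P = {p_value:.3f}")
```

Under these assumptions the P-value falls below 5% but not below 1%, so chance error alone is not a convincing explanation for the low section average.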
Summary
A test of significance deals with the question of whether an
observation can be reasonably explained by chance or not.
The null hypothesis expresses the idea that an observed
difference is due to chance, while the alternative hypothesis says
that the difference is real.
A test statistic is used to measure the difference between the
data and what is expected under the null hypothesis. z-tests use
\[
z = \frac{\text{observed} - \text{expected}}{\text{SE}}
\]
The P-value, or observed significance level, is the chance of
getting a test statistic as extreme as, or more extreme than, the
observed one, calculated on the basis that the null hypothesis is
true. The smaller this chance is, the stronger the evidence against
the null hypothesis.
• If P is less than 5%, the result is called statistically
significant
• If P is less than 1%, the result is called highly significant
A P-value larger than 5% does not prove that the null
hypothesis is true. It suggests that the null hypothesis might be
true, but this cannot be known for sure.
This is why investigators usually proceed by rejecting the
opposite of what they want to prove.