Part VIII: Tests of Significance

Chapter 26: Tests of Significance

Context

A test of significance deals with the question of whether an observation can reasonably be explained by chance. These tests are widely used, so it is important to understand how they work.

New terminology: null hypothesis, alternative hypothesis, test statistic, P-value

Examples from the news

Men get greater satisfaction than women from seeing someone they dislike suffer pain, shows a study of how people react when witnessing revenge. Scientists found highly significant differences between the genders in how male and female brains respond.

The director of British Swimming says: 'Our objective is to achieve significantly more in London [in 2012] than Beijing'. Further, he says that money could 'make a very significant difference' to performance.

Exposure to asbestos fibres in homes and other buildings where asbestos materials are present and in good condition is not normally significantly different to that from background exposure and is therefore not a cause for concern.

Example 1

A senator introduces a bill that simplifies the tax code. He claims that the bill is 'revenue neutral'. This means that the total tax revenues will stay the same. To check the claim, the Treasury Department uses a computer file of 100,000 representative tax returns. For a simple random sample of 100 returns, it calculates

    change = tax under new rules − tax under old rules

The sample average comes out as a change of -$219, with an SD of $725.
Example 1 (continued)

There are two different explanations for the average change of -$219.

(i) The senator is right, and the bill is indeed revenue neutral. In other words, the true average change among the 100,000 returns is zero, and the observed average change of -$219 (from the sample of 100) is due to chance error.

(ii) The senator is wrong, and the bill is not revenue neutral. The difference is real, and tax revenues will go down under the new bill.

The idea of a test of significance is the following. Assume that (i) is true. Then the observed difference of -$219 is due to chance variability. We can set up a box model: 100,000 tickets, each showing the change for that taxpayer.

    Average of box: $0
    SD of box: $725 (estimated by the SD of the sample)

Under assumption (i), the average change is like the average of 100 draws from the box.

    Expected value for the average: $0
    SE for the sum: √100 × $725 = $7,250
    SE for the average: $7,250 / 100 = $72.50

The observed average is -$219, or

    ((−$219) − ($0)) / $72.50 ≈ −3

in standard units.

The probability histogram for the average of the draws is very close to the normal curve. The area under the normal curve between -3 and 3 is about 99.7%. The chance in our example is equal to the area under the curve to the left of -3. That area is approximately 0.15%.

So if explanation (i) is true, there is only about one chance in 1,000 of getting an average of -$219 or below. This is strong evidence against explanation (i). If we stick to it, we need a small miracle to explain the data.
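The arithmetic of Example 1 can be checked with a few lines of Python. This is only a sketch: the helper `normal_cdf` and the variable names are illustrative and not part of the slides, and the SD of the box is taken to be the sample SD of $725.

```python
import math


def normal_cdf(x: float) -> float:
    """Area under the standard normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


n = 100                # number of returns in the sample
box_avg = 0.0          # average of the box under explanation (i)
box_sd = 725.0         # SD of the box, estimated by the sample SD (assumption)
observed_avg = -219.0  # observed sample average

se_sum = math.sqrt(n) * box_sd         # SE for the sum of the draws
se_avg = se_sum / n                    # SE for the average of the draws
z = (observed_avg - box_avg) / se_avg  # observed average in standard units
p_left = normal_cdf(z)                 # area to the left of z

print(f"SE for the sum:     ${se_sum:,.2f}")
print(f"SE for the average: ${se_avg:,.2f}")
print(f"standard units:     {z:.2f}")
print(f"area to the left:   {p_left:.2%}")
```

Using the unrounded value of about -3.02 instead of -3 gives an area slightly below 0.15%, which agrees with the rough figure of one chance in 1,000.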
The Null and the Alternative

Definition
In a test of significance, the null hypothesis expresses the idea that an observed difference is due to chance. To make a test, the null hypothesis has to be set up as a box model for the data. The alternative hypothesis is another statement about the box; it says that the difference is real.

In Example 1,
    null hypothesis: explanation (i); the average of the box equals $0
    alternative hypothesis: explanation (ii); the average of the box is less than $0

Test statistics

In the example, we computed how many SEs away the observed value of the sample average was from its expected value:

    ((−$219) − ($0)) / $72.50 ≈ −3

This is an example of a test statistic.

Definition
A test statistic is used to measure the difference between the data and what is expected under the null hypothesis.

The test statistic used in our example is usually called z:

    z = (observed − expected) / SE

Tests using the z-statistic are called z-tests. (There are many other tests.)

Definition
z says how many SEs away an observed value is from the expected value, where the expected value is calculated based on the null hypothesis.

P-values

The z-statistic was -3, and the area to the left of -3 under the normal curve is about 0.15%, or roughly 1/1000. This is called the observed significance level and is often denoted by P. We also refer to this number as a P-value.

Definition
The observed significance level is the chance of getting a test statistic as extreme as, or more extreme than, the observed one. The chance is calculated on the basis that the null hypothesis is true. The smaller this chance is, the stronger the evidence against the null hypothesis.
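The definition of the P-value can also be illustrated by simulation: build a box for which the null hypothesis holds, draw from it many times, and count how often the sample average is as extreme as the observed one. The sketch below rests on assumptions that are not in the slides: the tickets are taken to follow a normal curve with average $0 and SD $725 (the real box of tax changes is unknown), draws are made with replacement, and the seed and number of repetitions are arbitrary.

```python
import random
import statistics

random.seed(26)  # arbitrary fixed seed so that reruns give the same result

N_DRAWS = 100          # size of each simulated sample
BOX_AVG = 0.0          # null hypothesis: the average of the box is $0
BOX_SD = 725.0         # assumed SD of the box
OBSERVED_AVG = -219.0  # sample average actually observed
REPETITIONS = 10_000   # number of simulated samples


def sample_average() -> float:
    """Average of N_DRAWS draws from the assumed (normal-shaped) box."""
    draws = [random.gauss(BOX_AVG, BOX_SD) for _ in range(N_DRAWS)]
    return statistics.fmean(draws)


# Count the samples whose average is as extreme as, or more extreme than,
# the observed one (here: at or below -$219).
hits = sum(sample_average() <= OBSERVED_AVG for _ in range(REPETITIONS))
simulated_p = hits / REPETITIONS

print(f"{hits} of {REPETITIONS} simulated averages were at or below "
      f"${OBSERVED_AVG:.0f}")
print(f"simulated P-value: {simulated_p:.2%}")
```

With only a handful of hits expected in 10,000 repetitions, the simulated value is noisy, but it should land in the same small range as the area found from the normal curve.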
P-values (continued)

How small should the observed significance level be before we reject the null hypothesis? Many scientists draw the line at 5% or 1%:

    If P is less than 5%, the result is called statistically significant.
    If P is less than 1%, the result is called highly significant.

This formalizes our basic idea. When the data are too far from the predictions of a theory, that means trouble. We had better reject the null hypothesis if the observed value is too many SEs away from the expected value.

Important

A P-value larger than 5% does not prove that the null hypothesis is true. It just means that there is not enough evidence in the data to reject the null hypothesis. It indicates that the null hypothesis might be true, but this cannot be known for sure.

Setting up a test
(1) Define null hypothesis and alternative hypothesis
(2) Pick a test statistic
(3) Compute the observed significance level (= P-value)
(4) Make a decision

Example 2

A biologist tells you that offspring of a new plant have a 75% chance of having blue flowers and a 25% chance of having white flowers. 200 seeds were raised, and 142 had blue flowers and 58 had white flowers.

Do you believe the biologist?

Example 3: Interpreting the P-value

Which interpretation(s) is/are correct?

(1) P-value = chance that the null hypothesis is true
(2) P-value = chance that the null hypothesis is true, given that we saw the value of the test statistic
(3) P-value = chance of getting a test statistic as extreme as, or more extreme than, the one we saw, given that the null hypothesis is true

Example 4

Two investigators are conducting a test on a box X:

    Null hypothesis: avg of box = 50
    Alt. hypothesis: avg of box < 50

Both use a z-test. The first investigator takes 100 tickets at random from the box, the other does the same for 900 tickets. Both get an SD of 10.

True or false: the investigator whose average is further away from 50 will get the smaller P-value.

Example 5

A large course has 900 students broken into sections of 30 students. On the final, the average is 63, and the SD is 20. In one section, however, the average is 55. The TA for that section claims this is due to chance error. Is this a good defence?
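One possible way to work through Example 2 with the four steps is sketched below. The modelling choices are assumptions, not something the slides prescribe: the null hypothesis is read as a box of 0-1 tickets with 75% ones (blue), the test statistic is the z-statistic for the number of blue flowers, and since the slides do not state an alternative, both a one-sided and a two-sided P-value are computed. The helper `normal_cdf` is illustrative.

```python
import math


def normal_cdf(x: float) -> float:
    """Area under the standard normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Step 1: null hypothesis -- each seed is like a draw from a 0-1 box
# with 75% tickets marked 1 (blue) and 25% marked 0 (white).
n_seeds = 200
observed_blue = 142
p_blue_null = 0.75

# Step 2: test statistic -- z for the number of blue flowers.
expected_blue = n_seeds * p_blue_null                # 150 under the null
box_sd = math.sqrt(p_blue_null * (1 - p_blue_null))  # SD of the 0-1 box
se_count = math.sqrt(n_seeds) * box_sd               # SE for the sum
z = (observed_blue - expected_blue) / se_count

# Step 3: observed significance level from the normal curve.
p_one_sided = normal_cdf(z)            # alternative: fewer than 75% blue
p_two_sided = 2 * normal_cdf(-abs(z))  # alternative: not 75% blue

# Step 4: the decision is left to the reader; print the evidence.
print(f"expected blue: {expected_blue:.0f}, SE: {se_count:.2f}")
print(f"z = {z:.2f}")
print(f"one-sided P = {p_one_sided:.1%}, two-sided P = {p_two_sided:.1%}")
```

Under these assumptions both P-values come out above 5%, so the data do not give enough evidence to reject the biologist's claim. As noted on the 'Important' slide, that is not the same as proving the claim true.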
Summary

A test of significance deals with the question of whether an observation can reasonably be explained by chance.

The null hypothesis expresses the idea that an observed difference is due to chance, while the alternative hypothesis says that the difference is real.

A test statistic is used to measure the difference between the data and what is expected under the null hypothesis. z-tests use

    z = (observed − expected) / SE

The P-value, or observed significance level, is the chance of getting a test statistic as extreme as, or more extreme than, the observed one. The smaller this chance is, the stronger the evidence against the null hypothesis.

    If P is less than 5%, the result is called statistically significant.
    If P is less than 1%, the result is called highly significant.

A P-value larger than 5% does not prove that the null hypothesis is true. It indicates that the null hypothesis might be true, but this cannot be known for sure. This is why investigators usually proceed by rejecting the opposite of what they want to prove.
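The summary can be condensed into a small helper. This is a minimal sketch, not a standard routine: the function names `z_test` and `describe`, the `tail` parameter, and the exact wording of the labels are all made up for illustration, and the P-value is read off the normal curve, which assumes the probability histogram is close to normal.

```python
import math


def normal_cdf(x: float) -> float:
    """Area under the standard normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def z_test(observed: float, expected: float, se: float,
           tail: str = "left") -> tuple[float, float]:
    """Return (z, P) for a one-sided z-test.

    tail="left"  : alternative says the true value is below `expected`
    tail="right" : alternative says the true value is above `expected`
    """
    z = (observed - expected) / se
    if tail == "left":
        p = normal_cdf(z)
    elif tail == "right":
        p = 1.0 - normal_cdf(z)
    else:
        raise ValueError("tail must be 'left' or 'right'")
    return z, p


def describe(p: float) -> str:
    """Label a P-value using the 5% and 1% lines from the summary."""
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "statistically significant"
    return "not statistically significant"


# Usage with the numbers from Example 1 (tax bill).
z, p = z_test(observed=-219.0, expected=0.0, se=72.50, tail="left")
print(f"z = {z:.2f}, P = {p:.2%}: {describe(p)}")
```

A result below 1% is also below 5%; `describe` simply reports the stronger of the two labels.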