Notes Mar 2, 2003

What does the hypothesis testing method do? It uses data from a sample to judge
whether or not a statement about a population may be true.
11.1 Formulating Hypothesis Statements
Many of the questions that researchers ask can be expressed as questions about which of
two statements might be true for a population.
For example:
1) Do female students study, on average, more than male students do?
2) Does a new drug have smaller side effects than the old one does?
3) Do smokers tend to drink more as well?
4) Is the proportion of the male students who have at least one tattoo different from
the proportion of the female students who have at least one tattoo?
All these questions can be answered with a “yes” or “no”, and each possible answer is a
specific statement about a situation. For instance, for question 2), we can break the
question into two competing hypotheses:
a. The new drug has smaller side effects than the old one does.
b. The new drug does not have smaller side effects than the old one does.
In statistics, the two possible answers are called the null hypothesis and the alternative
hypothesis.
The null hypothesis, represented by the symbol H0, is a statement that there is nothing
happening. Generally, we hope to disprove or reject the null hypothesis.
The alternative hypothesis, represented by the symbol Ha, is a statement that
something is happening. In most situations, we hope to prove the alternative hypothesis
is right.
For instance, in example 2), the null hypothesis, H0, is “the new drug does not have
smaller side effects than the old one does”. This is the statement that “nothing is
happening”.
The alternative hypothesis, Ha, is “the new drug does have smaller side effects than the
old one does”. This is the statement that “something is happening”.
Question 1: Write down the null and alternative hypotheses for examples 1), 3), and 4).
As we might notice, examples 1), 2), and 3) have alternative hypotheses that test whether
something is greater (or smaller) than something else. These alternative hypotheses include
values in one direction only (either greater or smaller, but not both). We call this kind
of test a one-sided or one-tailed hypothesis test.
Example 4) asks whether one proportion is different from the other. As long as it differs
from the other proportion, it can be either greater or smaller than the other one. An
alternative hypothesis like this includes values in either direction from a specific
standard. We call such a test a two-sided or two-tailed hypothesis test.
The logic of hypothesis testing: what if the null is true? In hypothesis testing, we
always assume that the null hypothesis is a possible truth until the sample data
conclusively demonstrate otherwise.
11.3 Deciding Between the Two Hypotheses
Two terms we should pay attention to:
1. A test statistic is the data summary that we use to evaluate the two hypotheses.
2. The p-value describes how likely it is that we would have observed what we
did, or something even more extreme, if the null hypothesis were true.
NOTE: This is the second time we have met the notion of a “p-value”. The first time, we
introduced the p-value to decide whether the relationship between two variables is
statistically significant. This time, the p-value is computed by assuming the null
hypothesis is true and then determining the probability of a result as extreme as (or more
extreme than) the observed test statistic in the direction of the alternative hypothesis.
In general, a test statistic is simply a summary that compares the sample data to the null
hypothesis. The chi-square statistic we mentioned before is a special case of a test
statistic.
We also need to emphasize that a p-value does not tell us the probability that the null
hypothesis is true. Instead, it only tells us the probability that our test statistic would
be as extreme as it is, or more extreme, if we assume the null hypothesis is true.
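To make this definition concrete, here is a minimal Python sketch that computes a one-sided p-value exactly from the binomial distribution. The scenario (60 heads in 100 coin flips, with H0: p = 0.5 and Ha: p > 0.5) and the function name `upper_tail_p_value` are illustrative assumptions and do not come from these notes.

```python
from math import comb

def upper_tail_p_value(x, n, p0):
    """P(X >= x) when X ~ Binomial(n, p0), i.e. assuming H0 is true."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))

# Hypothetical data: 60 heads in 100 flips, H0: p = 0.5 vs Ha: p > 0.5.
p_value = upper_tail_p_value(60, 100, 0.5)
print(f"p-value = {p_value:.4f}")
```

The printed number is the probability of a count at least as large as the observed one when the null hypothesis holds. It is not the probability that the null hypothesis itself is true.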
In order to introduce the idea of rejecting the null hypothesis, we need two more terms
first:
1. Statistically significant is used to describe the result when the researcher has
decided that the p-value is small enough to decide in favor of the alternative
hypothesis.
2. Level of significance, also called the α level, is the borderline for deciding that
the p-value is small enough to justify choosing the alternative hypothesis. A
result is statistically significant when the p-value is less than the chosen level of
significance. Again, we usually choose 0.05 as our level of significance.
We can summarize the relationship between p-value and hypothesis testing using the
following table:
p-value small (usually less than 0.05)       | p-value not small (usually larger than 0.05)
Reject the null hypothesis, or equivalently, | Do not reject the null hypothesis.
we accept the alternative hypothesis.        |
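The decision rule in the table can be sketched in a few lines of Python. The function name `decide` and the two p-values passed to it are hypothetical, chosen only to show one case from each column.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule from the table at significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"

print(decide(0.01))  # a small p-value
print(decide(0.30))  # a p-value that is not small
```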
Question 2: For example 1), suppose we know that the average study time of male
students, μmale, is 13.5 hours per week. Let μfemale denote the average study time of
female students. Write down the two formal hypothesis statements. Moreover, assume
that the p-value we calculate is 0.12. Using a significance level of 0.05, what should our
conclusion be? Interpret the meaning of the p-value in this case.
11.4 Testing Hypotheses about a Proportion
NOTE: The formulas in this section are specific to testing proportions, but the basic steps
of hypothesis testing are the same in any setting.
Steps in any hypothesis test:
1. Determine the null and alternative hypotheses;
2. Summarize the data into an appropriate test statistic;
3. Assuming the null hypothesis is true, find the p-value;
4. Decide whether or not the result is statistically significant based on the p-value.
So far, we have discussed step 1 in previous sections. For step 2, when we have a
sufficiently large random sample, we can use a “z-test” to examine hypotheses about a
population proportion. The corresponding test statistic is called the z-statistic or z-value.
A “sufficiently large” random sample is one for which both np0 and n(1 – p0) are at least 10,
where p0 is the value of the population proportion specified in the null hypothesis.
Recall: the two sufficient conditions for using the normal approximation rule for the
distribution of a sample proportion.
Most software will provide the z-statistic.
How do we calculate the z-statistic by hand?
$$
z = \frac{\hat{p} - p_0}{\text{standard error}}
  = \frac{\text{sample estimate} - \text{null value}}{\text{standard error}},
\quad \text{with} \quad
\text{standard error} = \sqrt{\frac{p_0 (1 - p_0)}{n}}.
$$
1. p̂ represents the sample estimate of the proportion;
2. p0 represents the specific value in the null hypothesis;
3. n is the sample size.
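The formula can be written as a short Python function. This is only a sketch: the name `z_statistic` and the numbers used (45 successes in 100 trials, null value 0.5) are hypothetical and chosen so the arithmetic is easy to follow.

```python
from math import sqrt

def z_statistic(p_hat, p0, n):
    """z = (sample estimate - null value) / standard error under H0."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Hypothetical data: 45 successes in 100 trials, null value p0 = 0.5.
z = z_statistic(45 / 100, 0.5, 100)
print(round(z, 2))
```

Here the standard error is 0.05, so the sample proportion of 0.45 sits one standard error below the null value.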
In order to use the z-test, we must make certain assumptions. They are:
1. The sample should be a random sample from the population.
2. The quantities np0 and n(1 – p0) should both be at least 10.
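The sample-size assumption can be checked mechanically, as in the sketch below. The second quantity is taken to be n(1 – p0), the expected number of failures under the null hypothesis. The function name `large_enough_for_z_test` and the example inputs are hypothetical. Whether the sample is random cannot be checked in code; it depends on how the data were collected.

```python
def large_enough_for_z_test(n, p0, minimum=10):
    """True when both n*p0 and n*(1 - p0) reach the minimum expected count."""
    return n * p0 >= minimum and n * (1 - p0) >= minimum

# Hypothetical cases with null value p0 = 0.5.
print(large_enough_for_z_test(100, 0.5))  # expected counts are 50 and 50
print(large_enough_for_z_test(12, 0.5))   # expected counts are 6 and 6
```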
Example: Suppose the present success rate in the treatment of a particular psychiatric
disorder is 0.65 (65%). A research group hopes to demonstrate that the success rate of a
new treatment will be better than this standard. Suppose we look at 200 patients and
find that 140 have success with the new treatment.
Example Minitab Output
Test and CI for One Proportion
Test of p = 0.65 vs p > 0.65

Sample    X    N  Sample p  95.0% Lower Bound  Z-Value  P-Value
1       140  200  0.700000           0.646701     1.48    0.069
Using the 4 steps we mentioned previously, we have:
a. H0: p = 0.65 vs. Ha: p > 0.65;
b. A z-test is appropriate to use in this case (why?). The z-statistic for our
problem is 1.48 (how do you compute it?).
c. The p-value is 0.069 from the Minitab output.
d. Since the p-value, 0.069, is greater than 0.05, the result for the new
treatment is not statistically significant at the 0.05 significance level. Hence, we
cannot reject the null hypothesis, and cannot conclude that the new
treatment is more effective.
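The z-value and p-value in the Minitab output can be checked by hand. The following Python sketch walks through the calculation for this example; it assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0, alpha = 140, 200, 0.65, 0.05    # data and null value from the example

p_hat = x / n                              # sample proportion
standard_error = sqrt(p0 * (1 - p0) / n)   # computed under H0
z = (p_hat - p0) / standard_error          # z-statistic
p_value = 1 - NormalDist().cdf(z)          # upper tail, since Ha: p > 0.65

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The printed values should agree with the Z-Value and P-Value columns of the Minitab output. Because the alternative is one-sided (p > 0.65), only the upper tail of the standard normal distribution is used.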
Question 3: Decide if you can use a z-test in the following cases. If yes, determine the z-statistic as well.
1. In order to test whether the proportion of the students who own a PC in PSU is
0.6, we draw a random sample of 100 students from PSU. We then find 58 of
them have their own PCs.
2. In order to test whether the proportion of the students who own a PC in PSU is
0.6, we draw a random sample of 20 students from PSU. We then find 13 of
them have their own PCs.
3. In order to test whether the proportion of the students who own a PC in PSU is
0.6, we use a stat 200 class as a sample (assume they have 50 people in their
class). We then find out 31 of them have their own PCs.