Section 7.1 - Winona State University

STAT 110: Section 7.1 – Comparing Two Population Means Using
Dependent Samples
May 2012
In this section, we will investigate methods for comparing two or more population
means. Specifically, we will discuss the following:
• Comparing two means with dependent samples (Section 7.1)
• Comparing two means with independent samples (Section 7.2)
7.1 - COMPARING TWO POPULATION MEANS: DEPENDENT SAMPLES
First, we will consider methods that allow us to make comparisons on numerical
variables between two different groups. The hypothesis testing procedures presented in
this section should be used when the groups being compared are NOT INDEPENDENT
of each other. Whether two groups are independent or dependent is usually
determined by how the data are collected. Consider the following example.
Example 7.1: The degree of clinical agreement among different physicians on the
presence or absence of generalized lymphadenopathy was assessed in 32 randomly
selected participants from a prospective study of male sexual contacts of men with
AIDS or an AIDS-related condition. The total number of palpable lymph nodes was
assessed by two physicians, and interest lies in comparing the two physicians. The
data are given in the file LymphNodes.JMP. A portion of the data is shown below.
Comment: For these data, the first observation listed under Doctor A IS directly related
to the first observation listed under Doctor B because the two measurements are being
made on the same patient. We have randomly selected patients; therefore, once we have
chosen Patient 1 to be assessed by Doctor A, the same patient will be assessed by Doctor
B. Thus, these two samples are dependent. If each observation from
population 1 is directly related to one observation from population 2, then we have
dependent samples.
Could this experiment be conducted in such a way that the samples are independent?
If so, how?
What are the disadvantages of this approach?
Back to Example 7.1
Research Question: Is there statistical evidence that these two physicians disagree in
their assessments of generalized lymphadenopathy?
To answer this question, the testing procedure works with the DIFFERENCES and not
the actual measurements. We do this to remove the structure of dependence between
the two groups. In JMP, we create an additional column (by double-clicking on the
empty column next to “Doctor B”) and title it “Difference”. Right-click on the new
column header and select Formula.
In the edit window, tell JMP to calculate the difference as follows:
Click Apply and then OK, and JMP returns the following:
Now, the parameter of interest is the true population AVERAGE OF THE
DIFFERENCES, which we will represent by $\mu_{\text{difference}}$.
• Our best estimate for this parameter is the sample mean of the observed
differences. We’ll call this quantity $\bar{x}_{\text{difference}}$.
• The sample standard deviation of the differences will be denoted by $s_{\text{difference}}$.
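For readers working outside JMP, the same quantities can be sketched in Python using only the standard library. The counts below are made-up illustrative values, not the contents of LymphNodes.JMP, and the names `doctor_a`, `doctor_b`, and `differences` are assumptions introduced for this sketch.

```python
import statistics

# Hypothetical node counts for five patients (NOT the LymphNodes.JMP data).
# Position i in both lists refers to the same patient, which is what makes
# the two samples dependent.
doctor_a = [7, 5, 9, 4, 8]
doctor_b = [5, 4, 6, 4, 5]

# One difference per patient: Doctor A minus Doctor B.
differences = [a - b for a, b in zip(doctor_a, doctor_b)]

n = len(differences)                        # number of pairs
x_bar_diff = statistics.mean(differences)   # sample mean of the differences
s_diff = statistics.stdev(differences)      # sample standard deviation (n - 1 divisor)

print(differences)
print(n, x_bar_diff, round(s_diff, 4))
```

The order of subtraction is a choice; reversing it flips the sign of every difference and of the sample mean, but leaves the standard deviation unchanged.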
Comment: Note that these differences represent a single column of data. Therefore, the
testing procedure is the SAME as the procedure for testing a single population mean we
covered in Section 6.1.
The Procedure for Comparing Two Population Means (with Dependent Samples)
Step 0:
Check the assumptions behind the test to be sure that the test is valid. For this
particular hypothesis test, we must check the following:
a. Either the number of pairs is sufficiently large, OR
b. It is reasonable to assume the DIFFERENCES are normally distributed.
Step 1:
Convert the research question into H0 and Ha.
Two-Tailed Test:
H0:
Ha:
Upper-Tailed Test:
H0:
Ha:
Lower-Tailed Test:
H0:
Ha:
Step 2:
Determine α, the level of significance.
Step 3:
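Calculate a test statistic from your data. For this test, the test statistic is
$$t = \frac{\bar{x}_{\text{difference}} - \mu_{\text{difference}}}{s_{\text{difference}} / \sqrt{n}}$$
where n is the number of pairs and $\mu_{\text{difference}}$ is the value stated in H0.
(Note: this is the same test statistic used in Section 6.)
Step 4:
Determine the p-value and make a decision concerning H0.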
Decision Rule: If the p-value is less than α, then the data are said to support the
alternative hypothesis. That is, we reject H0 in favor of Ha.
Step 5:
Write a conclusion in terms of the original research question. You should
state your p-value in your conclusion.
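Steps 2 through 4 can be sketched in Python with the standard library alone. This is a minimal sketch, not JMP's implementation: the function names (`t_pdf`, `t_upper_tail`, `paired_t_test`), the `null_value` and `tail` parameters, and the sample numbers are all assumptions made for illustration. The p-value comes from the t distribution with n − 1 degrees of freedom, approximated here by numerically integrating its density.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # P(0 < T < |t|)
    tail = max(0.5 - area, 0.0)          # P(T > |t|)
    return tail if t >= 0 else 1.0 - tail


def paired_t_test(first, second, null_value=0.0, tail="two"):
    """Return (t, df, p_value); differences are computed as first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t = (statistics.mean(diffs) - null_value) / standard_error
    df = n - 1
    if tail == "upper":
        p = t_upper_tail(t, df)
    elif tail == "lower":
        p = 1.0 - t_upper_tail(t, df)
    else:                                # two-tailed
        p = 2.0 * t_upper_tail(abs(t), df)
    return t, df, p


# Hypothetical paired measurements (illustrative only).
first = [7, 5, 9, 4, 8]
second = [5, 4, 6, 4, 5]
alpha = 0.05                             # Step 2: level of significance
t, df, p = paired_t_test(first, second)  # Steps 3 and 4
print(f"t = {t:.3f}, df = {df}, p-value = {p:.4f}")
print("Reject H0" if p < alpha else "Fail to reject H0")
```

Passing `tail="upper"` or `tail="lower"` gives the one-tailed versions, and `null_value` allows a hypothesized mean difference other than 0.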
Back to Example 7.1: Carry out the hypothesis test to determine whether there is
statistical evidence that these two physicians disagree in their assessment of
generalized lymphadenopathy.
When setting up hypotheses, let µ1 and µ2 represent the true mean number of
palpable lymph nodes found by Doctor A and Doctor B, respectively.
Step 0:
Check the assumptions behind the test to be sure that the test is valid.
Is the number of pairs sufficiently large? If not, is it reasonable to
assume the differences are normally distributed?
Select Analyze > Distribution and move Difference to the Y,
Columns box.
Step 1:
Convert the research question into H0 and Ha.
H0:
Ha:
Step 2:
Determine α, the level of significance.
Step 3:
Calculate a test statistic from your data.
Verify the test statistic using our formula:
$$t = \frac{\bar{x}_{\text{difference}} - \mu_{\text{difference}}}{s_{\text{difference}} / \sqrt{n}} =$$
Step 4:
Determine the p-value and make a decision concerning H0.
p-value =
Decision:
Step 5:
Write a conclusion in terms of the original research question.
“We have evidence that these two physicians disagree in their
assessments of generalized lymphadenopathy (p-value < .0001).”
Next, obtain the 95% confidence interval for the difference in means:
Questions:
1. Do the doctors agree in the number of palpable lymph nodes found?
2. Which doctor tends to find more palpable lymph nodes?
3. To what degree do the doctors differ?
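For reference, a confidence interval for a mean of differences has the standard one-sample form shown here. The critical value $t^{*}_{n-1}$ (taken from the t distribution with n − 1 degrees of freedom) is notation introduced for this sketch and does not appear elsewhere in the handout.

$$\bar{x}_{\text{difference}} \pm t^{*}_{n-1} \, \frac{s_{\text{difference}}}{\sqrt{n}}$$

With the 32 pairs in Example 7.1, the degrees of freedom are 31. A 95% interval that excludes 0 points the same way as a two-tailed test that rejects H0 at α = 0.05, and the sign of the interval depends on the order of subtraction used to build the Difference column.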
Example 7.2: The data in the file Captopril.JMP give the systolic and diastolic blood pressures for 15
patients with moderate essential hypertension, immediately before and two hours
after taking a drug, captopril. Our interest is in investigating the response to the
drug treatment. In particular, researchers would like to determine whether the systolic
blood pressure decreases by more than 10 mmHg and whether the diastolic blood pressure
decreases by more than 5 mmHg after taking captopril.
Let µ1 and µ2 represent the population means of the blood pressures before and after
taking the drug, respectively.
Question: Are these samples dependent or independent? Explain.
The research question is whether or not the average blood pressure is lower after
taking the drug. Using JMP, carry out the formal hypothesis test to answer this
question.
Step 0:
Check the assumptions behind the test to be sure that the test is valid.
Is the number of pairs sufficiently large? If not, is it reasonable to
assume the differences in systolic and diastolic blood pressures are
normally distributed?
Systolic Only
Step 1:
Convert the research question into H0 and Ha.
H0:
Ha:
Step 2:
Determine α, the level of significance.
Step 3:
Calculate a test statistic from your data.
Step 4:
Determine the p-value and make a decision concerning H0.
p-value =
Decision:
Step 5:
Write a conclusion in terms of the original research question.
Finally, construct a 95% confidence interval for the difference in systolic blood
pressure means:
Questions:
1. Interpret this confidence interval.
2. Does this interval agree with the results of the hypothesis test? Explain.
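Outside JMP, a confidence interval for a mean of paired differences can be sketched in Python as follows. This is an illustration under stated assumptions, not JMP's method: the readings are invented (they are not the Captopril.JMP data), the helper names `t_cdf`, `t_critical`, and `mean_difference_ci` are hypothetical, differences are taken as before minus after, and the critical value is found by bisection on a numerically integrated t distribution.

```python
import math
import statistics


def t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t, via Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                  # P(0 < T < |t|)
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Find t* with P(-t* < T < t*) = confidence by bisection."""
    target = 1 - (1 - confidence) / 2     # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_difference_ci(before, after, confidence=0.95):
    """Confidence interval for the mean of the (before - after) differences."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    margin = t_critical(confidence, n - 1) * statistics.stdev(diffs) / math.sqrt(n)
    centre = statistics.mean(diffs)
    return centre - margin, centre + margin


# Hypothetical readings in mmHg (NOT the Captopril.JMP data).
before = [150, 162, 148, 170, 155, 160]
after = [138, 150, 141, 152, 147, 149]
low, high = mean_difference_ci(before, after)
print(f"95% CI for mean decrease: ({low:.2f}, {high:.2f})")
```

With 15 pairs, as in Example 7.2, the degrees of freedom would be 14. Comparing the endpoints of the interval with a threshold such as 10 mmHg is one way to judge whether the data support a decrease larger than that amount.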
Example 7.2 (cont’d): Diastolic blood pressure
For diastolic blood pressure we have the following results.
H0:
Ha:
p-value =
Conclusion:
Interpret the confidence interval for the mean difference in diastolic blood pressure.
Is this a contradiction of our conclusion from the hypothesis test? Explain.