Comparing Two Population Means Using Dependent Samples

Making Decisions for the Difference between
Two Dependent Population Means
Question:
How do I know if the measurements have some dependence (or lack of independence)
structure?
Answer:
Ask yourself the following question: Do the observations from the 1st group have any
significant relationship with the observations from the 2nd group?
Group 1          Group 2
Observation 1    Observation 1
Observation 2    Observation 2
:                :
Observation n    Observation n
If your answer is NO, then
Use the Difference between Two Independent Population Means hypothesis test.
Otherwise,
Use the Difference between Two Dependent Population Means hypothesis test.
Question:
How do we work with Two Dependent Population Means?
Answer:
Reduce each pair to a single value. We can remove the dependence structure by
subtracting each observation in the 2nd group from its paired observation in the 1st group.
Group 1          Group 2          Group 1 - Group 2
Observation 1    Observation 1    Difference 1
Observation 2    Observation 2    Difference 2
:                :                :
Observation n    Observation n    Difference n
Now, realize that Difference 1 is independent of Difference 2, which is independent of
Difference 3, etc., because each difference comes from a different subject. Essentially,
we have n independent observations, and the parameter of interest is the population mean
of the differences, μ_Difference.
Data File: Blood.JMP
Background: These data come from a study of the effects of a drug called Captopril on
both the systolic and diastolic blood pressures of patients one half hour after taking
the drug.
Variables:
Syspre: the initial systolic blood pressure of the patient
Syspost: the systolic blood pressure 30 minutes after taking the drug
Diapre: the initial diastolic blood pressure of the patient
Diapost: the diastolic blood pressure 30 minutes after taking the drug
Goal: To be able to carry out, and interpret the output of, a test of differences for
two groups that are dependent on each other.
Question of Interest:
It is believed that the drug Captopril decreases both systolic and diastolic blood
pressure readings. Determine whether or not the drug decreases the systolic reading.
Put this in a statement involving parameters, that is, state the null and alternative
hypotheses in terms of the mean difference.
First, realize these two groups are dependent on each other because both blood pressure
readings are taken on each person. As a result, the analysis is done 'within' or 'by'
individuals.
Obtaining the Differences:
To calculate a column of differences, we need to add a new column. Double-click to the
right of the last column and name the column sysdiff. Then double-click at the top of the
column to open the column info window and select Formula from the New Property
pull-down menu. Then click Edit Formula, which opens the JMP calculator, and enter the
formula syspre - syspost.
To do this, first click on the variable syspre in the list at the top left corner of the
calculator window, which enters that variable into the expression. Next, select the minus
sign from the buttons in the middle of the window, and finally add the variable syspost
to the expression by double-clicking on it in the variable list. Your expression should
now read syspre - syspost. When you are finished, click OK.
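For readers working outside JMP, the same column of differences can be sketched in a few
lines of Python. The readings below are hypothetical values made up for illustration;
they are not the Blood.JMP data. The variable names simply mirror the JMP column names.

```python
# Hypothetical paired readings (NOT the Blood.JMP values):
# position i in each list belongs to the same patient.
syspre = [150, 142, 160, 155, 138, 147, 165, 152]
syspost = [140, 138, 148, 150, 130, 141, 151, 155]

# Subtract within each pair (pre - post), exactly as the sysdiff formula does.
sysdiff = [pre - post for pre, post in zip(syspre, syspost)]

print(sysdiff)
```

With the difference taken as pre minus post, a positive value means the reading went
down after the drug.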
Intuitive Decision
To get an initial sense of whether the data favor the null or the alternative hypothesis,
first review the summary statistics for the differences. Remember, summary statistics
describe only the observations you sampled. In order to make decisions about all
observations of interest, we must apply some inferential technique (i.e. hypothesis tests
or confidence intervals). Recall, to get summary statistics for a numerical variable,
select Analyze > Distribution. The variable to summarize is the differences just created,
sysdiff.
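As a rough companion to the JMP summary, the basic summary statistics for a column of
differences can be computed with Python's standard library. The differences below are
hypothetical illustration values, not the Blood.JMP data.

```python
import math
import statistics

# Hypothetical differences (pre - post), NOT the Blood.JMP values.
sysdiff = [10, 4, 12, 5, 8, 6, 14, -3]

n = len(sysdiff)
mean_diff = statistics.mean(sysdiff)   # sample mean of the differences
sd_diff = statistics.stdev(sysdiff)    # sample standard deviation (n - 1 denominator)
se_diff = sd_diff / math.sqrt(n)       # standard error of the mean difference

print(f"n = {n}, mean = {mean_diff:.2f}, sd = {sd_diff:.2f}, se = {se_diff:.2f}")
```

A sample mean well above zero, relative to its standard error, is the intuitive signal
that readings tended to drop, but it describes only the sample.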
Assumptions
1. The observations, i.e. the paired differences, should follow a normal distribution.
To check this assumption, use a normal quantile plot which can be obtained by checking
'Normal Quantile Plot' from the sysdiff pull-down menu.
Do you think normality is being satisfied? Explain.
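To see what a normal quantile plot is built from, the sketch below pairs each sorted
difference with a theoretical standard-normal quantile. The data are hypothetical, and
the plotting positions (i - 0.5) / n are one common convention assumed here; JMP's exact
choice may differ.

```python
from statistics import NormalDist

# Hypothetical differences (pre - post), NOT the Blood.JMP values.
sysdiff = [10, 4, 12, 5, 8, 6, 14, -3]

n = len(sysdiff)
ordered = sorted(sysdiff)

# Theoretical standard-normal quantiles at plotting positions (i - 0.5) / n.
# Assumed convention; other software may use slightly different positions.
theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each printed row is one point of the plot: (normal quantile, observed difference).
for z, d in zip(theoretical, ordered):
    print(f"{z:6.2f}  {d:4d}")
```

If the points fall close to a straight line when plotted, the normality assumption is
reasonable.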
Performing the test
To perform the paired t-test using these differences, select Test Mean ... from the
sysdiff pull-down menu located at the top of the resulting window, entering 0 for the
hypothesized value of the mean. After clicking OK, JMP displays the test results.
Three p-values are reported for each test, along with the value of the t-test statistic.
The p-values are for the two-tailed, upper-tailed, and lower-tailed tests, respectively.
What type of test do we have here? Explain.
What is the appropriate p-value?
What is our decision for the test?
Write a conclusion for your findings.
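The sketch below shows where the t statistic and the three p-values come from, using
hypothetical differences rather than the Blood.JMP data. The helper functions t_pdf and
t_upper_tail are illustrative names introduced here; they approximate the tail area of
the t distribution by numerical integration so that the example needs only the standard
library. In practice a statistics package (for example scipy.stats.ttest_rel) would
normally be used.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3      # P(0 < T < |t|)
    tail = 0.5 - area         # P(T > |t|), by symmetry about zero
    return tail if t >= 0 else 1 - tail

# Hypothetical differences (pre - post), NOT the Blood.JMP values.
sysdiff = [10, 4, 12, 5, 8, 6, 14, -3]

n = len(sysdiff)
df = n - 1
hypothesized = 0
se = statistics.stdev(sysdiff) / math.sqrt(n)
t_stat = (statistics.mean(sysdiff) - hypothesized) / se

p_upper = t_upper_tail(t_stat, df)     # evidence that the mean difference is > 0
p_lower = 1 - p_upper                  # evidence that the mean difference is < 0
p_two = 2 * min(p_upper, p_lower)      # evidence that the mean difference is != 0

print(f"t = {t_stat:.3f}, df = {df}")
print(f"two-tailed: {p_two:.4f}  upper: {p_upper:.4f}  lower: {p_lower:.4f}")
```

Which of the three p-values applies depends on the direction of the alternative
hypothesis and on the order of subtraction used to form the differences.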
From the Moments box, we see that the likely range for the average difference between
pre and post sys blood pressure is (13.93, 23.93). That is, the 95% confidence interval
for Pre Sys - Post Sys is (13.93, 23.93). Interpret the meaning of this interval. Does
this agree with what you found above using the hypothesis test? Explain.
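A 95% confidence interval of this kind is, assuming the usual t-based interval, the
sample mean difference plus or minus a t critical value times the standard error. The
sketch below works this out for hypothetical differences (not the Blood.JMP data). The
critical value 2.3646 is the tabled 97.5th percentile of the t distribution with 7
degrees of freedom, which matches the 8 hypothetical differences used here; a different
sample size needs a different table value.

```python
import math
import statistics

# Hypothetical differences (pre - post), NOT the Blood.JMP values.
sysdiff = [10, 4, 12, 5, 8, 6, 14, -3]

n = len(sysdiff)
mean_diff = statistics.mean(sysdiff)
se_diff = statistics.stdev(sysdiff) / math.sqrt(n)

# Tabled t critical value for 95% confidence with n - 1 = 7 degrees of freedom.
T_CRIT_DF7 = 2.3646

margin = T_CRIT_DF7 * se_diff
lower, upper = mean_diff - margin, mean_diff + margin

print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

An interval that lies entirely above zero points the same way as a small upper-tailed
p-value: the mean reading before the drug exceeds the mean reading after it.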
A nonparametric alternative
Similar to the procedure for testing a single mean, if we believe that the distribution of
differences is not normal, but is symmetric, then a nonparametric test may be
appropriate. To obtain a nonparametric test, we check the box labeled Wilcoxon
Signed-Rank Test in the Test Mean ... window.
The column labeled Signed-Rank contains the results of the Wilcoxon Signed-Rank test.
The p-values associated with both the two-tailed and upper-tailed tests are zero to three
decimal places, providing strong evidence that the systolic blood pressures have
decreased.
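To illustrate the idea behind the signed-rank test, the sketch below ranks the absolute
differences, sums the ranks of the positive and negative differences, and computes an
exact upper-tailed p-value by enumerating every possible assignment of signs. The data
are hypothetical (not the Blood.JMP values), and the statistic shown is the plain sum of
positive ranks; JMP may report a centered or otherwise rescaled version of it.

```python
from itertools import product

# Hypothetical differences (pre - post), NOT the Blood.JMP values.
sysdiff = [10, 4, 12, 5, 8, 6, 14, -3]

# Zero differences carry no sign information and are dropped.
nonzero = [d for d in sysdiff if d != 0]
abs_sorted = sorted(abs(d) for d in nonzero)

def avg_rank(value):
    """Rank of an absolute difference, averaging ranks across ties."""
    first = abs_sorted.index(value) + 1    # 1-based rank of the first occurrence
    count = abs_sorted.count(value)
    return first + (count - 1) / 2

ranks = [avg_rank(abs(d)) for d in nonzero]
w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)

# Under the null hypothesis each rank is equally likely to be + or -.
# Count the sign assignments whose positive-rank sum is at least the observed one.
count_ge = sum(
    1
    for signs in product((0, 1), repeat=len(ranks))
    if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
)
p_upper = count_ge / 2 ** len(ranks)

print(f"W+ = {w_plus}, W- = {w_minus}, upper-tailed p = {p_upper:.4f}")
```

Full enumeration is only practical for small samples; statistical software switches to
an approximation as n grows.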