Making Decisions for the Difference between Two Dependent Population Means

Question: How do I know if the measurements have some dependence (or lack of independence) structure?

Answer: Ask yourself the following question: Do the observations from the 1st group have any significant relationship with the observations from the 2nd group?

Group 1          Group 2
Observation 1    Observation 1
Observation 2    Observation 2
:                :
Observation n    Observation n

If your answer is NO, then use the Difference between Two Independent Population Means hypothesis test. Otherwise, use the Difference between Two Dependent Population Means hypothesis test.

Question: How do we work with Two Dependent Population Means?

Answer: Somehow make them independent. We can remove the dependence structure by subtracting each observation in the 2nd group from its paired observation in the 1st group.

Group 1          Group 2          Group 1 - Group 2
Observation 1    Observation 1    Difference 1
Observation 2    Observation 2    Difference 2
:                :                :
Observation n    Observation n    Difference n

Now, realize that Difference 1 is independent of Difference 2, which is independent of Difference 3, etc. Essentially, we have n independent observations, and the parameter of interest is the mean difference, μ_Difference.

Data File: Blood.JMP

Background: These data come from a study of the effects of a drug called Captopril on both the systolic and diastolic blood pressures of patients one half hour after taking the drug.

Variables:
Syspre: the initial systolic blood pressure of the patient
Syspost: the systolic blood pressure 30 minutes after taking the drug
Diapre: the initial diastolic blood pressure of the patient
Diapost: the diastolic blood pressure 30 minutes after taking the drug

Goal: To be able to complete (and interpret the output of) a test of differences for two groups that are dependent on each other.

Question of Interest: It is believed that the drug Captopril decreases both systolic (sys) and diastolic (dia) blood pressure readings.
Determine whether or not the drug decreases the sys reading.

Putting this in a statement involving parameters:

First, realize these two groups are dependent on each other because the pre and post blood pressure readings are taken on the same person. As a result, the analysis is done 'within' or 'by' individuals.

Obtaining the Differences:

To calculate a column of differences, we need to add a new column. Double-click to the right of the last column and name the column sysdiff. Then double-click at the top of the column to obtain the column info window and select Formula from the New Property pull-down menu. Then click Edit Formula, which will open the JMP calculator, and enter the formula:

syspre - syspost

To do this, first click on the variable syspre in the list that appears in the top left corner of the calculator window; this enters that variable into the calculator window. Next, select the minus sign from the buttons located in the middle of the window, and finally add the variable syspost to the expression by double-clicking on it in the variable list. Your expression should now look like the one above. When you are finished, click OK.

Intuitive Decision

In order to decide between the null and alternative hypotheses, you should first review the summary statistics for the differences. Remember, all summary statistics describe only the observations you sampled. In order to make decisions about all observations of interest, we must apply some inferential technique (i.e. hypothesis tests or confidence intervals).

Recall, to get summary statistics for a numerical variable, select Analyze > Distribution. The variable to summarize is the column of differences just created, sysdiff.

Assumptions

1. The observations, i.e. the paired differences, should follow a normal distribution. To check this assumption, use a normal quantile plot, which can be obtained by checking 'Normal Quantile Plot' in the sysdiff pull-down menu.

Do you think normality is being satisfied? Explain.
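Outside JMP, the same steps (forming the differences, summarizing them, and building the points of a normal quantile plot) can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the readings below are made-up illustrative values, not the Blood.JMP data, and the plotting position (i - 0.5) / n is one common convention that may differ from the one JMP uses.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical readings for illustration only; these are NOT the Blood.JMP data.
syspre = [150, 162, 171, 148, 180, 165, 158, 174]
syspost = [141, 158, 150, 149, 160, 152, 147, 159]

# One difference per patient, in the same order as the formula: syspre - syspost.
sysdiff = [pre - post for pre, post in zip(syspre, syspost)]

# Summary statistics for the sampled differences.
n = len(sysdiff)
diff_mean = mean(sysdiff)
diff_sd = stdev(sysdiff)  # sample standard deviation (n - 1 in the denominator)


def normal_quantile_points(values):
    """Pair each sorted value with a standard normal quantile.

    Uses the plotting position (i - 0.5) / n, which is an assumption here.
    """
    ordered = sorted(values)
    count = len(ordered)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((i - 0.5) / count), x)
        for i, x in enumerate(ordered, start=1)
    ]


points = normal_quantile_points(sysdiff)

print("differences:", sysdiff)
print("n =", n, " mean =", diff_mean, " sd =", round(diff_sd, 3))
for z, x in points:
    print(f"normal quantile {z:6.3f}   difference {x}")
```

If the differences are roughly normal, the printed (quantile, difference) pairs fall close to a straight line when plotted, which is the same judgment the normal quantile plot asks for.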
Performing the test

To perform the paired t-test using these differences, select Test Mean ... from the sysdiff pull-down menu located at the top of the resulting window, entering 0 for the hypothesized value of the mean. After clicking OK the following output will be obtained.

There are three p-values reported for each test along with the value of the t-test statistic. The p-values are for two-tailed, upper-tailed, and lower-tailed tests, respectively.

What type of test do we have here? Explain.

What is the appropriate p-value?

What is our decision for the test?

Write a conclusion for your findings.

From the Moments box, we see that the likely range for the average difference between pre and post sys blood pressure is (13.93, 23.93). That is, a 95% confidence interval for the mean of Pre Sys - Post Sys is (13.93, 23.93).

Interpret the meaning of this interval. Does this agree with what you found above using the hypothesis test? Explain.

A nonparametric alternative

Similar to the procedure for testing a single mean, if we believe that the distribution of differences is not normal, but is symmetric, then a nonparametric test may be appropriate. To obtain a nonparametric test, we check the box labeled Wilcoxon Signed-Rank Test in the Test Mean ... window. The results of this test are shown below.

The column labeled Signed-Rank contains the results of the Wilcoxon Signed-Rank test. As we can see, the p-values associated with both the two-tailed and upper-tailed tests are zero to three decimal places, providing strong evidence that the systolic blood pressures have decreased.
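To see what the Test Mean output is computing, the paired t statistic, a 95% confidence interval, and an exact upper-tailed Wilcoxon signed-rank p-value can be sketched in Python with the standard library alone. This is a sketch under assumptions, not a reproduction of JMP's output: the differences are hypothetical values (not the Blood.JMP data), the t critical values for 7 degrees of freedom are copied from a standard t table, and the signed-rank helper assumes no zero differences and no tied absolute differences.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev

# Hypothetical differences (pre - post); NOT the Blood.JMP data.
sysdiff = [9, 4, 21, -1, 20, 13, 11, 15]
n = len(sysdiff)

# Paired t-test = one-sample t-test on the differences, hypothesized mean 0.
diff_mean = mean(sysdiff)
std_err = stdev(sysdiff) / sqrt(n)
t_stat = (diff_mean - 0) / std_err

# Critical values for n - 1 = 7 degrees of freedom, from a standard t table.
T_CRIT_UPPER_05 = 1.895      # upper-tailed test at the 0.05 level
T_CRIT_TWO_SIDED_05 = 2.365  # two-sided 95% confidence interval

# Upper-tailed decision: a decrease in pressure makes pre - post positive.
reject_upper = t_stat > T_CRIT_UPPER_05
ci_95 = (
    diff_mean - T_CRIT_TWO_SIDED_05 * std_err,
    diff_mean + T_CRIT_TWO_SIDED_05 * std_err,
)


def signed_rank_statistic(diffs):
    """Sum the ranks of |d| over the positive d (assumes no zeros, no ties)."""
    ordered = sorted(diffs, key=abs)
    return sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)


def exact_upper_p(w_plus, count):
    """P(W+ >= w_plus) under H0, enumerating all 2**count sign patterns."""
    hits = 0
    for signs in product((0, 1), repeat=count):
        total = sum(rank for rank, s in enumerate(signs, start=1) if s)
        if total >= w_plus:
            hits += 1
    return hits / 2 ** count


w_plus = signed_rank_statistic(sysdiff)
p_signed_rank = exact_upper_p(w_plus, n)

print("t statistic:", round(t_stat, 3), " reject H0 (upper-tailed):", reject_upper)
print("95% CI for the mean difference:", tuple(round(v, 2) for v in ci_95))
print("signed-rank W+:", w_plus, " exact upper-tailed p:", p_signed_rank)
```

The enumeration in exact_upper_p is only practical for small samples; it is used here because it makes the null distribution of the signed-rank statistic explicit.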