Physicians’ Health Study – Regular Aspirin Intake v. Myocardial Infarction

The Physicians’ Health Study Research Group at Harvard Medical School conducted a 5-year randomized study to test whether regular aspirin intake reduces the likelihood of mortality from cardiovascular disease. There were 22,071 physicians who participated in the study, with 11,037 being randomly assigned to take an aspirin tablet every other day, and the remaining 11,034 physicians assigned to take a placebo tablet every other day. During the 5 years, the frequency of occurrence of myocardial infarction (heart attack) was tabulated for all physicians. Some of the physicians had fatal attacks; others had nonfatal attacks. The results are presented in the contingency table below.

                         Myocardial Infarction
             Fatal Attack    Nonfatal Attack    No Attack
   Placebo        18               171            10845
   Aspirin         5                99            10933

The SAS program below is used to analyze the results of the study. The Myocardial Infarction variable is dichotomized, with the categories Fatal Attack and Nonfatal Attack being combined.

   /* Labels for the two dichotomous variables */
   proc format;
      value drugfmt 1 = "Placebo"
                    2 = "Aspirin";
      value hrt2fmt 1 = "MI"
                    2 = "No MI";
   run;

   /* One record per cell of the collapsed 2 x 2 table */
   data three;
      input drug heart count;
      format drug drugfmt. heart hrt2fmt.;
      cards;
   1 1 189
   1 2 10845
   2 1 104
   2 2 10933
   ;

   /* Two-way table with all available tests and measures */
   proc freq;
      weight count;
      tables drug*heart / all;
      title "Test of Association Between Aspirin Intake";
      title2 "And Myocardial Infarction Events";
      title3 "Fatal and Non-Fatal Categories Collapsed";
   run;

   /* Pearson correlation between the two dichotomous variables */
   proc corr;
      weight count;
      var drug heart;
      title "Pearson Correlation Coefficient Between";
      title2 "Dichotomous Variables Drug and";
      title3 "Myocardial Infarction Events";
   run;

The format procedure assigns labels to two created formats – drugfmt and hrt2fmt. The data statement creates a data set called “three.” The input statement tells SAS to read three variable values from each line of data – two dichotomous variables (drug and heart) and a third variable called count that lists the cell frequencies from the table.
The format statement assigns the created formats to their respective variables. The cards statement marks the beginning of the data list. The frequency procedure analyzes the two-way contingency table that is produced by the tables statement. The weight statement in the frequency procedure weights each combination of values of drug and heart by the frequency in that cell of the table. The all option in the tables statement tells SAS to produce all possible statistical results; we will examine only a few of them. Since both variables are dichotomous, we may also calculate the Pearson correlation coefficient as a measure of ordinal association. The correlation procedure provided this result.

The output of the SAS program is listed below.

   Test of Association Between Aspirin Intake        14:29 Thursday, October 9, 2008   1
   And Myocardial Infarction Events
   Fatal and Non-Fatal Categories Collapsed

   The FREQ Procedure

   Table of drug by heart

   drug        heart

   Frequency |
   Percent   |
   Row Pct   |
   Col Pct   |MI      |No MI   |  Total
   ----------+--------+--------+
   Placebo   |    189 |  10845 |  11034
             |   0.86 |  49.14 |  49.99
             |   1.71 |  98.29 |
             |  64.51 |  49.80 |
   ----------+--------+--------+
   Aspirin   |    104 |  10933 |  11037
             |   0.47 |  49.54 |  50.01
             |   0.94 |  99.06 |
             |  35.49 |  50.20 |
   ----------+--------+--------+
   Total          293    21778    22071
                 1.33    98.67   100.00

   Statistics for Table of drug by heart

   Statistic                      DF       Value      Prob
   -------------------------------------------------------
   Chi-Square                      1     25.0139    <.0001
   Likelihood Ratio Chi-Square     1     25.3720    <.0001
   Continuity Adj. Chi-Square      1     24.4291    <.0001
   Mantel-Haenszel Chi-Square      1     25.0128    <.0001
   Phi Coefficient                        0.0337
   Contingency Coefficient                0.0336
   Cramer's V                             0.0337

   Fisher's Exact Test
   ----------------------------------
   Cell (1,1) Frequency (F)       189
   Left-sided Pr <= F          1.0000
   Right-sided Pr >= F      3.253E-07

   Table Probability (P)    1.516E-07
   Two-sided Pr <= P        5.033E-07

   Test of Association Between Aspirin Intake        14:29 Thursday, October 9, 2008   2
   And Myocardial Infarction Events
   Fatal and Non-Fatal Categories Collapsed

   The FREQ Procedure

   Statistics for Table of drug by heart

   Statistic                              Value       ASE
   ------------------------------------------------------
   Gamma                                 0.2938    0.0561
   Kendall's Tau-b                       0.0337    0.0065
   Stuart's Tau-c                        0.0077    0.0015

   Somers' D C|R                         0.0077    0.0015
   Somers' D R|C                         0.1471    0.0282

   Pearson Correlation                   0.0337    0.0065
   Spearman Correlation                  0.0337    0.0065

   Lambda Asymmetric C|R                 0.0000    0.0000
   Lambda Asymmetric R|C                 0.0077    0.0015
   Lambda Symmetric                      0.0075    0.0015

   Uncertainty Coefficient C|R           0.0081    0.0032
   Uncertainty Coefficient R|C           0.0008    0.0003
   Uncertainty Coefficient Symmetric     0.0015    0.0006

   Estimates of the Relative Risk (Row1/Row2)

   Type of Study                   Value       95% Confidence Limits
   -----------------------------------------------------------------
   Case-Control (Odds Ratio)      1.8321        1.4400        2.3308
   Cohort (Col1 Risk)             1.8178        1.4330        2.3059
   Cohort (Col2 Risk)             0.9922        0.9892        0.9953

   Sample Size = 22071

   Test of Association Between Aspirin Intake        14:29 Thursday, October 9, 2008   3
   And Myocardial Infarction Events
   Fatal and Non-Fatal Categories Collapsed

   The FREQ Procedure

   Summary Statistics for drug by heart

   Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

   Statistic    Alternative Hypothesis    DF       Value      Prob
   ---------------------------------------------------------------
       1        Nonzero Correlation        1     25.0128    <.0001
       2        Row Mean Scores Differ     1     25.0128    <.0001
       3        General Association        1     25.0128    <.0001

   Estimates of the Common Relative Risk (Row1/Row2)

   Type of Study     Method                  Value     95% Confidence Limits
   -------------------------------------------------------------------------
   Case-Control      Mantel-Haenszel        1.8321       1.4400       2.3308
     (Odds Ratio)    Logit                  1.8321       1.4400       2.3308

   Cohort            Mantel-Haenszel        1.8178       1.4330       2.3059
     (Col1 Risk)     Logit                  1.8178       1.4330       2.3059

   Cohort            Mantel-Haenszel        0.9922       0.9892       0.9953
     (Col2 Risk)     Logit                  0.9922       0.9892       0.9953

   Total Sample Size = 22071

   Pearson Correlation Coefficient Between           14:29 Thursday, October 9, 2008   4
   Dichotomous Variables Drug and
   Myocardial Infarction Events

   The CORR Procedure

   2 Variables:        drug     heart
   Weight Variable:    count

   Simple Statistics

   Variable    N       Mean     Std Dev      Sum    Minimum    Maximum
   drug        4    1.50007    42.88648    33108    1.00000    2.00000
   heart       4    1.98672     9.81683    43849    1.00000    2.00000

   Pearson Correlation Coefficients, N = 4
   Prob > |r| under H0: Rho=0

                 drug       heart
   drug       1.00000     0.03367
                           0.9663
   heart      0.03367     1.00000
               0.9663

The point estimate of the odds ratio is

$$\hat{\theta} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} = \frac{189 \times 10933}{10845 \times 104} = 1.8321 .$$

As derived in class, the 95% large-sample confidence interval for the log of the odds ratio is

$$\ln\!\left(\frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}\right) \pm z_{\alpha/2} \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}} = \ln\!\left(\frac{189 \times 10933}{10845 \times 104}\right) \pm 1.96 \sqrt{\frac{1}{189} + \frac{1}{10845} + \frac{1}{104} + \frac{1}{10933}} = 0.6054 \pm 0.2408 = (0.3647,\ 0.8462) .$$

Then the 95% large-sample confidence interval for the odds ratio is

$$\left(\exp(0.3647),\ \exp(0.8462)\right) = (1.4400,\ 2.3308) .$$

These may be found in both the last table on page 2 and the last table on page 3 of the SAS output. We are 95% confident that the odds of a myocardial infarction for those taking a placebo are between 1.4400 and 2.3308 times as great as the odds of a myocardial infarction for those taking aspirin.

Since both variables are dichotomous, they may also be considered to be ordinal, and we may use the gamma measure of ordinal association. Since the formulae for the M.L.E. and the A.S.E. for gamma are rather complicated, I will just note that the 95% large-sample C.I. for gamma is given in the first table on page 2 of the output as

$$\hat{\gamma} \pm z_{\alpha/2}\, \mathrm{A.S.E.}(\hat{\gamma}) = 0.2938 \pm 1.96\,(0.0561) = (0.1838,\ 0.4038) .$$
Because this interval for gamma lies entirely above zero, we can say with 95% confidence that there is a positive association between the two variables – those who take aspirin are less likely to have a myocardial infarction than those who take a placebo. (With drug coded 1 = Placebo, 2 = Aspirin and heart coded 1 = MI, 2 = No MI, a positive association pairs aspirin with no MI.)

The Pearson correlation coefficient, given in the last table on page 4 of the SAS output, is found to be r = 0.03367, denoting a weak positive relationship between the two variables. Note that this result is also given in the first table on page 2 of the output, both as the Pearson correlation and as the Spearman correlation. These two numbers are the same, since both variables are dichotomous.
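
The hand calculations above can be cross-checked outside SAS. The following is a minimal Python sketch, not part of the original SAS analysis; the function names (odds_ratio_ci, gamma_2x2, phi_2x2) are hypothetical helpers introduced here for illustration. It assumes the collapsed 2 x 2 cell counts from the table, the rounded critical value 1.96 used in the text, and the A.S.E. of gamma (0.0561) copied from the SAS output rather than recomputed. For a 2 x 2 table, gamma reduces to (n11 n22 - n12 n21) / (n11 n22 + n12 n21), and the phi coefficient coincides with the Pearson correlation of the two dichotomous variables.

```python
import math

# Cell counts from the collapsed 2 x 2 table
# (rows: placebo, aspirin; columns: MI, no MI).
n11, n12 = 189, 10845
n21, n22 = 104, 10933

Z = 1.96            # rounded normal critical value used in the text
GAMMA_ASE = 0.0561  # A.S.E. of gamma, copied from the SAS output (not recomputed)


def odds_ratio_ci(n11, n12, n21, n22, z=Z):
    """Odds ratio with the large-sample interval built on the log scale."""
    theta = (n11 * n22) / (n12 * n21)
    se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lower = math.exp(math.log(theta) - z * se_log)
    upper = math.exp(math.log(theta) + z * se_log)
    return theta, lower, upper


def gamma_2x2(n11, n12, n21, n22):
    """Goodman-Kruskal gamma for a 2 x 2 table: (C - D) / (C + D)."""
    concordant = n11 * n22
    discordant = n12 * n21
    return (concordant - discordant) / (concordant + discordant)


def phi_2x2(n11, n12, n21, n22):
    """Phi coefficient; equals the Pearson r of two dichotomous variables."""
    numerator = n11 * n22 - n12 * n21
    denominator = math.sqrt(
        (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    )
    return numerator / denominator


theta, or_lower, or_upper = odds_ratio_ci(n11, n12, n21, n22)
gamma = gamma_2x2(n11, n12, n21, n22)
gamma_ci = (gamma - Z * GAMMA_ASE, gamma + Z * GAMMA_ASE)
phi = phi_2x2(n11, n12, n21, n22)

print(f"odds ratio = {theta:.4f}, 95% CI = ({or_lower:.4f}, {or_upper:.4f})")
print(f"gamma      = {gamma:.4f}, 95% CI = ({gamma_ci[0]:.4f}, {gamma_ci[1]:.4f})")
print(f"phi (= r)  = {phi:.5f}")
```

The printed values should agree with the SAS output to the precision shown there. Note that the interval for gamma here still relies on the A.S.E. reported by SAS, so only the point estimate of gamma is independently verified.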