End of Life Resuscitation Patterns: A Socio-Demographic Study of Intensive Care Unit Patients

By Sharon L. Lojun, MD

SUBMITTED TO THE DIVISION OF HEALTH SCIENCES AND TECHNOLOGY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN BIOMEDICAL INFORMATICS AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY

MAY 17, 2010

©2010 Sharon L. Lojun. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created.

Signature of Author: Division of Health Sciences and Technology, May, 2010
Certified by: Regina Barzilay, PhD, Associate Professor of Electrical Engineering and Computer Science
Accepted by: Ram Sasisekharan, PhD, Director, Harvard-MIT Division of Health Sciences and Technology; Edward Hood Taplin Professor of Health Sciences & Technology and Biological Engineering

Table of Contents

I. Table of Contents
II. Dedication
III. Abstract
IV. Background and Introduction
    a. Motivation
    b. Research Questions
    c. Methods
    d. Key Findings
    e. Contributions
V. Related Work
VI. Methods
    a. Data
        i. Database
        ii. Dataset
        iii. ICU Text
        iv. Hospital Text
    b. Nursing Notes as resuscitation code classification
        i. Text Preprocessing
        ii. Medical Metrics
        iii. Demographics
        iv. BoosTexter classification
        v. Ablation Algorithm
        vi. Medical Condition/Text sub-analysis
        vii. Statistical Analysis
    c. N-gram Analysis
        i. Text Preprocessing
        ii. Pattern Recognition
    d. Classification Model Creation
        i. Univariate analysis
        ii. Multivariate analysis
        iii. Interaction terms
        iv. Model performance
        v. Evaluation of gender
        vi. Evaluation of age
    e. Physician Notes as resuscitation code classification
        i. Text Preprocessing
        ii. BoosTexter classification
        iii. Ablation Algorithm
    f. Physician Notes annotation analysis
        i. Univariate classification analysis
        ii. Gender Analysis
VII. Results
    a. Nursing Notes as resuscitation code classification
        i. Comparative Impact of Nursing Social Text and Medical Metrics
        ii. Individual Feature Prediction
        iii. Most Predictive Models
    b. N-gram Analysis
        i. Gender differences
        ii. Visitation differences
    c. Classification Model Creation
        i. Feature Prediction
        ii. Model
        iii. Model Performance
        iv. Gender and Age Effect
    d. Physician Notes as resuscitation code classification
        i. Individual Feature Prediction
        ii. Comparison with Nursing and medico-demographic classification
        iii. Most Predictive Models
    e. Physician Notes annotation analysis
        i. Annotation Error
        ii. Annotation predictions
            1. Children
            2. Living Situation
            3. Marital Status
            4. Employment Situation
        iii. Gender Effect
        iv. Age Effect
VIII. Discussion
IX. Summary
X. References
XI. Acknowledgments
XII. Appendix

Dedication

Devoted to the memory of my mother, Joyce Ann Fleming Lojun; and in honor of my father, Edward Charles Lojun, Sr, in appreciation for their guidance and love; to the memory of G. Tom Shires II, MD, who taught me to aspire to be a surgeon, scientist, physician, and teacher; to Philip Barie, MD, who inspired me technically and intellectually; to Cornell University Medical College; to Mitchell Medow, who is a special teacher and special person; to my brother, Edward; to my sister, Teresa; to Paulie Pena; to my friends, Christy Sauper and Amy Lapidow; to Isaac Schiff, MD, David Grimes, MD, and Wayne Cohen, MD, inspirational leaders and special friends; to Robert Friedman, MD, who believed in me and guided me; to Alexa McCray, a woman of inspirational ability and integrity; to Pete Szolovits, wonderful teacher, kind person, and firm proponent of collaboration; to Regina Barzilay, beloved teacher; and to MIT, BU, and all my friends at BIRT.

Abstract

This study investigates the effect of age, gender, medical condition, and daily free text input on classification accuracy for resuscitation code status. Data was extracted from the MIMIC II database.
Natural language processing (NLP) was used to evaluate the social section of the nurses' progress notes. BoosTexter was used to predict the code status using text, age, gender, and SAPS scoring. The relative impact of features was analyzed by feature ablation. Social text was the greatest single indicator of code status. The addition of text to medical condition features increased classification accuracy significantly (p < 0.001). N-gram frequency was analyzed. Gender differences were noted across all code statuses, with female relatives always mentioned more frequently than their male counterparts (e.g. wife > husband). Visitors and contact were more common in the less aggressive resuscitation codes. Logistic regression on medical, age, and gender features was used to determine gender bias or ageism. Evidence of bias was found; both females (OR = 1.47) and patients over age 70 (OR = 3.72) were more likely to be DNR. Feature ablation was also applied to the social section of physician discharge summaries, as well as to annotated features. The addition of annotated features increased classification accuracy, but the nursing social text remained the most individually predictive. The annotated features included children, living situation, marital status, and working status. Having zero to one child; living alone or in an institution; being divorced, a widow, or a widower; and working, working in a white collar job, or being retired were all associated with higher rates of DNR status and lower rates of FC status. Conversely, living with family, being married, and being unemployed were all associated with lower rates of DNR status and higher rates of FC status. Some of these findings were gender and/or age dependent.

Introduction and Background

Motivation: Critical care is the costliest and most invasive form of medical care. The sickest 5% of the U.S. population consumes nearly 30% of health care costs. This care can lead to remarkable recovery; however, in some cases it may lead to prolonged invasive care without benefit.
The challenge is to identify the best candidates for ICU care, avoiding needless patient and family suffering, and the waste of trillions of dollars in medical resources. Applying the best resuscitation rules based on the patient's wishes and medical prognosis helps solve this dilemma, especially if applied expediently. Currently, determining resuscitation status during ICU stays is generally left to family and physicians. Resuscitation code assignment in the ICU is complex to understand, and this study makes a preliminary attempt to address this need. It is vital to understand the factors which influence code assignment in order to ensure ethical treatment of all patients, and to provide treatment in harmony with patients' wishes. Knowledge of the specific driving factors of resuscitation code status is limited. The purpose of this study is to evaluate a large ICU database for family and social characteristics associated with code status, including the relationship to corresponding medical measures and demographic attributes. For the purpose of this paper: Full Code, all resuscitation measures, will be designated FC; Do Not Resuscitate or Do Not Intubate, limited resuscitation in the case of cardiac or pulmonary arrest, will be designated DNR; and Comfort Measures Only will be designated CMO.

Research Questions: This study investigates nursing social text (largely a catalogue of family and loved ones' visits, feelings and understanding, and physician meetings), age, gender, and medical condition as predictors of code status assignment. Specifically, what are the driving socio-demographic features? Do these factors match the medical condition? Is there evidence for ageism or gender bias? What are the most frequent unigrams, bigrams, and trigrams? Are there any patterns associated with the frequencies? Can the non-text features be modeled to discern the presence of gender bias or ageism?
Physician social text of the hospital discharge summary (largely a catalogue of children's involvement, living situation, marital status, and employment situation) is also investigated as a predictor of code status. What are the most predictive features? Are they age or gender specific?

Methods: The machine learning algorithm BoosTexter was applied, with a training set and a test set, to the social sections of the nursing ICU notes and of the physician hospital discharge summaries to measure code classification accuracy. An n-gram study of the nursing notes was performed to identify any meaningful trends. Annotation of the physician notes was done by hand. The corpus was classified according to the annotation features. After annotation, BoosTexter was used to classify code status. Logistic regression was used to create a model of the non-text features. All possible interaction terms were evaluated. Social features of the physician hospital summaries were evaluated for their association with code assignment.

Key Findings: Code status seems to be more a reflection of family and physician sentiment and assessment than of unprejudiced medical measures. It is clear from these analyses that medical condition and prognosis alone are not likely the leading driving factor of code assignment; in fact, these findings mirror those associated with ICU resource allocation itself. This is the first study known to the author to evaluate prediction of code status. The nursing notes alone proved to be a better indicator of code status than the available medical statistics. Age and gender were also highly predictive. When combined with medical features, nursing social text improves classification accuracy remarkably compared to classification on medical metrics alone. Initially, it was noted that the physician notes were not very useful as free text alone; therefore, an annotation was performed.
The annotation features were found to be less predictive individually than nursing text, age, or gender; however, they were more predictive than the SAPSI score. In combination with other features, the annotation features reduced the classification error even further. However, the least prediction error was achieved with the logistic regression model. The model was tested for interaction terms, and for performance. The discrimination of the model was excellent, with an AUC (C index) of 0.784. Calibration, however, was poor. Using the model, an assessment of gender and age effect was made using the odds ratios. Women were 1.47 times more likely to be DNR rather than FC compared to men. Those aged 70 or greater were 3.72 times more likely to be DNR rather than FC compared to those younger than age 70. There was notably clear gender bias favoring men, and ageism favoring the young.

The n-gram study of the nursing social text revealed interesting specific differences in gender involvement, with the female counterpart always more frequent than the male counterpart, regardless of code status. It cannot readily be concluded that female relatives (wife, daughter, sister) have a more decisive role than male relatives (husband, son, brother); but it is reasonable to conclude that there is more daily female support. There does not seem to be bias in this regard toward any specific code group, as the relationship is consistent across all code statuses. The findings of the bigram study are not surprising. There are more visitors and more contact as code status progresses from FC to DNR to CMO. It is likely that family and loved ones are more involved when death seems imminent or more likely.

The annotated features of the physician hospital discharge summaries revealed vulnerable groups.
For children: having zero children (females only) or one child was associated with decreased rates of FC status and increased rates of DNR status; having many children did not differ from the baseline rates in the full corpus.
For living situation: being institutionalized (nursing home, rehabilitation facility, assisted living, group home, and others) was associated with decreased rates of FC status and increased rates of DNR status; the same trend was observed for living alone (for older females only), so this is likely a simple age effect; living with family was associated with increased FC rates and decreased DNR rates (no age or gender effect).
For marital status: being a widow or widower was associated with decreased rates of FC and increased rates of DNR, with divorced status following this trend; being married and female showed the reverse; married males and single people did not differ from the underlying corpus.
For employment status: the values working and white collar were both associated with decreased FC and increased DNR; retired followed this pattern for men only; unemployed females followed the opposite direction, with increased FC and decreased DNR; blue collar, disabled, and volunteer did not differ from the corpus.

Contributions: The contributions of this work lie largely in finding that the driving force in resuscitation code assignment is not medical condition, but perhaps family sentiment; that women are far more likely to be involved in the care of ICU patients, regardless of the code status; that number of children, living situation, marital status, and employment status weigh heavily on the prediction of code status; and that modeled ageism and gender bias are very marked.
Finally, there is no evidence known to the author that machine learning techniques and logistic regression modeling have been used primarily in pursuit of this information (however, data analysis using logistic regression modeling was done by Philip Barie's group, upon finding gender bias by surprise). (1)

The concept of advance directives (AD), or living wills, has sought to help in making a patient's wishes known and followed; however, they are sometimes vague and unable to predict all possible clinical scenarios. Additionally, ADs have not been very successful in the United States in comparison to, e.g., Japan. (2) Cultural differences and many other reasons are cited as the causes. The likelihood of having an AD is dependent on advanced age and on increased income. (3) The elderly are interested in discussing CPR, but do not necessarily want their wishes committed to paper. (4) Joos found, in a self-administered questionnaire of general medical patients, that 72% had knowledge of ADs, 53% had discussed them with family, and only 14% had discussed them with their physician. (5) Half of the patients felt the terminology should be simplified. (5) So, it may be that in many instances the family has the best understanding of the patient's wishes. Of note, the majority of geriatricians do not establish ADs. (6)

Most patients come to ICU care after a sudden change in health, rather than by a foreseen episode. Code status is generally assigned as Full Code (FC) until it is possible to sort out the likely prognosis and obtain information about a patient's wishes. Even when an AD exists, it is often difficult to predict whether stabilization will occur with brief critical care interventions, and therefore difficult for the physician to interpret the AD in all situations.
As a result, patients are defaulted to Full Code (FC) status; less aggressive code assignment, such as do-not-resuscitate (DNR), do-not-intubate (DNI), and comfort-measures-only (CMO), usually does not occur until after entry into the ICU. In many cases, the patient is unable to communicate due to treatment or illness. The assignment of code status would then be made by the closest family relative in conjunction with the medical staff.

Related Work

Eachempati prospectively studied 723 patients undergoing emergency surgery. (1) The outcome measures were age, sex, admission diagnosis, age-adjusted APACHE III scores (medical metric), issuance of DNR order, morbidity, and mortality. The patients were stratified as older than 75, and younger. Statistical analysis and model formation were performed. Logistic regression for new DNR order was performed using sex + MOD (Multiple Organ Dysfunction, a medical metric) + age + aAIII (age-adjusted APACHE III). The model had a discrimination of 88.9 and a goodness of fit of 3.876 (p = .868), implying good calibration. The ORs were: sex = 2.512, MOD = 1.410, and age = 1.054. DNR order was predominantly predicted by gender and, to a lesser extent, by MOD and age. Eachempati noted that limitations in their data prevented a better explanation of the gender and age bias. For example, they lacked information about advance directives and other factors such as family status and culture. Their gender and age biases are similar to the findings of this dissertation. However, the MOD score (a medical metric) was more predictive than age; medical metrics were not more predictive in this study. This may be due to a better medical metric, especially in the age group studied. Interestingly, this thesis shows that family components, for example marriage and number of children, are predictive of code status.
The closest family relative is most often the spouse, and several studies have suggested that marital status may have substantial impact upon health care received, and even on outcomes. Iwashyna et al. found that married patients visit higher quality hospitals and may receive better out-patient care, but receive quality of care similar to that of widows and widowers once admitted. (7) Caberera-Alonso et al. reported that the expenditures of the married far outweigh the expenditures of the unmarried, with no differences in the number or types of visits. (8) Married women were found to have earlier breast cancer diagnosis, better treatment, and better survival, independent of any socio-economic or cultural effect. (9) These three studies suggest that there is a more aggressive, perhaps more procedural, approach for married patients. Iwashyna surmised that this was the result of the improved advocacy of the spouse over the health care worker; but this does not consider the wishes and feelings of the patient him/herself, which may be different in married compared with unmarried life statuses. In addition, it does not consider the impact of children and extended family. Similarly, gender differences may be extrinsic or intrinsic. Valentin et al. identified gender bias in ICU resource and invasive procedure allocation; but this does not account for gender differences in sentiment about invasive care. (10)

Contrary to these findings are those of de Rooij. (11) De Rooij used recursive partitioning to demonstrate that medical metrics are successful in predicting mortality in the ICU, with age as a feature itself non-significant. A total of 6,867 consecutive patients 80 years and older from 21 Dutch ICUs were analyzed. Medical metrics included: Glasgow Coma Scale, Acute Physiology and Chronic Health Evaluation II, Simplified Acute Physiology Score II (SAPS II), and Mortality Probability Models II scores.
A recursive partitioning model using all of the medical metrics except SAPS II was developed. The performance of the model was measured by the area under the receiver operating characteristic (ROC) curve (AUC). The tree identified the most patients with a high risk of mortality: 9.2% of patients using the tree, versus 8.9% using the original SAPS II score, had a mortality risk of 80% or more. For the age-adjusted SAPS II score, 5.9% had a mortality risk of 80% or more. Using 80% as the cut-off point, the positive predictive values were 0.88, 0.83, and 0.87 for the tree, SAPS II, and recalibrated SAPS II, respectively.

Other than Eachempati's work, evidence of ageism in the ICU seemed to be absent in the literature. Hubbard et al. performed a cross-sectional study on 4,058 patients in South Wales, in which they concluded that ageism in access to critical care does not exist. Sick patients in five hospitals were studied every 12th day for one calendar year. Demographic, clinical, and physiologic data were collected. Ten members of the Welsh Intensive Care Society studied each case while blinded to the patient's age. Decisions were made by consensus. Medical conditions included use of the APACHE II score. The intensivist group determined that 53% of ward-based patients were better suited for ICU care, and 12.4% of ICU patients were better suited for ward care. The proportions of those considered to be in inappropriate care settings differed little by age grouping. (12)

Methods

Classification Algorithm: BoosTexter Classification - BoosTexter is a freely available machine learning classification package. (13) It uses a boosting algorithm to classify text and feature attributes. Specifically, at each iteration, the algorithm selects the feature that is most predictive when used in combination with the other features, and produces classification errors for that specific feature. It does this by first creating an un-weighted model of prediction.
Then, misclassified examples are evaluated and increasing weights are applied to them. The algorithm continues for the specified number of iterations. In addition, when classifying text, the use of n-grams may be selected, such that in the case of bigrams, both unigrams and bigrams are evaluated, and so forth.

Features: Daily ICU nursing social sections; physician hospital discharge summaries; annotated features of physician notes (number of children, living situation, marital status, and employment status); age; gender; medical metrics.

Medical Metrics - SAPSI(1) (Simplified Acute Physiology Score) is by definition calculated on the first day of ICU admission. In order to augment the medical measures, SAPSI(2) was calculated for day two, and the difference between the two SAPS scores was calculated as the Delta (D). These three measures were used to quantify the patient's overall medical condition. If the data for SAPS calculation was not available, e.g., in the case of CMO status, then the entry was null.

Data Set: Database - The MIMIC II database was used. It is the product of an ongoing NIH-sponsored Bioengineering Research Partnership (NIBIB BRP 5R01EB001659) including investigators at MIT, Philips Medical Systems, and Boston's Beth Israel Deaconess Medical Center. The database is a repository of information from multiple critical care units. It includes ICU information (observations, measurements, interventions, and ICU daily notes from all services except physicians) and hospital medical information (laboratory results, medications prescribed, and hospital discharge summaries). The data are de-identified and reformatted. The database contains information from over 30,000 patient admissions (from over 26,000 unique patients).

Dataset - Data extraction included adults (age greater than 15 years) from all critical care areas, and was stratified according to code status.
For patients who transitioned from Full Code (FC) status to do-not-resuscitate or do-not-intubate (DNR), or to comfort-measures-only (CMO), the last recorded code status was used. For the purpose of analysis, do-not-intubate (DNI) status was included with DNR status. It is assumed that no significant transition occurred in the reverse direction. The total number of ICU admissions included was 17,548 (FC); 2,060 (DNR); 784 (CMO). Gender, age, and medical condition were measured.

Demographics - Gender was recorded. Age was collected. By de-identification convention, ages greater than ninety are recoded as values greater than 200; these values were all analyzed as equivalent to exactly 90.

ICU Text - Free text input from the ICU included daily notes from all services except physicians. Text use was limited to the social sections of the nursing progress notes. By convention, this section catalogues family visits, meetings with physicians, and overall understanding of and feelings about the patient's condition. Text entries from all social sections of a single admission were tied to the respective code status and demographic information. Some typical excerpts include: "very supportive family has been in to visit today, wife and children," "family all in agreement that they want him to be extubated and not to be re-intubated, palliation will be main goal if he fails," "family meeting planned," "no family contact this shift."

Hospital Text - Free text input from the hospital stay (including the ICU stay) included physician discharge summaries. Text use was limited to the social sections. By convention, this section catalogues tobacco, ethanol, and illicit drug history. In addition, it includes information regarding the patient's support structure, living circumstances, working situation, family involvement, and any other information relevant to the psychosocial functioning affecting illness and recovery issues.
Text entries from all social sections of a single admission were tied to the respective code status and demographic information. Some typical excerpts include: "Denies alcohol or tobacco use. She lives alone. Her son is supportive and lives nearby. She is widowed. He reports having someone who comes by to help with cleaning and being very involved in her care. He contacts her several times a day and takes her shopping. He does her books for her. Although the son does not feel she has significant cognitive difficulties at baseline it is unclear if he has a realistic assessment of her abilities", "Denies EtOH(ethyl alcohol), Tobacco or IDU (intra-venous drug use)", "Denies tobacco or ETOH use. Lives wth husband", "Divorced. Lives with significant other. Drinks 3-4 glasses of wine per week. Works as a physician", "Drank 1.5L of wine per day for 10-15 years; has been abstinent for about one month now; denies tobacoo or drug use; no h/o transfusions; no tattoos; no h/o incarceration or homelessness; no IVDU (intra-venous drug use)".

Studies

Multiple analyses were conducted to describe the socio-demographic features of resuscitation patterns: prediction of code class using BoosTexter and nursing social text with data including age, gender, SAPS scores, and the delta; n-gram frequency analysis; and logistic regression on all non-text features. Further analysis included features obtained through annotation of the hospital discharge summaries for children, living situation, marital status, and employment situation. These features were added to the first set of features in the ablation algorithm to evaluate the overall code classification rate. The features were also analyzed individually for their impact on code status.

1. Nursing Notes as resuscitation code classification

Since the nursing social notes contain information largely about visitors, physician/family meetings, and family understanding and sentiment, the notes were evaluated to determine the relative role they play in resuscitation code classification. This role was especially of interest in comparison to the medical condition and prognosis.

Nursing Text Preprocessing - The text entries of the daily notes were preprocessed in the following way. First, the social component was isolated, the text was converted to lower case, and punctuation and de-identification placeholders were removed. Next, stop words ("and", "the", etc.), rare words (those appearing fewer than 5 times in the entire corpus), and words directly indicating code status ("DNR", "full code", "comfort measures", etc.) were removed. The Porter stemming algorithm was used to stem the remaining words (converting "sons" to "son", etc.). Finally, commonly used abbreviations were merged with the corresponding stem (e.g., dtr was added to daughter, and dr was added to physician). Data was randomized into a training set (80%) and a test set (20%).

Algorithm - To accomplish this, a feature ablation study was performed on the data. (14) For each combination of medical features (SAPS scores and delta), BoosTexter was run with and without the social text as a feature.

Statistical Analysis - Statistical significance was calculated using McNemar's test on differences in classification error for each combination of features with and without social text. E.g., significance was tested between SAPSI(1) and delta without text and SAPSI(1) and delta with text; and likewise for all other combinations of medical metrics.

2. N-Gram Analysis of Nursing Social Text - This study was performed to evaluate the most common words and phrases within the nursing text corpus. The processed text was used; however, it was divided into 3 groups, one for each code status.
In this format, the social text was a combination of the individual social text strings, akin to a "bag of strings," for each code status. N-gram frequency analysis was calculated for increasing sizes of n. The n-gram count was expressed as a ratio of the count of a particular n-gram to the count of all n-grams of that n group (% of corpus). Recognized patterns were further evaluated. Negation algorithms, which would help discern the meaning of visit, visitors, no visitors, etc., were not used. When patterns were noted in the most frequent n-grams, the % corpus counts were plotted for each code status.

3. Model Formation

The third study is a logistic regression calculation of the non-textual attributes using the R statistical framework. The reason for this analysis was to evaluate the features of gender and age in resuscitation code assignment. Each attribute was tested using univariate analysis. A multivariate model was developed by using the features of univariate statistical significance in combination. All possible interaction terms were evaluated. The model with the most possible interaction terms was compared with the main effects model for deviance residuals. The model performance was evaluated using the regression intercept and coefficients obtained on the training set (80%) and applied to the test set (20%). A confusion matrix was generated. Age was considered as a continuous variable, as well as a binary variable at the elderly ranges. The odds ratios, confidence intervals, and p-values were calculated.

4. Physician Notes as resuscitation code classification

Hospital text preprocessing - The text entries of the discharge summaries were processed as follows: the text was converted to lower case, the punctuation was removed, and the de-identifying placeholders were left intact for easier reading during annotation; finally, the social component was isolated. Annotation was performed on 500 entries. The dataset was divided into a training (80%) and test (20%) set.
The BoosTexter algorithm used the annotation data to classify on the training set (80%), and the error rate was noted on the test set (20%). The full corpus was then labeled automatically using the learned classifiers. Classification was done first for the annotation features, followed by classification for code status. The fourth study is an analysis of classification accuracy of resuscitation code using BoosTexter and feature ablation, as in the first study. The full set of features was used, including the nursing text, the medico-demographic features, the physician text, and the annotated features. The individual feature performance was analyzed, as well as the features of the top performing groups of features.

5. Physician Notes annotation analysis

The fifth study evaluates the individual annotation features in a univariate analysis, allowing comparison of the contributions from each feature. The annotation error rates are noted. Annotation evaluation is by chi-square comparison of the code distribution for each annotated feature with the distribution of code status in the entire corpus. A gender sub-analysis was performed, followed by an age/gender sub-analysis.

Results

Data Distribution

The data was distributed as follows: code status (FC = 17,548; DNR = 2,060; CMO = 784); gender (males = 11,508; females = 8,884); age (range = 15 to >90; mean = 63.46; median = 65.43); SAPSI(1) (range = 0 to 37; mean = 13.45; median = 13); SAPSI(2) (range = 0 to 41; mean = 11.86; median = 12); Delta (range = -19 to 25; mean = -2.33; median = -2).

1. Nursing Notes as resuscitation code classification

BoosTexter Classification

Figure 1 represents code status classification error as computed by BoosTexter for all ablations of medical metrics with and without social text as a feature. Statistical significance was demonstrated for all combinations of medical metrics with p < 0.001. In each case, the notes had a profound effect on classification accuracy.
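The paired comparison behind these p-values can be illustrated with a minimal sketch of McNemar's test. This is not the code used in the study: the function name `mcnemar_test`, the toy labels, and the choice of the continuity-corrected chi-square form of the test are all assumptions made for illustration. The test looks only at the discordant cases, i.e. test examples that one classifier gets right and the other gets wrong.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers on the same test set.

    Returns (chi-square statistic, two-sided p-value).
    Hypothetical helper for illustration; not taken from the thesis.
    """
    # b: examples classifier A gets right and classifier B gets wrong
    b = sum(1 for t, a, c in zip(y_true, pred_a, pred_b) if a == t and c != t)
    # c: examples classifier A gets wrong and classifier B gets right
    c = sum(1 for t, a, c_ in zip(y_true, pred_a, pred_b) if a != t and c_ == t)
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy data (assumed): 20 test admissions, two classifiers
y_true = ["FC"] * 10 + ["DNR"] * 10
without_text = ["FC"] * 20                          # medical metrics only
with_text = ["DNR"] + ["FC"] * 9 + ["DNR"] * 10     # medical metrics plus social text

stat, p = mcnemar_test(y_true, without_text, with_text)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

In this sketch the test would be repeated once per combination of medical metrics, each time pairing the predictions made without text against those made with text on the same test set.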
This may imply that the text contains more information relevant to determining code status than medical condition does. Since the text primarily consists of a record of social visits and meetings of the physician with the family, there may be a correlation between the number or type of visitors and the code status. This could possibly represent an effect of family or physician sentiment.
[Figure 1 plot: error rate (y-axis, 0.13 to 0.142), without text and with text, for the metric combinations SAPSI(1,2), D; SAPSI(1,2); SAPSI(1), D; SAPSI(2), D; SAPSI(1); SAPSI(2); Delta.]
Figure 1: BoosTexter error rate with varying combinations of medical metrics; impact of text demonstrated. Difference at each combination is significant with p < 0.001.
Figure 2 demonstrates the classification errors for each feature individually. The lowest univariate error rate is found using the nursing notes' social text. Surprisingly, social text, gender, and age are all more informative for classification than the medical metrics provided by the SAPS scores.
[Figure 2 plot: error rate (y-axis, 0.13 to 0.17) for notes, age, gender, SAPS day 1, SAPS day 2, and delta.]
Figure 2: BoosTexter error rate with single metrics; non-text and nursing notes data set.
The lowest classification error rate is shown in Figure 3 - trigram, 500 iterations (Appendix). Note that in this overall feature ablation study, the top feature combinations all include the feature "n" (nursing notes). This further supports the role of the nursing social notes as the most important feature. The relatively low training error may reflect some overtraining.
2. N-Gram Analysis
Figure 4 demonstrates the frequency of gender-specific words for spouse, parent, and child in the social text as a ratio of word count / total words in the corpus. Comparisons were made within code groups and between groups. There was a marked gender difference in each case across all code statuses. This suggests that there is more daily support from female relatives while in the ICU.
Table 1 - Top Unigrams without stop words
            Full Code             DNR                   CMO
Unigram     Count    %Corpus      Count    %Corpus      Count    %Corpus
famili      23477    2.471536     6086     2.824392     3815     3.084773
visit       23357    2.458903     3655     1.696213     2070     1.673782
wife        13724    1.444791     1919     0.89057      1208     0.976777
daughter    11689    1.230557     3059     1.419621     1419     1.14739
son         8176     0.860727     2230     1.034899     1062     0.858723
husband     5876     0.618595     705      0.327177     645      0.521541
sister      4794     0.504687     937      0.434843     511      0.41319
mother      4597     0.483948     555      0.257565     293      0.236917
question    3954     0.416257     612      0.284017     402      0.325053
doctor      3350     0.352671     1144     0.530908     755      0.610486
friend      2666     0.280663     356      0.165213     228      0.184359
brother     2646     0.278557     423      0.196306     333      0.269261
children    2137     0.224972     336      0.155931     233      0.188402
visitor     2100     0.221077     299      0.13876      138      0.111585
father      1666     0.175388     231      0.107203     91       0.073582
parent      1345     0.141595     52       0.024132     66       0.053367
niece       396      0.041689     149      0.069148     X
nephew      355      0.037373     141      0.065435     X
[Figure 4 plot: % corpus (y-axis, 0 to 0.03) for wife, husband, daughter, son, mother, and father, grouped by code status (CMO, DNR/DNI, FC).]
Figure 4: Most frequent visitors by percentage of total corpus, stratified by gender counterparts for each code status of patient.
Appendix (Figures 5, 6, and 7: Daughter more frequent than Son across all code statuses; Wife more frequent than Husband across all code statuses; Mother more frequent than Father across all code statuses)
[Figure 8 plot: % corpus (y-axis, 0 to 0.4) of the bigrams "no contact" and "no visitor" by code status.]
Figure 8: Bigram study, less contact, less visitors as resuscitation status increases
Figure 8 demonstrates the % corpus of the two bigrams, "no contact" and "no visitor." No contact and no visitors were observed in the dataset most frequently for the FC group, next for the DNR group, and least for the CMO group. Fewer visitors at FC status may be a reflection of better general health in the FC group, compared with the DNR and CMO groups.
Family and friends may be more inclined to visit when they know a prognosis may be grave. Alternatively, or additionally, there may be something intrinsically different about groups in which their loved one is classified at a lesser resuscitation status.
3. Model Formation - Table 2 shows the univariate analysis of the features considered for model formation. SAPS1(day1) was the only medical metric used, since it is a standard metric utilized in many ICUs, and since it was the most predictive medical metric when using BoosTexter classification. All features were highly significant upon univariate analysis. Age was converted to binary groups beginning at age 65 and continuing to age 80 (the maximum age in the corpus is 90), since the elderly are considered separately in the literature.
Table 2
Independent Variable   AIC      CHI SQUARE   p value
GENDER MALE            10492    135.6576     p<.001
AGE                    11765                 p<.001
SAPS1                  12018                 p<.001
AGE>=65                         744.5943     p<.001
AGE>=70                         933.0086     p<.001
AGE>=75                         1177.781     p<.001
AGE>=80                         1368.453     p<.001
Table 3 illustrates the process of model formation for prediction of code status (FC vs DNR), using the main effects of Age, Gender, and SAPS1. All interactions, three-way and two-way, are included in the analysis. Two two-way interactions were found to be significant with non-zero coefficients: AGE:GENDER and GENDER:SAPS1. The three-way interaction was not significant. The model including the main effects and the two interactions was compared with the model using the main effects only. There was a clear effect upon the coefficients of the main effects. Therefore, an analysis of deviance residuals was performed (Data in Tables 4, 5, Appendix.)
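One common way to compare a main effects model with a nested interaction model is a likelihood-ratio test on the residual deviances. The sketch below is an assumption about how such a comparison could be carried out, not necessarily the analysis performed for this thesis; it uses the residual deviances and degrees of freedom printed in the Appendix R output (8803.2 on 14796 for the main effects model, 8780.8 on 14794 for the model with the two interactions). The function name is hypothetical, and the closed-form p-value is valid only for a difference of exactly 2 degrees of freedom.

```python
import math

def likelihood_ratio_test_df2(dev_reduced, df_reduced, dev_full, df_full):
    """Compare nested models by the drop in residual deviance.

    Uses the exact chi-square survival function for 2 degrees of
    freedom, P(X > x) = exp(-x / 2); other df values are rejected.
    """
    stat = dev_reduced - dev_full
    df = df_reduced - df_full
    if df != 2:
        raise ValueError("closed-form p-value here is only valid for df = 2")
    p_value = math.exp(-stat / 2.0)
    return stat, df, p_value

# Residual deviances and degrees of freedom as printed in the Appendix output.
stat, df, p_value = likelihood_ratio_test_df2(8803.2, 14796, 8780.8, 14794)
```

Under these assumptions the drop in deviance is about 22.4 on 2 degrees of freedom, which would favor keeping the two interaction terms.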
lreg(CODE~AGE*GENDER*SAPS1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         9.6      14722.26   3523.08    61521.43   2.22E-16
AGE                 -0.1     0.91       0.89       0.92       2.22E-16
GENDERM             -0.4     0.67       0.09       4.82       0.69004
SAPS1               -0.23    0.8        0.73       0.87       7.24E-07
AGE:GENDERM         0.02     1.02       0.99       1.04       0.21551
AGE:SAP1            0        1          1          1          9.62E-05
GENDERM:SAP1        -0.02    0.98       0.87       1.1        0.7189
AGE:GENDERM:SAP1    0        1          1          1          0.90175
lreg(CODE~AGE+GENDER+SAPS1+AGE:GENDER+AGE:SAP1+GENDER:SAP1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         9.54     13867.43   4768.09    40331.79   2.22E-16
AGE                 -0.1     0.91       0.9        0.92       2.22E-16
GENDERM             -0.29    0.75       0.36       1.57       0.44800021
SAPS1               -0.22    0.8        0.75       0.85       6.60E-12
AGE:GENDERM         0.02     1.02       1.01       1.02       0.00075665
AGE:SAP1            0        1          1          1          4.84E-08
GENDERM:SAP1        -0.03    0.97       0.95       0.99       0.00749844
lreg(CODE~AGE+GENDER+SAPS1+AGE:GENDER+GENDER:SAP1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         7.05     1158.5     696.36     1927.35    2.22E-16
AGE                 -0.06    0.94       0.93       0.94       2.22E-16
GENDERM             -0.24    0.78       0.39       1.58       0.4968832
SAPS1               -0.05    0.95       0.94       0.97       8.94E-11
AGE:GENDERM         0.02     1.02       1.01       1.03       0.0002529
GENDERM:SAP1        -0.04    0.96       0.94       0.98       0.00044485
lreg(CODE~AGE+GENDER+SAPS1+GENDER:SAP1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         6.5      667.22     451.76     985.44     2.22E-16
AGE                 -0.06    0.95       0.94       0.95       2.22E-16
GENDERM             0.89     2.45       1.72       3.47       5.82E-07
SAPS1               -0.05    0.95       0.93       0.96       3.52E-12
GENDERM:SAP1        -0.03    0.97       0.95       0.99       0.0026528
lreg(CODE~AGE+GENDER+SAPS1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         6.75     851.8      594.25     1220.97    2.22E-16
AGE                 -0.06    0.95       0.94       0.95       2.22E-16
GENDERM             0.38     1.47       1.32       1.64       7.86E-12
SAPS1               -0.07    0.93       0.92       0.94       2.22E-16
lreg(CODE~GENDER+SAPS1,data)
                    Coef     OR         low.95     high.95    p-val
(Intercept)         3.28     26.69      22.52      31.62      2.22E-16
GENDERM             0.53     1.69       1.52       1.88       2.22E-16
SAPS1               -0.1     0.9        0.9        0.91       2.22E-16
(Appendix: figures 9, 10, and 11: The ROC Curve for Training Set, AUC = 0.758; The ROC Curve for Test Set, AUC = 0.784; Error Rate based on Confusion Matrix of Model)
[Figure 12]
The test AUC was marginally improved over the training set AUC, which is not terribly
meaningful, since the margin of difference is very small. However, in general the AUC is usually highest on the training set. A value greater than 0.7 is generally considered good for health outcomes. The Akaike Information Criterion (AIC) is a goodness-of-fit measure that penalizes superfluous parameters in the model. The higher the AIC, the poorer the fit associated with the model. Although the model has an AIC of 8811.2 (residual deviance 8803.2), which is considered a high number, all three of the model components in univariate analysis have AIC values that are even higher (Gender = 10492, Age = 11765, SAPS1 = 12018). This would indicate that gender, continuous age, and SAPS1 do not give a good fit individually. However, in chi square analysis, binary ages near the median of the data set (65) and higher are associated with increasingly higher chi square scores. Calibration by the Hosmer-Lemeshow test is poor, with p = 0. This may be due to the profound differences seen in DNR distribution as age changes. Despite poor calibration, the discrimination is very good, and the error rate of classification is 0.103. This is substantially lower than that of any of the BoosTexter analyses.
Table 7: Odds ratios of female and older patients based on the logistic regression model, computed separately for FC vs. DNR (CMO ignored) and FC vs. DNR/CMO
                                 Odds Ratio   Conf. Intervals   p-value
Gender = Female
  FC vs. DNR                     1.47         1.31-1.60         < 0.001
  FC vs. DNR/CMO                 1.35         1.24-1.47         < 0.001
Age >= 70
  FC vs. DNR                     3.72         3.35-4.13         < 0.001
  FC vs. DNR/CMO                 2.91         2.66-3.18         < 0.001
Age >= 70 and Gender = Female
  FC vs. DNR                     4.19         3.62-4.87         < 0.001
  FC vs. DNR/CMO                 3.36         2.95-3.82         < 0.001
Age >= 70 and Gender = Male
  FC vs. DNR                     3.08         2.65-3.58         < 0.001
  FC vs. DNR/CMO                 2.40         2.12-2.72         < 0.001
Upon model formation and testing, the logistic regression model was used to investigate the role of gender and age in code assignment (Table 6.) Odds ratios are more marked when comparing FC to DNR than when comparing FC to DNR/CMO.
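As a worked check of how the quantities in this section relate, the sketch below derives an AIC from a residual deviance and an odds ratio with a 95% confidence interval from a logistic regression coefficient. It assumes ungrouped binary outcomes (so that AIC equals residual deviance plus twice the number of estimated parameters) and a Wald interval with z = 1.96; the function names are hypothetical. The inputs are the residual deviance and the GENDERM estimate and standard error printed in the Appendix output for the main effects model.

```python
import math

def aic_from_deviance(residual_deviance, n_params):
    """AIC for a binary-outcome logistic model: deviance + 2 * parameters."""
    return residual_deviance + 2 * n_params

def odds_ratio_with_ci(coef, std_err, z=1.96):
    """Odds ratio and Wald 95% interval from a log-odds coefficient."""
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )

# Main effects model: intercept, AGE, SAP1, GENDERM (4 parameters).
aic = aic_from_deviance(8803.2, 4)
# GENDERM row of the main effects model (estimate, standard error).
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(0.384418, 0.056192)
```

Under these assumptions the AIC is 8811.2, and the GENDERM odds ratio with its interval rounds to 1.47 (1.32, 1.64), in line with the main effects row of the model formation table.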
There is evidence of statistically and clinically relevant gender bias and even greater ageism.
4. Hospital Discharge Summary BoosTexter Classification
[Figure 13 plot: error rate (y-axis, 0.13 to 0.17) for each single feature.]
Figure 13: BoosTexter error rate with single metrics; non-text and nursing notes (black), physician notes and annotated physician note data set (grey).
Compared with the first feature ablation study, the physician social text is less predictive than all but SAPS1 day2 and delta. The annotated features are more predictive than even SAPS1, which is one of the main effects used in the model to control for medical condition. So, although the physician discharge summaries are not particularly helpful in univariate analysis, the annotated features are. No one annotated feature group is more predictive than another. However, in combination, the added features do contribute significantly to code classification rates, resulting in the lowest overall error produced by feature ablation (0.1299338).
Figure 14 - Classification error using feature ablation, 500 iterations. Black = inclusive of physician notes and annotated features. Light Grey = not inclusive of physician notes or annotated features. (g=gender, a=age, 1=SAPSI(day1), 2=SAPSI(day2), d=delta, n=nursing notes, p=physician notes, c=children, l=living situation, m=marital status, w=working status)
As in the first ablation study, the training error is lower; however, with 250 iterations, the training error increases without adding much to the classification error. Nursing notes remain the most consistent feature, being present in the first several tens of the combinations.
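The feature ablation described above can be sketched as an enumeration of feature combinations, each scored by a classification error. This is a minimal illustration that assumes every non-empty combination of the eleven feature codes from the Figure 14 legend is evaluated; `toy_error` is a hypothetical stand-in for training and testing a classifier such as BoosTexter on a given subset, and its numbers are made up.

```python
from itertools import combinations

# Feature codes as given in the Figure 14 legend.
FEATURES = ["g", "a", "1", "2", "d", "n", "p", "c", "l", "m", "w"]

def all_feature_subsets(features):
    """Yield every non-empty combination of the given features."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset

def best_subset(features, error_fn):
    """Return the subset with the lowest error according to error_fn."""
    return min(all_feature_subsets(features), key=error_fn)

def toy_error(subset):
    # Hypothetical error: nursing notes ("n") help most, and each
    # additional feature helps a little. Not study results.
    base = 0.17
    notes_gain = 0.03 if "n" in subset else 0.0
    return base - notes_gain - 0.001 * len(subset)

n_subsets = sum(1 for _ in all_feature_subsets(FEATURES))
winner = best_subset(FEATURES, toy_error)
```

With eleven features there are 2047 non-empty combinations, so an exhaustive ablation of this kind is still tractable when each evaluation is a single train/test run.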
5. Hospital Discharge Summary Annotation
Table 8 - distribution of annotation
Children:           0 = 664; 1 = 1121; M = 2104; U = 16503
Living Situation:   A = 723; F = 5039; I = 1096; U = 13534
Marital Status:     D = 583; M = 4837; S = 391; U = 14039
Working Situation:  R = 1502; W = 1244; WC = 487; BC = 425; D = 361; Other = 563; U = 15810
Table 9 - annotation feature classification error
FEATURE             ERROR
Children #          0.1235955
Marital Status      0.0449438
Living Situation    0.1348315
Working Situation   0.1235955
The text was successfully annotated, with the least classification error for marital status, 0.0449438. The feature annotation rate ranged from 19% to 33.6% of the corpus. Annotation beyond 500 cases would not likely improve this rate, as many of the notes were missing, and at least half dealt only with alcohol, tobacco, and illicit drug use. Despite this, much information was learned from the annotation.
Number of Children - (Tables 10, 11, 12) Zero or one child was found to be statistically significantly (p<=.001) associated with lower rates of FC status and higher rates of DNR status. Children = Many did not differ from the corpus distribution of code status.
Children by Gender - When evaluating the effects of children on the parent's code status, the findings are the same, except that the zero child effect holds only for women.
Children by Gender and Age >= 70 or Age < 70 - The zero child impact on code status for women holds true when looking at age effect.
Living Situations - (Tables 13, 14, 15, Appendix) Living Alone (p=.004) or in "Institution" (p<.001) were both associated with less FC and more DNR statuses. (Institution living includes nursing home, assisted living arrangements, rehabilitation facilities, and other similar situations.) However, living with Family was statistically significantly (p=.044) associated with increased FC and decreased DNR statuses.
Living Situation by Gender - Gender differences were evident in Living Alone, with only the female group significant.
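The chi square comparison used for the annotated features can be reproduced from the contingency table printed in the Appendix for the feature children, value 0. The sketch below is illustrative: the function names are hypothetical, column A is the annotated group and column B is the whole corpus as in the Appendix layout, and the p-value uses the closed form that is exact only for 2 degrees of freedom.

```python
import math

def chi_square_two_columns(observed):
    """Pearson chi-square for a table given as rows of (A, B) counts."""
    row_totals = [a + b for a, b in observed]
    col_totals = [sum(r[0] for r in observed), sum(r[1] for r in observed)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [
        (rt * col_totals[0] / grand_total, rt * col_totals[1] / grand_total)
        for rt in row_totals
    ]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(observed) - 1) * (2 - 1)
    return expected, chi2, dof

def p_value_two_dof(chi2):
    """Exact upper-tail probability of chi-square with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)

# Rows 1, 2, 3 of the Appendix table for children, value 0:
# (count in annotated group A, count in whole corpus B).
observed = [(31, 784), (94, 2060), (539, 17548)]
expected, chi2, dof = chi_square_two_columns(observed)
p_value = p_value_two_dof(chi2)
```

Run on these counts, the expected values and the statistic agree with the Appendix output (expected 25.7 in the first cell, chi-square about 13.2 on 2 degrees of freedom, probability about 0.001).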
Living Situation by Gender and Age - An age effect was noted in Living Alone, with only the older group statistically significant. The Family and Institution findings were not gender or age dependent.
Marital Status - (Tables 16, 17, 18, Appendix) Marital Status = Widow or Widower is statistically significantly (p<.001) associated with less FC and more DNR. Although not statistically significant (p=.174), Divorced status follows the same trend. However, Married status is associated with more FC and less DNR. Single status is not significantly different from the corpus.
Marital Status by Gender - Only the Marital Status value Married differed by gender. The female group was statistically significant; the male group was not.
Marital Status by Gender and Age - Numbers too small to evaluate.
Working Situation - (Tables 19, 20, 21, Appendix) Working Situation = Working has lower FC rates and higher DNR rates (p<.001). Although not statistically significant, Working = White Collar (p<.088) and Working = Retired (p=.082) have similar trends. Unemployed status is associated with more FC and less DNR (p=.026). Working = Blue Collar, Disabled, or Volunteer are not significantly different from the rates in the whole corpus.
Working by Gender - A gender effect is noted for the value Retired, significant for males only (p=.004). A gender effect is also noted for Unemployed, significant for females only (p=.019).
Working by Gender and Age - An age difference was noted for Retired, in younger males only (p=.031).
Discussion
Understanding resuscitation code assignment in the ICU is complex, and this study makes a preliminary attempt to address this need. It is vital to understand the factors which influence code assignment in order to ensure ethical treatment of all patients, and to provide treatment in harmony with patients' wishes. Code status seems to be more a reflection of family and physician sentiment and assessment than of unprejudiced medical measure.
This, unfortunately, can leave room for human mistakes in interpreting medical condition, prognosis, and individuals' wishes. Certainly, as mentioned in the introduction, it is not easy to simply encourage advance directives and carry them out. It is not possible to anticipate every medical scenario. The main concern is that in this process, individual caretakers and family may introduce unwanted or un-indicated bias. Some perceived bias may indeed be socio-demographically derived from the patients' own wishes. Discerning this component is impossible, as patients are most often too sick to communicate for an interview, and retrospective analysis is often error-prone. It is, however, clear from these analyses that medical condition and prognosis alone are not likely the leading driving factor of code assignment; in fact, these findings mirror those associated with ICU resource allocation itself. That is to say, gender and age bias, major influencing factors in code assignment found in this study, are also major influencing factors in ICU bed assignment. Valentin et al. found overwhelming evidence that women were less likely than men to be admitted to the ICU when severity of illness was considered. (10) In addition, the women were much less likely to undergo invasive procedures. This finding is contrary to that of Perkins et al., who found that women are more open to invasive treatments. (15) These conflicting findings may be further evidence that there is gender bias when treating women in the ICU, if in fact women are more open than men to undergoing invasive treatments and, by extension, more open to more aggressive code status. However, Covinsky et al. (16), as part of the SUPPORT project, found women less likely to want CPR; and Raine et al. found gender bias toward men and women, depending on the diagnosis. (17) Race has in the past been shown to affect code status, although this is not clear.
Bardach et al. found that women and Hispanics were more likely to have a DNR order, and that adjusting for DNR orders reversed the apparent hospital mortality advantage for Hispanics. (18) Contrarily, Shepardson et al. found differences in rates of the DNR order in African American compared with Caucasian patients, with rates higher in Caucasian patients. (19) Unfortunately, this study was not able to extract information about race. This is the first study, known to the author, to evaluate prediction of code status. This study included comparative classification using feature ablation of nursing social notes (largely a log of family visits, feelings, and family/physician meetings), age, gender, and SAPSI score. The nursing notes alone proved to be a better indicator of code status than the available medical statistics. Age and gender were also highly predictive. When combined with medical features, social text improves classification accuracy remarkably. Additionally, the social section of physician hospital discharge summaries (largely a log of marital status, children's involvement in care, living and employment situations, as well as history of alcohol, tobacco, and illicit drug use) was used for comparative classification. Initially, it was noted that the physician notes were not terribly useful as free text alone; therefore, an annotation was performed. The annotation features (children, living situation, marital status, and employment status) were found to be less predictive individually than nursing text, age, or gender; however, they were more predictive than the SAPSI score. In combination with other features, the annotation features reduced the classification error even further. However, the least prediction error was achieved with the logistic regression model. The model was tested for interaction terms and for performance. The discrimination of the model was excellent, with an AUC (C index) of 0.784. Calibration, however, was poor.
Using the model, an assessment of gender (OR=1.47) and age (OR=3.72) treatment was made, using the odds ratios of likelihood to be of less aggressive code status. There was a notably clear gender bias favoring men, and ageism favoring the young. The gender bias findings are consistent with those of Eachempati et al., who found a gender bias in DNR assignment for elderly patients undergoing emergency surgery. (1) The gender difference is especially concerning, given the study by Zettel-Watson et al. (20) In that study, wives were found to be more accurate than husbands regarding their spouse's wishes. It seems equally plausible, therefore, that the gender difference may be a reflection of the cultural devaluation of women compared with men. The remarkable age differences are inconsistent with the lack of ICU ageism reported by Hubbard et al. (12) The n-gram study of the nursing social text revealed interesting specific differences in gender involvement. It cannot readily be concluded that the female gender (wife, daughter, sister) has a more decisive role compared with the male gender (husband, son, brother); but it is reasonable to conclude that there is more daily female support. There does not seem to be bias in this regard toward any specific code group, as the relationship is consistent across all code statuses. The findings of the bigram study are not surprising. There are more visitors and more contact as code status progresses from FC to DNR to CMO. It is likely that family and loved ones are more involved when death seems imminent or more likely. The annotated features of the physician hospital discharge summaries revealed vulnerable groups. For children: having zero children (females only) or one child was associated with decreased rates of FC status and increased rates of DNR status; having many children did not differ from the baseline rates in the full corpus.
For living situation: being institutionalized (nursing home, rehabilitation facility, assisted living, and others) was associated with decreased rates of FC and increased rates of DNR status; the same trend was observed for living alone (for older females only), so this is likely a simple age effect; living with family was associated with increased FC rates and decreased DNR rates (no age or gender effect). For marital status: being a widow or widower was associated with decreased rates of FC and increased rates of DNR, with divorced status following this trend; being married and female had the reverse association; married males and single people did not differ from the underlying corpus. For employment status: the values working and white collar were both associated with decreased FC and increased DNR; retired followed this pattern for men only; unemployed females were observed to follow the opposite direction, with increased FC and decreased DNR; blue collar, disabled, and volunteer did not differ from the corpus. A significant limitation of the BoosTexter classification and feature ablation studies was the reliance on the SAPSI score to define medical condition and prognosis. SAPSI may not be the most robust medical metric for all medical conditions, especially those related to multiple organ failure. SAPSII (21) (which includes a parameter for the indication for ICU admission), the APACHE scores, and multiple organ failure scores are established, but the MIMIC database does not support all the parameters. For this reason, the parameters SAPSI (day 2) and Delta were created; however, they did not prove to be as predictive as the standard SAPSI score. The lack of alternative medical metrics was probably the most significant limitation of the classification studies, including the logistic regression model. In fact, the Eachempati study used both the MOD and APACHE III scores, which may have been helpful in creating a more calibrated model than that of this thesis.
The MOD score was also more predictive, in that study, than age in determining code status. (1) Another limitation of the classification studies was the overwhelming FC class in the corpus. The training set was 86.213449% FC, and the test set was 85.43761% FC. This makes classification errors in the 13 - 14% range difficult to interpret. The annotated features were limited by numbers in some cases. Increased annotation would not likely correct this problem, since many notes were limited to alcohol, tobacco, and illicit drug use. Improvements and future study will include: annotation of the nursing notes; modifications to the logistic regression model to help improve the calibration; inclusion of more medical metrics, as the MIMIC database allows; and the addition of racial and further socio-demographic information when available.
Summary
These findings highlight several points. First, there is a need for improved communication between health care providers and family members. Their involvement may clarify the patient's potential to respond to further therapy, thereby helping an accurate code status to be applied more quickly. Second, there is decidedly more daily involvement from female family members for patients of any code status; however, the significance of this is unclear. There is a need for more support and advocacy for some of the most vulnerable patients (patients with one or no children; those living alone or institutionalized; widows, widowers, and the divorced; the retired; those in former or present white collar work; and those working), and of course women and the elderly. Finally, the gender and age differences in less aggressive code statuses warrant in-depth further study.
References
1. Eachempati SR, et al. Sex differences in creation of do-not-resuscitate orders for critically ill elderly patients following emergency surgery. Journal of Trauma-Injury Infection & Critical Care. 2006; 1:193-7.
2. Matsui M, et al.
Perspectives of elderly people on advance directives in Japan. Journal of Nursing Scholarship. 2007; 2:1726.
3. Rosnick CB, et al. Thinking ahead: factors associated with executing advance directives. Journal of Aging & Health. 2003; 2:409-29.
4. Watson DR, et al. The effect of hospital admission on the opinions and knowledge of elderly patients regarding cardiopulmonary resuscitation. Age & Ageing. 2007; 6:429-34.
5. Joos SK, et al. Outpatients' attitudes and understanding regarding living wills. Journal of General Internal Medicine. 1993; 5:259-63.
6. Lester PE, et al. Do Geriatricians Practice What they Preach?: Geriatricians' personal establishment of advance directives. Gerontology & Geriatrics Education. 2009; 1:61-74.
7. Iwashyna TJ, et al. Marriage, widowhood, and health-care use. Social Science & Medicine. 2003; 57:2137-2147.
8. Cabrera-Alonso J, et al. Marital Status and Health Care Expenditures Among the Elderly in a Managed Care Organization. Health Care Manager. 2003; 22:249-255.
9. Osborne C, et al. The influence of marital status on the stage at diagnosis, treatment, and survival of older women with breast cancer. Breast Cancer Research and Treatment. 2005; 93:41-47.
10. Valentin A, et al. Gender-related differences in intensive care: A multiple-center cohort study of therapeutic interventions and outcome in critically ill patients. Crit Care Med. 2003; 31:1901-1907.
11. de Rooij, et al. Identification of high-risk subgroups in very elderly intensive care unit patients. Critical Care. 2007; 11:1-9.
12. Hubbard RE, et al. Absence of ageism in access to critical care: a cross-sectional study. Age Ageing. 2003; 32:382-7.
13. Schapire RE, et al. BoosTexter: A Boosting-based System for Text Categorization. Machine Learning. 2000; 39:135-168.
14. Walker MA, et al. Empirical Studies in Discourse. Association for Computational Linguistics. 1997; 23:1-12.
15. Perkins HS, et al. Advance care planning: does patient gender make a difference?
American Journal of the Medical Sciences. 2004; 1:25-32.
16. Covinsky KE, et al. Communication and decision-making in seriously ill patients: findings of the SUPPORT project. The Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatments. Journal of the American Geriatrics Society. 2000; (5 Suppl):S187-93.
17. Raine R, et al. Influence of patient gender on admission to intensive care. J Epidemiol Community Health; 56:418-423.
18. Bardach N, et al. Adjustment for do-not-resuscitate orders reverses the apparent in-hospital mortality advantage for minorities. American Journal of Medicine. 2005; 4:400-8.
19. Shepardson LB, et al. Racial Variation in the Use of Do-Not-Resuscitate Orders. J Gen Intern Med; 14:15-20.
20. Zettel-Watson, et al. Actual and perceived gender differences in the accuracy of surrogate decisions about life-sustaining medical treatment among older spouses. Death Studies. 2008; 3:273-90.
21. LeGall JR, et al. A new simplified acute physiology score (SAPSII) based on a European/North American multicenter study. JAMA. 1993; 270:2957-2963.
Biographical Note and Acknowledgement
Regina Barzilay, Ph.D.
Mitchell Medow, M.D., Ph.D.
Robert Friedman, M.D.
Alexa McCray, Ph.D.
Roger G. Mark, M.D., Ph.D.
William J. Long, Ph.D.
Christina J. Sauper, S.M.
Mauricio Villarreol, Ph.D.
Daniel Scott, Ph.D.
Michael Craig, Ph.D.
Biographical Note - I received my BS degree in biochemistry from the George H Cook College four year honors program, and my MD from Cornell University Medical College. I trained in general surgery under G. Tom Shires II, MD. I completed my Obstetrics and Gynecologic Oncology training at Harvard's Brigham and Women's and Massachusetts General Hospitals. Finally, my clinical training was completed with a fellowship in Gynecologic Oncology at The James Graham Brown Cancer Center.
Appendix
Figures (3, 5, 6, 7, 9, 10, 11)
[Figure 3 plot: test error and training error (y-axis, 0 to 0.16) for each feature combination.]
Figure 3
[Figure 5 plot: % corpus (y-axis, 0 to 0.03) for daughter and son by code status (CMO, DNR/DNI, FC).]
Figure 5
[Figure 6 plot: % corpus (y-axis, 0 to 0.03) for wife and husband by code status (CMO, DNR/DNI, FC).]
Figure 6
[Figure 7 plot: % corpus (y-axis, 0 to 0.012) for mother and father by code status (CMO, DNR/DNI, FC).]
Figure 7
[Figure 9 plot: ROC curve; x-axis, false positive rate (0.0 to 1.0).]
Figure 9
[Figure 10 plot: ROC curve; x-axis, false positive rate.]
Figure 10
[Figure 11 plot: error rate by cutoff (x-axis, 0 to 6).]
Figure 11
Tables (3, 4, 9 - 20)
> model1=glm(CODE~AGE+SAP1+GENDER,binomial)
> summary(model1)
Call:
glm(formula = CODE ~ AGE + SAP1 + GENDER, family = binomial)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.2278   0.2027   0.3449   0.5302   1.3922
Coefficients:
              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   6.747351    0.183697   36.731   < 2e-16 ***
AGE          -0.055159    0.002265  -24.351   < 2e-16 ***
SAP1         -0.068940    0.005571  -12.374   < 2e-16 ***
GENDERM       0.384418    0.056192    6.841  7.86e-12 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 10063.0 on 14799 degrees of freedom
Residual deviance: 8803.2 on 14796 degrees of freedom
(886 observations deleted due to missingness)
AIC: 8811.2
Number of Fisher Scoring iterations: 6
Table 3
Model 2 - Table 4
> model2<-glm(CODE~AGE+SAP1+GENDER+AGE:GENDER+SAP1:GENDER,binomial)
> summary(model2)
Call:
glm(formula = CODE ~ AGE + SAP1 + GENDER + AGE:GENDER + SAP1:GENDER, family = binomial)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.1550   0.2039   0.3453   0.5221   1.3136
Coefficients:
               Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)    7.054885    0.259703   27.165   < 2e-16 ***
AGE           -0.062976    0.003211  -19.612   < 2e-16 ***
SAP1          -0.050393    0.007772   -6.484  8.94e-11 ***
GENDERM       -0.244020    0.359169   -0.679  0.496883
AGE:GENDERM    0.016639    0.004547    3.659  0.000253 ***
SAP1:GENDERM  -0.039157    0.011150   -3.512  0.000445 ***
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 10063.0 on 14799 degrees of freedom
Residual deviance: 8780.8 on 14794 degrees of freedom
(886 observations deleted due to missingness)
AIC: 8792.8
Number of Fisher Scoring iterations: 6
Table 4
Significance for feature children, value 0:
         A       B       total   expected A  expected B
1        31      784     815     25.7        789.
2        94      2060    2154    67.9        2.086E+03
3        539     17548   18087   570.        1.752E+04
total    664     20392   21056
chi-square = 13.2, degrees of freedom = 2, probability = 0.001
Significance for feature children, value 1:
         A       B       total   expected A  expected B
1        57      784     841     43.8        797.
2        176     2060    2236    117.        2.119E+03
3        888     17548   18436   961.        1.748E+04
total    1121    20392   21513
chi-square = 42.0, degrees of freedom = 2, probability = 0.000
Significance for feature children, value Many:
         A       B       total   expected A  expected B
1        79      784     863     80.7        782.
2        214     2060    2274    213.        2.061E+03
3        1811    17548   19359   1.811E+03   1.755E+04
total    2104    20392   22496
chi-square = 0.493E-01, degrees of freedom = 2, probability = 0.976
Table 9
Significance for feature children, value 0 and Male:
         A       B       total   expected A  expected B
1        9       407     416     11.2        405.
2        31      917     948     25.5        923.
3        278     10184   10462   281.        1.018E+04
total    318     11508   11826
chi-square = 1.70, degrees of freedom = 2, probability = 0.427
Significance for feature children, value 0 and Female:
         A       B       total   expected A  expected B
1        22      377     399     15.0        384.
2        63      1143    1206    45.2        1.161E+03
3        261     7364    7625    286.        7.339E+03
total    346     8884    9230
chi-square = 13.0, degrees of freedom = 2, probability = 0.002
Significance for feature children, value 1 and Male:
         A       B       total   expected A  expected B
1        20      407     427     15.0        412.
2        46      917     963     33.8        929.
3        352     10184   10536   369.        1.017E+04
total    418     11508   11926
chi-square = 7.20, degrees of freedom = 2, probability = 0.027
Significance for feature children, value 1 and Female:
         A       B       total   expected A  expected B
1        37      377     414     30.4        384.
2        130     1143    1273    93.3        1.180E+03
3        536     7364    7900    579.        7.321E+03
total    703     8884    9587
chi-square = 20.6, degrees of freedom = 2, probability = 0.000
Significance for feature children, value Many and Male:
         A       B       total   expected A  expected B
1        34      407     441     37.7        403.
2        105     917     1022    87.5        935.
3        938     10184   11122   952.        1.017E+04
total    1077    11508   12585
chi-square = 4.47, degrees of freedom = 2, probability = 0.107
Significance for feature children, value Many and Female:
         A       B       total   expected A  expected B
1        45      377     422     43.7        378.
2        109     1143    1252    130.        1.122E+03
3        873     7364    8237    854.        7.383E+03
total    1027    8884    9911
chi-square = 4.23, degrees of freedom = 2, probability = 0.120
Table 10
[Table 11: contents illegible]
Table 11
Significance for feature living, value Alone:
         A       B       total   expected A  expected B
1        32      784     816     27.9        788.
2        99      2060    2159    73.9        2.085E+03
3        592     17548   18140   621.        1.752E+04
total    723     20392   21115
chi-square = 10.8, degrees of freedom = 2, probability = 0.004
Significance for feature living, value Family:
         A       B       total   expected A  expected B
1        184     784     968     192.        776.
2        453     2060    2513    498.        2.015E+03
3        4402    17548   21950   4.349E+03   1.760E+04
total    5039    20392   25431
chi-square = 6.25, degrees of freedom = 2, probability = 0.044
Significance for feature living, value Institution:
         A       B       total   expected A  expected B
1        61      784     845     43.1        802.
2        250     2060    2310    118.        2.192E+03
3        785     17548   18333   935.        1.740E+04
total    1096    20392   21488
chi-square = 189., degrees of freedom = 2, probability = 0.000
Table 12
Significance for feature living, value Alone and Male:
         A       B       total   expected A  expected B
1        11      407     418     11.9        406.
2        33      917     950     27.0        923.
3        292     10184   10476   297.        1.018E+04
total    336     11508   11844
chi-square = 1.55, degrees of freedom = 2, probability = 0.460
Significance for feature living, value Alone and Female:
         A       B       total   expected A  expected B
1        21      377     398     16.6        381.
2        66      1143    1209    50.5        1.159E+03
3        300     7364    7664    320.        7.344E+03
total    387     8884    9271
chi-square = 7.49, degrees of freedom = 2, probability = 0.024
Significance for feature living, value Family and Male:
         A       B       total   expected A  expected B
1        120     407     527     123.        404.
2        286     917     1203    282.        921.
3        3112    10184   13296   3.113E+03   1.018E+04
total    3518    11508   15026
chi-square = 0.209, degrees of freedom = 2, probability = 0.901
Significance for feature living, value Family and Female:
         A       B       total   expected A  expected B
1        64      377     441     64.5        377.
2        167     1143    1310    191.        1.119E+03
3        1290    7364    8654    1.265E+03   7.389E+03
total    1521    8884    10405
chi-square = 4.25, degrees of freedom = 2, probability = 0.119
Significance for feature living, value Institution and Male:
data: contingency table
         A       B       total
1        26      407     433
2        82      917     999
3        384     10184   10568
total    492     11508   12000
expected: contingency table
         A       B
1        17.8    415.
2        41.0    958.
3        433.
1.013E+04 chi-square = 52.7 degrees of freedom = probability = 0.000 2 Significance for feature living, value Institution and Female: data: contingency table 1 2 3 B A 35 377 412 168 1143 1311 401 7364 7765 604 8884 9488 expected: contingency table 1 2 3 A 26.2 83.5 494. B 386. 1.228E+03 7.271E+03 chi-square 113. degrees of freedom = probability = 0.000 Table 13 2 -va e A *b-Vide dava'as tontiWq 1 74 ft~e 4lO U~qec tam basy 5875 a - i - flffca'm- -- - 1- e-- Nr-5 24 ? j' 76 13 .2~ 19-5-: 290 1 ~ly 3 va 2,24 _41 _77 1" 0 Table 14 Significance for feature married, value Divorced: data: contingency table A 1 2 3 B 28 784 812 69 2060 2129 486 17548 18034 583 20392 20975 expected: contingency table 1 2 3 A 22.6 59.2 501. B 789. 2.070E+03 1.753E+04 chi-square = 3.50 degrees of freedom probability = 0.174 = 2 Significance for feature married, value Married: data: contingency table A B 1 169 784 953 2 376 2060 2436 3 4292 17548 21840 4837 20392 25229 expected: contingency table A B 770. 1 183. 2 467. 1.969E+03 3 4.187E+03 1.765E+04 chi-square = 26.5 degrees of freedom probability = 0.000 = 2 Significance for feature married, value Single: data: contingency table 1 2 3 A B 20 784 804 37 2060 2097 334 17548 17882 391 20392 20783 expected: contingency table 1 2 3 A 15.1 39.5 336. B 789. 2.058E+03 1.755E+04 chi-square = 1.77 degrees of freedom probability = 0.412 = 2 Significance for feature married, value Widow/er: data: contingency table 1 2 3 A B 24 784 808 106 2060 2166 412 17548 17960 542 20392 20934 expected: contingency table 1 2 3 A 20.9 56.1 465. B 787. 2.110E+03 1.749E+04 chi-square = 52.3 degrees of freedom probability = 0.000 Table 15 = 2 Significance for feature married, value Divorced and Male: data: contingency table 1 2 3 B A 10 407 417 18 917 935 178 10184 10362 206 11508 11714 expected: contingency table 1 2 3 A 7.33 16.4 182. B 410. 919. 
1.018E+04 chi-square = 1.24 degrees of freedom = probability = 0.539 2 Significance for feature married, value Divorced and Female: data: contingency table 1 2 3 A B 18 377 395 51 1143 1194 308 7364 7672 377 8884 9261 expected: contingency table B 379. 1.145E+03 7.360E+03 chi-square = 0.424 1 2 3 A 16.1 48.6 312. degrees of freedom = probability = 0.809 2 Significance for feature married, value Married and Male: data: contingency table A B 1 115 407 522 2 251 917 1168 3 2959 10184 13143 3325 11508 14833 expected: contingency table A B 1 117. 405. 906. 2 262. 3 2.946E+03 1.020E+04 chi-square = 0.693 degrees of freedom = probability = 0.707 2 Significance for feature married, value Married and Female: data: contingency table A B 1 54 377 431 2 125 1143 1268 3 1333 7364 8697 1512 8884 10396 expected: contingency table B A 368. 1 62.7 2 184. 1.084E+03 3 1.265E+03 7.432E+03 chi-square = 28.1 degrees of freedom = probability = 0.000 2 Significance for feature married, value Single and Male: data: contingency table 1 2 3 A B 13 407 420 29 917 946 229 10184 10413 271 11508 11779 expected: contingency table 1 2 3 A 9.66 21.8 240. B 410. 924. 1.017E+04 chi-square = 4.12 degrees of freedom = 2 probability = 0.128 Significance for feature married, value Single and Female: data: contingency table 1 2 3 B A 7 377 384 8 1143 1151 105 7364 7469 120 8884 9004 expected: contingency table 1 2 3 A 5.12 15.3 99.5 B 379. 1.136E+03 7.369E+03 chi-square = 4.56 degrees of freedom = probability = 0.102 2 Significance for feature married, value Widower and Male: data: contingency table 1 2 3 B A 7 407 414 30 917 947 144 10184 10328 181 11508 11689 expected: contingency table 1 2 3 A 6.41 14.7 160. B 408. 932. 1.017E+04 chi-square = 18.0 degrees of freedom = probability = 0.000 2 Significance for feature married, value Widow and Female: data: contingency table 1 2 A 17 76 B 377 1143 394 1219 3 268 7364 7632 361 8884 9245 expected: contingency table A B 1 15.4 379. 2 47.6 1.171E+03 3 298. 
7.334E+03 chi-square = 21.0 degrees of freedom = 2 probability = 0.000 Table 16 t "P 44 V% 47 table 84 r".: . eip Iag R" 45" gme ....- - . -o 585 -. r-OnN - - - . 7 1 - t 686 c - 4 -1a & 587 - - Ka f0jr -fagitu'pa,,tnartied vialytM d female. jond t 0It 4,, IV 141 .. ........... 3 77. -Al 138. -2 ini" AKY A '44 19 7 table, ting iWi 4a ,N NO On d,YO 60 *W14jaft feL ftjW c table A _196 L22 37' 6715 '6752 41 7229 7270 88 4p _X r ut ta ,j, ,71 .4, I F,441-0 V" 4*"' -777, 7 -77 ,01t+p32.1 0;1 4 ®r- !'a fit 0 v Al r4' Tn, A93 w cq#j#qAgnq tAbli6. 165, C3546+03 7 25. _bf keed m _jp 89 =bl 0727 Table 17 Significance for feature working, value Blue Collar: data: contingency table 1 2 3 A B 13 784 797 36 2060 2096 376 17548 17924 425 20392 20817 expected: contingency table 1 2 3 A 16.3 42.8 366. B 781. 2.053E+03 1.756E+04 chi-square = 2.05 degrees of freedom probability = 0.358 = 2 Significance for feature working, value Disabled: data: contingency table 1 2 3 A B 16 784 800 33 2060 2093 312 17548 17860 361 20392 20753 expected: contingency table 1 2 3 A 13.9 36.4 311. B 786. 2.057E+03 1.755E+04 chi-square = 0.648 degrees of freedom probability = 0.723 = 2 Significance for feature working, value Retired: data: contingency table A B 1 47 784 831 2 174 2060 2234 3 1281 17548 18829 1502 20392 21894 expected: contingency table A B 1 57.0 774. 2 153. 2.081E+03 3 1.292E+03 1.754E+04 chi-square = 5.00 degrees of freedom probability = 0.082 = 2 Significance for feature working, value Unemployed: data: contingency table 1 2 3 A B 12 784 796 17 2060 2077 282 17548 17830 311 20392 20703 expected: contingency table 1 2 3 A 12.0 31.2 268. B 784. 2.046E+03 1.756E+04 chi-square = 7.32 degrees of freedom probability = 0.026 = 2 Significance for feature working, value Volunteer: data: contingency table 1 2 3 A B 7 784 791 19 2060 2079 226 17548 17774 252 20392 20644 expected: contingency table 1 2 3 A 9.66 25.4 217. B 781. 
2.054E+03 1.756E+04 chi-square = 2.74 degrees of freedom probability = 0.254 = 2 Significance for feature working, value Working: data: contingency table 1 2 3 B A 26 784 810 23 2060 2083 55 17548 17603 104 20392 20496 expected: contingency table 1 2 3 A 4.11 10.6 89.3 B 806. 2.072E+03 1.751E+04 chi-square = 145. degrees of freedom probability = 0.000 = 2 Significance for feature working, value White Collar: data: contingency table 1 2 3 A B 23 784 807 62 2060 2122 402 17548 17950 487 20392 20879 expected: contingency table 1 2 3 A 18.8 49.5 419. B 788. 2.073E+03 1.753E+04 chi-square = 4.86 degrees of freedom = 2 probability = 0.088 Table 18 Significance for feature working, value Blue Collar and Male: data: contingency table B A 1 7 407 414 2 23 917 940 3 260 10184 10444 290 11508 11798 expected: contingency table B 404. 917. 1.019E+04 chi-square = 1.06 1 2 3 A 10.2 23.1 257. degrees of freedom = probability = 0.589 2 Significance for feature working, value Blue Collar and Female: data: contingency table 1 2 3 B A 6 377 383 13 1143 1156 116 7364 7480 135 8884 9019 expected: contingency table 1 2 3 A 5.73 17.3 112. chi-square = B 377. 1.139E+03 7.368E+03 1.25 degrees of freedom 2 probability = 0.536 Significance for feature working, value Disabled and Male: data: contingency table 1 2 3 A B 9 407 416 13 917 930 180 10184 10364 202 11508 11710 expected: contingency table A 1 7.18 2 16.0 3 179. B 409. 914. 1.019E+04 chi-square = 1.07 degrees of freedom = probability = 0.586 2 Significance for feature working, value Disabled and Female: data: contingency table 1 2 3 A B 7 377 384 20 1143 1163 132 7364 7496 159 8884 9043 expected: contingency table 1 2 3 A 6.75 20.4 132. B 377. 1.143E+03 7.364E+03 chi-square = 0.196E-01 degrees of freedom = 2 probability = 0.990 Significance for feature working, value Retired and Male: data: contingency table A B 1 36 407 443 2 122 917 1039 3 971 10184 11155 1129 11508 12637 expected: contingency table A 1 39.6 B 403. 946. 
2 92.8 1.016E+04 3 997. chi-square = 11.1 degrees of freedom = 2 probability = 0.004 Significance for feature working, value Retired and Female: data: contingency table 1 2 3 B A 11 377 388 52 1143 1195 310 7364 7674 373 8884 9257 expected: contingency table 1 2 3 A 15.6 48.2 309. B 372. 1.147E+03 7.365E+03 chi-square = 1.75 degrees of freedom = probability = 0.416 2 Significance for feature working, value Unemployed and Male: data: contingency table 1 2 3 A B 8 407 415 10 917 927 158 10184 10342 176 11508 11684 expected: contingency table 1 2 3 A 6.25 14.0 156. B 409. 913. 1.019E+04 chi-square = 1.67 degrees of freedom = probability = 0.434 2 Significance for feature working, value Unemployed and Female: data: contingency table 1 2 3 A B 4 377 381 7 1143 1150 124 7364 7488 135 8884 9019 expected: contingency table 1 2 3 A 5.70 17.2 112. B 375. 1.133E+03 7.376E+03 chi-square = 7.95 degrees of freedom = probability = 0.019 2 Significance for feature working, value Volunteer and Male: data: contingency table 1 2 3 A B 4 407 411 10 917 927 164 10184 10348 178 11508 11686 expected: contingency table 1 2 3 A 6.26 14.1 158. B 405. 913. 1.019E+04 chi-square = 2.31 degrees of freedom = probability = 0.315 2 Significance for feature working, value Volunteer and Female: data: contingency table 1 2 3 B A 3 377 380 9 1143 1152 62 7364 7426 74 8884 8958 expected: contingency table 1 2 3 A 3.14 9.52 61.3 B 377. 1.142E+03 7.365E+03 chi-square = 0.415E-01 degrees of freedom = 2 probability = 0.979 Significance for feature working, value Working and Male: data: contingency table 1 2 3 A B 18 407 425 6 917 923 38 10184 10222 62 11508 11570 expected: contingency table 1 2 3 A 2.28 4.95 54.8 B 423. 918. 1.017E+04 chi-square = 115. degrees of freedom = probability = 0.000 2 Significance for feature working, value Working and Female: data: contingency table 1 2 3 A B 8 377 385 17 1143 1160 17 7364 7381 42 8884 8926 expected: contingency table 1 2 3 A 1.81 5.46 34.7 chi-square = B 383. 
1.155E+03 7.346E+03 54.9 degrees of freedom = probability = 0.000 2 Significance for feature working, value White Collar and Male: data: contingency table 1 2 3 A B 6 407 413 23 917 940 179 10184 10363 208 11508 11716 expected: contingency table 1 2 3 A 7.33 16.7 184. B 406. 923. 1.018E+04 chi-square = 2.81 degrees of freedom = 2 probability = 0.245 Significance for feature working, value White Collar and Female: data: contingency table 1 2 3 A B 17 377 394 39 1143 1182 223 7364 7587 279 8884 9163 expected: contingency table 1 2 3 A 12.0 36.0 231. B 382. 1.146E+03 7.356E+03 chi-square = 2.70 degrees of freedom = probability = 0.259 Table 19 2 & MAP Ok r M & AMMM G'4 -7 w a% I 6 "tv WiMMEME"e REKENP~iliBENN~limi10 M~t t! MaianaWW~eMMM45GaWRE MEN%9IIENE9M3E 99Ar I I777 M IM k 100 MM11 r&%#% M 0"MM -o T,%Ne ~ &#RNE ~ ~ TWMEWN5Et %%M MMEMMMNMMMMMl a 9 EMMlM& mmWE.. ... ....... M305%EEET;M 4 Z-- f@ a~ RMM%&&&~ei~n@1 WMWR 4o -9MMW# ?9M44MMMiGM M tee~ EbR5siERE l iliE1EE-E yv t7i~liiiR 1594 R7- ". et !f101 102 - -ti- 447 - 110 or --- 104 105 106 107 108 109 Table 20 110
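The chi-square values reported in Tables 9 through 19 can be checked against their observed counts. The sketch below is a minimal illustration, assuming the tests are standard Pearson chi-square tests of independence on the 3 x 2 tables (rows 1 to 3, columns A and B); the appendix does not name the software that produced them. The function names `chi_square_test` and `p_value_two_dof` are hypothetical, and the closed-form probability holds only for exactly 2 degrees of freedom.

```python
import math


def chi_square_test(observed):
    """Pearson chi-square test of independence for an r x c table of counts.

    Returns the expected counts, the chi-square statistic, and the
    degrees of freedom. (Hypothetical helper, not from the thesis.)
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For 2 degrees of freedom the survival function is exactly exp(-x / 2).
    """
    return math.exp(-statistic / 2.0)


# Observed counts for "feature children, value 0" (Table 9); columns are A and B.
children_0 = [[31, 784], [94, 2060], [539, 17548]]

expected, statistic, dof = chi_square_test(children_0)
p = p_value_two_dof(statistic)

print(f"chi-square = {statistic:.1f}")
print(f"degrees of freedom = {dof}")
print(f"probability = {p:.3f}")
```

For this table the sketch gives a statistic of about 13.2 with 2 degrees of freedom and a probability of about 0.001, in line with the first entry of Table 9. Any other table in the appendix can be checked by substituting its observed A and B counts.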