Developing an Early Warning System for Congestive Heart Failure Using a Bayesian Reasoning Network

by Joseph C. C. Su

Submitted to the Department of Mechanical Engineering on August 23, 2001, in partial fulfillment of the requirements for the degree of Master of Science in Mechanical Engineering at the Massachusetts Institute of Technology, September 2001

© Massachusetts Institute of Technology 2001. All rights reserved.

Author: Department of Mechanical Engineering, August 23, 2001
Certified by: Kent Larson, Principal Research Scientist, MIT Department of Architecture, Thesis Supervisor
Read by: Stephen Intille, Research Scientist, MIT Department of Architecture, Thesis Reader
Read by: Sanjay E. Sarma, Assistant Professor of Mechanical Engineering, Thesis Reader
Accepted by: Ain A. Sonin, Chairman, Departmental Committee on Graduate Students

Abstract

We propose a framework for the development of a home-based early warning system for congestive heart failure (CHF). The system contains a diagnostic Bayesian reasoning network that uses probabilistic reasoning and evidence to arrive at a judgement. The network combines simulated biometric data (daily weight and blood pressure readings) with the actual position of the user to dynamically select context-specific health questions. These questions are presented to the user via a wireless personal digital assistant (PDA).
Answers to questions and biometric data are used by a Bayesian network to dynamically calculate a probability that the user is at risk for CHF. We argue that current biometric sensing technology alone is inadequate to accurately establish a CHF risk factor; a Bayesian network that incorporates both biometric information and answers to context-specific questions may be a more accurate predictor.

Thesis Supervisor: Kent Larson
Title: Principal Research Scientist, MIT Department of Architecture

Contents

1 Introduction
  1.1 Congestive Heart Failure: An Example and Overview
    1.1.1 Congestive Heart Failure: Compliance and Prevention Issues
  1.2 Prevention-Based Home Healthcare System
    1.2.1 Data Collection
  1.3 Thesis Outline
2 Related Prior Work
    2.0.1 Disease Management Programs
    2.0.2 Medical Diagnostic Systems
3 System Overview
  3.1 Overview of the System Processes
  3.2 Scenario
4 Bayesian Reasoning Network for Congestive Heart Failure
  4.1 Bayesian Reasoning Network: An Overview
  4.2 CHF Network Overview
  4.3 Design Principles for CHF Network
5 Question Querying Mechanism
  5.1 Topology of a Querying Process
    5.1.1 Querying Heuristics
    5.1.2 Sensitivity of CHF to Observations
    5.1.3 Sensitivity of CHF to Setting Observations on Evidence Variables
  5.2 Question Sorting Mechanism
    5.2.1 Question Attributes
    5.2.2 Question Display Mechanism and Cycles
6 An Early Warning System: A Graphical Demonstration
  6.1 Description of the System
    6.1.1 The Graphical Interface
7 Discussion
8 Suggestions for Future Work
  8.1 Feedback Control
  8.2 Factor Analysis
A Probability Theory
  A.1 Probability Distribution
    A.1.1 Bayesian Inference
  A.2 Bayesian Network: Attributes
B Construction of a Bayesian Network
  B.1 An Example: Conditional Probability Table
  B.2 Network Topology
C Database
  C.1 Output: Database of Questions
  C.2 Input: Database of Contextual and Biometric Data
D Software Implementation of a Bayesian Network
  D.1 Tools and Methods
  D.2 CHF Network Representation in Java
E CHF Diagram
F Glossary of Medical Terms
  F.1 Abbreviations
  F.2 Glossary
Bibliography

List of Figures

3-1 Differences and Similarities Between an Expert Diagnostic and Early Warning Systems in the CHF Disease Domain
3-2 A Control Paradigm for an Early Warning System
4-1 Bayesian reasoning network for CHF
4-2 Causation Diagram for a Diagnostic Network
4-3 Causation Diagram for CHF
5-1 Flow Diagram of a Querying Process
5-2 Setting Observation in a CHF Network
6-1 Question Displayed on a Palm Pilot Vx
6-2 Display of Both Contextual and Biometric Data
6-3 Dynamic Highlighting of Q-Orthopnea Question Variable in a CHF Network
6-4 Display of Both Contextual and Biometric Data as a Slider Bar Moves
8-1 A Feedback Closed-Loop Design for an EWS
B-1 Two-Node Bayesian Network
C-1 Question Format
D-1 Bayesian Reasoning Network for CHF Using JavaBayes Package
E-1 Dataflow Diagram of Critical Variables for CHF

List of Tables

B.1 Discrete probabilities for CHF-HTN Variables
D.1 Conditional Probability Table for CAD-Angina Link

Chapter 1 Introduction

The United States faces a considerable challenge in providing healthcare for its people in the coming years. A stream of expensive medical innovations and procedures has exacerbated the dilemma of providing high-quality care at a reasonable cost. While innovative diagnostic tools and treatments have improved healthcare by providing less invasive procedures and promising more effective outcomes, healthcare spending due to the number of patients using the technology is on the rise [49].
Experts predict that by 2025, 5.3% of the gross domestic product will be spent on Medicare, compared with 2.7% in 1998 [47, 46]. Per capita spending on healthcare reached $3,000 in 2000, making the United States the number one nation in healthcare spending [45]. In nursing home and home care costs alone, the U.S. government spent $115 billion in 1997 [48]. This figure will continue to rise, due largely to hospitalization of the elderly in the final decade of life.

Vigorous efforts exist to reduce the escalating costs of care. Cutbacks in both the number of patient treatments and the length of clinic visits allowed by managed care organizations (HMOs) result in patients being discharged prematurely in the treatment cycle. Home healthcare providers (HHPs) are attempting to provide more affordable home care services [1]. In 1995 alone, more than 17,500 HHPs delivered services to seven million patients. Providing medical education to patients and instructing them on how to self-administer certain medical procedures is common practice among HHPs and HMOs. Many cost-saving mechanisms have also been proposed, such as liberalization of Medicare's treatment benefits to include preventative home care [1].

Preventative home healthcare reflects a wider trend in healthcare today. Various national health programs have proposed a slew of clinical treatment plans for individuals with various diseases, to counteract cost and enhance quality of life [23, 2]. These plans include encouraging patients to take control of their health by exercising more rigorously, eating a healthier diet, carefully self-monitoring changes in their physiological states, and enrolling in commercial disease-management programs. One program with 29 patients reduced the average hospital readmission rate from 1.5% to 0.13% per year, and emergency room visits from 17% to 3% per year [56].
In another program with 238 patients, there was an observed 69% decrease in hospital visits, saving each patient as much as $8,000 in healthcare costs a year [52].

Another trend in healthcare is a shift of patient care and responsibility out of the hospital and into the home. A further trend is increasing reliance on computerized equipment and on the application of technological advances and medical innovations for both diagnosis and treatment [44, 49]. More and more prevention-based disease-management programs are coming into play as healthcare expenditure rises. Some providers employ live operators and automated mechanisms to talk to patients and collect health information from them via telephone-linked care (TLC) programs. Other providers employ disease-management programs that use biometric devices to acquire patients' vital signs non-invasively and monitor them telemetrically [51, 52, 50].

Many cardiac patients are having their vital signs monitored from home as disease-management programs become increasingly available [51, 52]. These programs try to optimize outcomes through supportive technology, such as home-care automation, that links patients to providers. Studies have indicated the efficacy of home monitoring for heart-related diseases in preventing crises [39, 38]. With automated health monitoring in the home, a home owner's health status can be continuously assessed and incrementally stored, so that healthcare professionals can keep a constant watch on the patient's health and make an opportune clinical intervention before a crisis occurs.

Preventative home healthcare is an important solution to reducing healthcare costs, as it promotes early detection of diseases through the patient's self-awareness and through improved home care technology that provides more continuous monitoring of people's health.
However, effectively increasing a patient's self-awareness of his or her health status is not easy. Studies have shown that more than half of all Americans with chronic disease do not follow their physician's medication and lifestyle guidance. In one study, 30.6% of participants did not adhere to their medication schedules [66]. In another report, it was noted that nine out of ten make mistakes taking their medication, and two-thirds fail to take any or all of their prescriptions [67]. Studies have also shown that moderate daily exercise could result in significant improvement in overall health [63, 64], yet two-thirds of people over 65 do not exercise regularly [65].

A countervailing trend in the near future is that, as life expectancy for the general population increases due to better treatments and preventative medicine, we may still see more costly medical episodes develop over the course of a longer life span for each individual. We propose a home-based early warning system that aims to prevent or defer the occurrence of costly medical episodes by allowing users to be on active and constant alert about their health status.

In the following section, we describe a medical condition known as congestive heart failure (CHF). It has become one of the leading medical conditions that, if detected sooner, could result in tremendous savings in the national budget. In Sec. 1.1.1, we describe CHF-related compliance and prevention issues and their impact on prognosis and quality of life.

1.1 Congestive Heart Failure: An Example and Overview

CHF is a serious end stage of cardiovascular disease, in which the heart is unable to pump an adequate supply of blood to meet the oxygen requirements of the body's organs and tissues. In 2000 alone, an estimated $21 billion was spent either directly or indirectly on treatments for congestive heart failure [23].
CHF is increasing in prevalence, resulting in more hospitalizations and deaths and making it a major chronic condition in the United States [25]. A study issued in 2001 reported that an estimated 4,700,000 Americans had been diagnosed with congestive heart failure [23].

CHF causes fluid retention. Fluid accumulates in the heart and other parts of the body such as the lungs and legs. Causes of CHF include cardiovascular problems such as, but not limited to, coronary artery disease (CAD) and myocardial infarction, hypertension, cardiomyopathy, heart arrhythmia, congenital heart defects, and heart valve abnormalities. Symptoms vary, and sometimes overlap, depending on which side of the heart is failing. They include dyspnea (shortness of breath), orthopnea (difficulty breathing when lying down), edema (swelling of joints, abdomen, liver, spleen, and lungs), weight gain, fatigue or weakness, loss of appetite, and nocturia (increased urination at night) [44].

1.1.1 Congestive Heart Failure: Compliance and Prevention Issues

It has been observed that compliance with a CHF treatment plan and careful monitoring at each hospital visit will improve a patient's prognosis and quality of life [36, 43]. The interval between office visits for 90% of CHF patients is 2 to 4 months [42]. During this period, there may be occasional cardiac symptoms such as breathlessness, ankle swelling, or even a small body weight gain in the patient. These symptoms may indicate that the patient's condition is progressively deteriorating and that re-hospitalization is required. Rapid deterioration occurs in the case of acute heart failure. Subtle physical indicators, if any, must not be neglected by the patient. Ignoring them may lead to a rapid deterioration that requires immediate medical attention. It was reported that, in geriatric patients with CHF, the hospital re-admission rate after 3 months was 30% [55].
By increasing the patient's self-awareness in detecting early symptoms of CHF, and by increasing the frequency with which the patient's biometric data are monitored, the rate of re-hospitalization might be reduced. This is because:

1. Subtle symptoms might be caught at an early stage of deterioration.
2. Biometric readings taken at a much shorter and more frequent interval than 2-4 months might indicate gradual changes in the patient's health. Emerging downward trends might be detected sooner.

Studies have demonstrated a clinical improvement in heart-failure patients participating in a comprehensive home-based heart-failure management program [23, 54].

1.2 Prevention-Based Home Healthcare System

Home healthcare is becoming more mainstream due to its cost-effective, proactive, and preventative nature, which replaces the current reactive, episodic, and crisis-driven care delivery model that takes place in a clinical setting. In light of the current healthcare trends in CHF, this work proposes a framework for the development of a home-based early warning system for the prevention of CHF. The system might:

1. Improve patients' awareness of their health
2. Eliminate gaps in care by providing continuous health monitoring during daily living
3. Supplement existing biometric sensing technology by collecting new health data

There are two types of data that can be collected from a home occupant: numerical values from biometric readings, and contextual information about the setting in which these readings are taken. Context could involve the physical or emotional state of the individual: where the user was at what time, how the user felt, and whether the user has been exhibiting any symptoms in addition to the biometric information collected. In this work, the system actively queries the user to tag biometric readings with contextual information about the user's situation. This information is used by the system to dynamically establish a medical risk factor for the individual, allowing for early detection of CHF.
If fully developed, this early warning system might become a vital component of proactive and cost-effective home medical care.

1.2.1 Data Collection

Continuous and Periodic Biometric Readings from the User. The home is a good place for collecting a series of biometric readings for the prevention of CHF: weight, blood pressure, oxygen saturation rate, pulse and heart rate, and glucose level. These numerical readings indicate certain physical conditions of the individual, but the readings alone may not confirm that an individual is experiencing CHF-related physical symptoms, such as exertional dyspnea (difficulty breathing upon exertion), orthopnea (difficulty breathing when lying down), lower leg edema (swelling), or angina pectoris (chest pain).

Continuous Contextual Information (when, where, and what) from the User. The home environment is a place where we can find out a lot about its occupants, including the context surrounding their activities of daily living, e.g., eating habits and gait patterns. This contextual information contains essential yet often neglected subtleties from which we may infer gradual health changes in an individual, catching latent health maladies before they become pronounced. If subtle qualitative changes in symptoms can be detected early on, we may prevent or defer the onset of rapid deterioration. If the patient's condition progresses to the level of a serious ailment, this contextual history can provide the physician with a much richer insight into the patient's health condition.

1.3 Thesis Outline

This thesis describes the framework for a demonstration system: an adaptive home-based early warning system for CHF. The approach uses data collected from the home occupant in the context of daily living. Chapter 2 discusses related work in the area of preventative medicine.
This work includes disease-management programs involving telephonic health monitoring in a home setting, and diagnostic system development in a clinical setting. Chapter 3 gives an overview of the CHF system framework. Chapter 4 describes the steps taken to construct a Bayesian reasoning network for an early warning system. Chapter 5 details the system's querying mechanism, in which health questions are generated by the system and sent to the user, and responses to the questions are received from the user via a personal digital assistant (PDA). Chapter 6 puts details from both Chap. 4 and Chap. 5 in perspective, describing the system's decision-making process and illustrating graphically how both contextual and biometric information are used in reaching a diagnosis. Chapter 8 makes recommendations for future modifications and improvements of the system.

Chapter 2 Related Prior Work

A number of medical innovations have emerged that take advantage of telemetric and wireless communication to monitor the health or activity levels of an individual [50, 51, 59, 60, 62]. Most of these disease management programs use telemetric devices to measure the patient's bio-data and transmit the information back to a healthcare provider over the phone. Some employ live operators or an automatic transmission mechanism to phone patients and solicit health information directly from them [57, 38]. Others provide portable electronic systems with messaging capabilities, sending medication reminders and health questions to the patient [52, 58]. However, few of these commercial innovations actively seek to apply artificial intelligence (AI) reasoning to medical diagnosis.

A comprehensive disease-management program or treatment plan might include patient education, patient self-assessment, collection of patient data and access to the data, and methods for measuring treatment compliance and for making the data available.
For patient data acquisition, numerous commercial biometric sensors (such as blood pressure meters and weight scales) exist for home-based health monitoring. These systems, however, do not establish the context in which the data is acquired, such as where the user was, how the user felt, or whether the user had been exhibiting any symptoms in addition to the biometric variables collected. In a clinical setting, a doctor can ask the patient many questions to gather information about context. In the absence of a doctor at home, a homecare system that can ask questions and gather an individual's responses to them can provide context for, and supplement, biometric data. The following sections describe some of the related prior work in the area of preventative medicine.

2.0.1 Disease Management Programs

Disease management programs attempt to identify patients in need of treatment, intervene with specific programs of care, and measure outcomes. These programs focus resources on high-risk or common disorders, and have the potential to improve treatment and reduce clinical costs for patients with asthma, depression, diabetes, and congestive heart failure [71].

Telephonic Monitoring Systems

Researchers have been conducting pioneering research in telephonic health monitoring over the years [40, 41, 57]. In a telephonic protocol, nurses or medical professionals make frequent calls to patients, soliciting their health status and recording the information electronically. Computerized telephonic systems have been used to monitor the health of large numbers of patients [38]. Health organizations also have telephone-linked care (TLC) programs to monitor the health of their patients [68, 69].

Telemetric Monitoring Systems with Biometric Devices

Telemetric health monitoring systems have been in use since the late 1980s [57]. Recently, these systems have been automated to collect bio-data from the user at home [41, 40].
Examples are automatic monitoring of blood pressure in patients' homes with weekly reports to doctors [59, 51, 62], automatic tracking of the user's activities and even sleep patterns [61], and various systems that come equipped with biometric devices for CHF-related or other measurements [59, 51]. Still others go a step further and enable interaction between a portable telemetric display and the patient. One example is Health Hero Network's Health Buddy, which sends patients reminders and provides them with feedback on their progress and tips for managing their disease more effectively [52]. Another example is InforMedix's Med eMonitor, which comes with a medication dispensing mechanism and an electronic messaging system similar to that of Health Buddy [58]. In studies conducted with the Catholic Healthcare West CHF Program and MDS Pharma Services, the results indicated that Health Buddy was a cost-saving tool that had perceived value to patients and providers [53].

2.0.2 Medical Diagnostic Systems

An expert system represents a knowledge base of information and searches for patterns in it, modeling how a human expert analyzes a particular situation by applying rules to the facts or comparing the current case with similar cases. Expert systems use different types of reasoning methods, such as fuzzy logic, neural networks, and Bayesian networks. The most common expert system is rule-based, containing a knowledge base and an inference engine (i.e., a routing mechanism) that analyzes fact patterns and matches the applicable rules. Fact patterns are analyzed until either the goal succeeds or all of the rules are processed and the goal fails.

Medical diagnosis is an application area that uses reasoning in artificial intelligence (AI) [7, 9]. A medical diagnostic model is generated by acquiring evidence, such as symptoms and signs, and determining a set of faults or causes associated with the evidence.
Diagnosis is the determination of the cause of a pathological state [10]. Whenever new information is obtained, the system generates hypotheses about the patient's current condition given the model.

Many medical expert systems, or diagnostic programs, employ reasoning to make prognoses and diagnoses of medical disorders, and to identify an appropriate course of treatment for the patient. One such method is rule-based reasoning, with knowledge catalogued in the form of IF and THEN rules used in chains of deduction to reach a conclusion [72]. However, rule-based programs suffer a serious drawback: they do not embody a model of clinical reasoning or disease, which leads to unfavorable interactions between rules and thus to serious degradation of program performance [72, 73, 74]. Another method is fuzzy logic, where truth values become real values in the closed interval [0, 1]. The rules are designed to return vague values like "closer" or "very tall". This approach is used only when a system is difficult to model exactly and an inexact model is available, or when ambiguity or vagueness is common.

Numerous medical diagnostic programs have employed an AI approach called Bayesian reasoning [17, 11, 12, 19, 18, 13, 20]. Two examples are the Pathfinder system used for lymph-node diseases [34], and Long's Heart Failure Program, which simulates in great detail the relationships among certain physiological, etiological, gravity, and pathophysiological states affected by both disease process and therapy [26]. Long's program employs so-called pseudo-Bayesian reasoning, which incorporates the severity of disease states and the temporal relations of causality to determine the mechanisms that produce the evidence, as well as to determine the primary disease causes [10]. As disease domains increase in complexity, more detailed data, such as the level of disease manifestations, severity of disease, and types of complications, are needed in these programs.
The Bayesian reasoning approach has become increasingly popular because of its adaptability and the powerful learning component of the Bayesian network, which uses techniques of probability theory to reason under conditions of uncertainty. Unlike other reasoning approaches, Bayesian networks can explain their reasoning and incorporate probabilistic data from published literature, and they are useful for representing uncertain relationships where statistical data and prior knowledge are available. Furthermore, both probabilistic dependencies and constraints are made explicit in the directed acyclic graph (DAG) of a Bayesian network. Therefore, the Bayesian reasoning approach is the most solid and flexible option for the system we develop.

However, one challenge associated with using Bayesian reasoning is to create a network that accurately describes the inter-dependencies of causes and effects in the domain of interest. Much effort and research are needed to build the network, by experts who judge and come up with the probabilities used in it. A Bayesian network used in any medical expert system can be optimized, but the process of modifying and optimizing a network involves a lot of expertise, trial and error, and time. It took a consortium of engineers, scientists, and physicians a total of 44 weeks to build the Bayesian network used in the Pathfinder system [34].

One of the research questions for this thesis is how to create a Bayesian network that is reduced in complexity and optimized for the CHF disease domain. Another question is how to tag biometric data with contextual reliability through the use of the network, so that the system can compute a dynamic risk factor for an individual living at home. These questions can have a big impact on the development of a preventative home system, which will be described in the following chapter.
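As a minimal sketch of the probabilistic reasoning discussed above, the following Python snippet applies Bayes' rule to a two-node network (disease -> symptom) and then folds in a second piece of evidence. The function name `posterior` and all of the numbers are illustrative assumptions, not values taken from the CHF network developed in this thesis, and the second update assumes the two symptoms are conditionally independent given the disease state.

```python
def posterior(prior, p_evidence_given_disease, p_evidence_given_healthy):
    """Return P(disease | evidence) via Bayes' rule for a two-node network."""
    joint_disease = prior * p_evidence_given_disease
    joint_healthy = (1.0 - prior) * p_evidence_given_healthy
    return joint_disease / (joint_disease + joint_healthy)


# Hypothetical numbers chosen only for illustration.
prior_chf = 0.01             # assumed prior probability of the disease
p_symptom_if_chf = 0.60      # assumed P(symptom | disease)
p_symptom_if_healthy = 0.05  # assumed P(symptom | no disease)

# Belief after observing one symptom.
risk_after_one = posterior(prior_chf, p_symptom_if_chf, p_symptom_if_healthy)

# Belief after a second, conditionally independent symptom: the previous
# posterior is reused as the new prior (same assumed likelihoods).
risk_after_two = posterior(risk_after_one, p_symptom_if_chf, p_symptom_if_healthy)

print(f"after one symptom:  {risk_after_one:.3f}")
print(f"after two symptoms: {risk_after_two:.3f}")
```

Each additional piece of evidence moves the belief further from the prior, which is the behavior an early warning system relies on when it accumulates answers over time. A full Bayesian network generalizes this to many interdependent variables.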
Chapter 3 System Overview

We test a first step towards developing a knowledge-based early warning system for CHF, making use of Bayesian reasoning. The system makes use of the following components:

1. A Bayesian reasoning network: The system uses a Bayesian network for diagnosis. Diagnosis in an early warning system establishes a risk factor, predicting how likely a person is to develop CHF.

2. There are three types of input for the system:

Simulated biometric data and symptoms relevant to CHF. A simulated medical history chart was made that includes daily biometric readings and symptoms indicative of a person developing CHF. The history on the chart spans a period of 3 weeks, starting with a first week in which CHF symptoms are almost non-existent or mild and slowly progressive, and ending with a last week in which CHF starts to develop acutely.

Contextual information (when, where, what) from the user. As mentioned previously, context captures where the user was, what the user was doing, and the time at which the action took place. In this work, contextual information about users is obtained from their responses to health questions that the system generates. A health question is a medically informative question with the following characteristics:

* Presented in a non-intrusive way
* Calming, not fear-inducing to the user
* Non-annoying to the user

An example question from the current system is: "Do you have difficulty breathing even after the window is opened?". An answer of "yes" signifies that the user may be experiencing dyspnea, or breathing difficulty, even after the window is opened and more fresh air has come in. As more questions are asked and more responses gathered, the system gradually becomes more aware of the user's health status.

Context and location/time information. The system connects to a set of tracking sensors, which detect the user's whereabouts in a room at any given time.
The location information is used to select location-specific health questions, depending on the user's whereabouts and the known positions of objects (e.g., a desk) in the room. The time information is used to select the most appropriate, i.e., temporally relevant, location-specific questions to display on a Palm Pilot Vx.

User responses to health questions. Based on the user's biometric and context history, the system dynamically generates an appropriate set of health questions categorized by type of location. The location information, provided by in-house sensors, allows for the generation of the most appropriate location-specific health question at any given time. Questions are sent to the user via a personal digital assistant (PDA).

3. System output, the probability of a CHF risk factor: Using the history of the user's biometric readings and responses to health questions, the system dynamically computes a CHF risk factor for the user. A risk factor is the probability (ranging from 0 to 100%) of a person developing CHF. Based on statistics, i.e., the number of CHF patients versus the total population in the U.S. in 2001, the CHF risk factor for an average American is roughly 0.84% (see footnote 1). For example, when a noticeable medical condition emerges that warrants immediate medical attention (such as when the user is, say, 25 times more likely than a normal person to have developed a serious disease), the system might notify the home occupant of the risk and encourage the occupant to see a doctor, or notify the family or a doctor directly if appropriate.

Footnote 1: This is the actual number calculated in Appendix B, based on the number of CHF patients and the current U.S. population. In this paper, we have developed a Bayesian inference model that uses 0.96% as a starting point due to the unavailability of some related statistical values, e.g., risk factors for orthopnea, edema, etc.

Fig. 3-1 compares an expert diagnostic system and an early warning system.
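The alerting example in item 3 (a user who is 25 times more likely than average to have developed the disease) can be sketched as a simple comparison of the computed risk against a multiple of the baseline. The function name `should_alert` and the use of a fixed multiple as the alerting rule are assumptions made for this illustration only; the text gives the factor of 25 as an example and does not prescribe a specific rule.

```python
BASELINE_RISK = 0.0084  # population baseline quoted in the text (0.84%)
ALERT_MULTIPLE = 25     # the "25 times more likely" example from the text


def should_alert(risk, baseline=BASELINE_RISK, multiple=ALERT_MULTIPLE):
    """Return True when the computed risk is at least `multiple` times the baseline."""
    return risk >= baseline * multiple


# With these numbers the alert threshold is 0.0084 * 25 = 0.21, i.e. a 21% risk.
threshold = BASELINE_RISK * ALERT_MULTIPLE
print(f"alert threshold = {threshold:.2%}")
print(should_alert(0.05), should_alert(0.30))
```

Expressing the criterion relative to the baseline, rather than as an absolute probability, keeps the rule meaningful if a different starting prior (such as the 0.96% used by the model) is substituted.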
In the figure, P(X) signifies the probability of X given known evidence.

[Figure 3-1: Differences and Similarities Between an Expert Diagnostic and Early Warning Systems in the CHF Disease Domain. Medical diagnostic system: P(severity, etiology, types, manifestations of CHF); occasional and more intensive input; clinical setting with doctors; lab tests, results, signs, symptoms, etc. Early warning system: % at risk, P(CHF); continuous input over time; home setting by yourself; biometric data, environmental context.]

An early warning system is essentially an expert diagnostic system, arriving at a clinical conclusion based on accumulating evidence about an individual. An expert diagnostic system is used in a clinical setting, incorporating a sparse set of biometric data and clinical observations such as symptoms from the patient. In a clinical environment, a physician gains insight into the patient's ailment by taking measurements and asking the patient a few health questions. Observable signs and symptoms are further confirmed and validated by the patient's responses to those questions. In clinical terms, symptoms are any abnormal changes in appearance, sensation, or function experienced by a patient that indicate a disease process [44]. Signs, on the other hand, are abnormalities that indicate a disease process, such as a change in appearance, sensation, or function, that are observed by a physician when evaluating a patient [44]. In a stressful clinical setting, the patient might deny the existence of symptoms to avoid facing the implications of a real problem. In other instances, a patient may exaggerate a condition to gain attention from the doctor [44].

The early warning system developed here is intended for use in a home setting. It makes a diagnosis based on a continuous and periodic flow of biometric data, and on feedback from an ample amount of subtle contextual information.
The system asks a home occupant questions as a doctor might if the doctor were in the home, in order to ascertain clinical information not already encoded or obvious in the patient's biometric data. Whereas weight or blood pressure changes might indicate signs such as edema or hypertension, as observed by the system, these signs may be further confirmed by symptoms that the home occupant is experiencing. The occupant's responses to dynamically changing and context-related (i.e., location-specific) health questions become symptomatic evidence entering into the system. A question might be "Do you feel winded often after standing at the kitchen counter for more than 5 minutes?". A "yes" response to this question suggests that the occupant may be experiencing dyspnea, and this symptom affects a CHF diagnosis. By and large, the system has to be efficient and acquire as much information or context about the user as possible, without causing the user to turn off the preventative monitor out of annoyance. At the same time, it has to be effective in providing a more precise diagnosis at any time.

3.1 Overview of the System Processes

The proposed early warning system can be represented in the control system paradigm depicted in Fig. 3-2, consisting of a feedback loop that illustrates the question querying mechanism. A feedback loop exists between the user query and Bayesian network processes.

[Figure 3-2: A Control Paradigm for an Early Warning System. Context (location/time) and biometric data feed the user query and Bayesian reasoning network blocks; generated questions go to the user and entered responses feed back into the network; the output is a prognosis, or the future occurrence of a disease.]

Depending on the user's whereabouts and the user's history of responses, location-specific questions will be generated.

3.2 Scenario

Let us consider a scenario of an early warning system in action. An 83-year-old woman lives in a home by herself. She has had a few symptoms related to CHF, such as edema and hypertension with coronary artery diseases (CADs).
The system developed here aims to diagnose the individual in a home setting. When she enters her bedroom one night after supper, the system, without having any information about her other physiological states, believes that she has CHF with a probability of 1.96%. This number might be different from that for the general population (0.86%), because the user has entered the bedroom and might have done something to offset this number; say, she weighs herself on a bedroom scale after a full supper. The scale, which is connected to the system, indicates that her weight has increased by 4% compared with her average weight from last week. This weight gain changes how an early warning system sees her as a candidate for developing CHF. The system's view changes, however, when more new evidence arrives. The system may ask her a bedroom-specific question: "Did you experience any difficulty breathing today?". If the user's response to this question is "yes", this bit of evidence enters into the system, which then believes that she has a higher chance of developing CHF.

In summary, generation of context-specific health questions is based on the user's whereabouts, the time, and all the other previously entered evidence, such as the user's weight and blood pressure level. The user receives these health questions via a portable electronic device such as a PDA. Responses to context-specific questions enter into the system (via the PDA), becoming additional evidence that allows the system to generate the next set of questions when appropriate. An early warning system absorbs context via a question querying mechanism that involves asking the user questions and receiving responses. Each question is tagged with contextual reliability. Meanwhile, the system periodically receives biometric data from the user. These data become another type of evidence, enabling the system to make a more precise assessment of the person's health.
This work focuses on the reasoning that drives the querying mechanism; we do not consider how biometric data can be obtained.

Chapter 4 Bayesian Reasoning Network for Congestive Heart Failure

A technique called Bayesian reasoning is often used in disease diagnosis [10]. In this work we describe the framework for the development of a home-based early warning system, which employs Bayesian inference to predict the likelihood of congestive heart failure (CHF) in home occupants.

4.1 Bayesian Reasoning Network: An Overview

Bayesian reasoning networks, also called belief networks, knowledge maps, or probabilistic causal networks, have become the most popular methods for describing and reasoning with probabilistic information, using a graphical model that topographically represents probabilistic relationships or dependencies among a set of variables [8, 15, 16]. A Bayesian network has two major attributes, as described in Appendix A: a directed acyclic graph (DAG) and conditional independence. Both of these attributes are important in the system we have developed. Bayesian networks offer several advantages over other reasoning-based networks. The advantages include:

1. Bayesian networks are robust at handling incomplete sets of data, offering the power of prior knowledge, which is embedded in the causal semantics of the network, to predict the outcome of a process [15]. The network encodes conditional dependencies among the input variables, allowing for prediction of an outcome even when inputs are not completely observed. This ability is very similar to what a physician has when deciding which drugs to prescribe to the patient. For example, a physician may want to know whether or not to prescribe drug A to improve the patient's health.
To arrive at the best conclusion, the physician can determine if drug A directly contributes, and to what degree, to the well-being of most people with the same physical symptoms, even when no close-at-hand information about the effects of drug A on patients is available.

2. Knowledge of causal relationships in a Bayesian network can be represented in a graphical model or structure, facilitating our understanding of a problem domain.¹ The graphical structure makes the underlying mathematical relationships explicit and well understood, and it can be rearranged to increase computational efficiency and to weight evidence softly.

¹ In this work, the problem domain is referred to as the disease domain for CHF.

4.2 CHF Network Overview

Medical literature was used to ascertain information on the probabilities and dependencies used in the network, which was created by hand. To improve the quality and accuracy of a diagnostic network, it will be necessary to consult medical experts and to continuously enhance the network.

Heart failure disease encompasses multiple disease etiologies (causes) and patterns of manifestations (effects) [10]. There is a variety of symptoms related to CHF [44]. In this work, each symptom is a node in the network and is arranged in the order of causality on the DAG. For example, both lower-leg edema and orthopnea precede unexpected weight gain on the DAG, because the former two contribute to fluid retention in the body, resulting in the latter.

It must be noted that some variables carry more weight than others in terms of their impact in raising or lowering the probability of the CHF risk factor, or P(CHF). For example, an individual who has dyspnea is less likely to have CHF than another who has shown symptoms of orthopnea. This is because dyspnea can be a direct result of other precipitating factors such as arrhythmia or asthma, whereas orthopnea is more specifically a consequence of pulmonary edema, which indicates a weak heart. When assigning conditional probabilities to the links for all variables in the network, we have to bear in mind the impact of these variables on P(CHF).

In this work, key variables and their relative importance in a CHF domain were primarily ascertained from a quality care assessment study for CHF [21]. In an article by Ashton et al., it was found that both dyspnea (difficulty breathing) and orthopnea (difficulty breathing when lying down) are among the most common, and hence most important, symptoms of CHF, a finding concurred with by various physicians. That is, the presence of either symptom during pre-admission determines if the patient is already at risk for CHF. The objective of the quality care assessment study was to rank variables that are most relevant, common, and specific to the evaluation of CHF patients in hospitals. The block diagram presented in App. E is a graphical representation of the inter-quartile ranking of these variables set forth in [21].

An initial network was made consisting of some of the key variables in [21], as illustrated in Fig. 4-1. Suggestions from medical doctors allowed us to make modifications to our network and to the assignments of conditional probabilities for each link, simplifying the network by employing only the higher-level variables that are most relevant and important in the domain of CHF [75, 76].

Fig. 4-1 depicts a graphical representation of the network, with variable names, their states, and prior probabilities listed. In the network, biometric variables such as weight and blood pressure are prefixed with B_, whereas contextual variables are prefixed with Q_. The target variable, CHF, does not come with any prefix. Associated with each prefixed variable is an evidence node, which is used to weight the incoming evidence for the same variable. Each evidence node in the network has 10 states, ranging from $0\% < \alpha \le 10\%$ to $90\% < \alpha \le 100\%$. The use of these states and their meaning are described in Chapter 5. Each of the ten states associated with $\alpha$ represents a level of severity in the evidence.
For example, if the evidence for Q_SkippedMed is that the patient has been skipping medications 10% of the time since last week, the state $0\% < \alpha \le 10\%$ in EQ_SkippedMed will be set true to adjust the probability distribution underneath Q_SkippedMed. $0\% < \alpha \le 10\%$ represents the state of severity in the Q_SkippedMed category for that individual at that given time. When the patient has forgotten to take medications 95% of the time since last week, the state $90\% < \alpha \le 100\%$ in EQ_SkippedMed is set true. The effect of setting a state in an evidence node propagates throughout the network, changing its joint probability distribution. In the last case of the patient with a weekly 5% medication compliance rate (i.e., skipping medication 95% of the time), P(CHF) for this individual becomes 0.0206, as compared to 0.0196 for normal people. Chapter 5 elaborates on the mechanism of how evidence is weighted.

Note that evidence nodes associated with biometric variables are prefixed with EB_ and those associated with contextual variables with EQ_. In a home environment, B_ variables are measured with biometric devices such as a weight scale, pulse monitor, or glucose meter, to name a few. Note that in the early warning system developed in this work, biometric data include both weight and systolic blood pressure. Contextual questions, however, are dispensed to the patient via a PDA, hence the prefix Q_, which denotes a question variable.

4.3 Design Principles for CHF Network

Design Principle 1: Causation used in the ordering of variables in the CHF network goes from predisposed signs, to the internal state of disease, to symptoms. In a diagnostic reasoning model, clinical signs lead to internal conditions or failure states, which lead to a plethora of inter-related observables, or symptoms. We argue that predisposition, or clinical signs, relating to heart failure influences the likelihood of developing CHF, which leads to a plethora of symptoms.
In this way, both signs and symptoms are made independent of each other, i.e., they are only indirectly related through the target, CHF. The causal relationship for CHF is depicted in a flow diagram in Fig. 4-2. CHF is thus immediately dependent on both predisposition, which consists of hypertension, coronary artery disease, and medication compliance, and symptoms, which include respiratory problems and fluid retention. A graphical breakdown of Fig. 4-2 is shown in Fig. 4-3. Note that Fig. 4-3 is a diagnostic-causal network that infers the presence of CHF based on percept-driven information, namely, signs or known symptoms of CHF. Thus, even though the network is ordered causally, the direction of diagnostic reasoning goes from effects to causes [33].

Design Principle 2: In the network, signs and symptoms are conditionally independent. To satisfy this independence assumption, we make sure that there are no direct dependency links between signs and symptoms, i.e., they are conditionally independent of each other.

Design Principle 3: In the network, the evidence nodes, EQ_ and EB_, might be conditionally dependent on other non-evidence nodes. For example, EQ_Angina and Q_CAD are dependent on each other since both are linked to Q_Angina. This is plausible considering that the probability of someone having angina at any given time relates to whether the individual also has a history of coronary artery disease.
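The structural constraints in the design principles above can be checked mechanically. The following is a minimal sketch, not the network built in this work: the edge list is a simplified assumption based on the signs-to-CHF-to-symptoms ordering of Fig. 4-3, and the helper names `topological_order` and `has_direct_sign_symptom_link` are hypothetical.

```python
from collections import deque

# Hypothetical, simplified edge list: signs -> CHF -> symptoms.
SIGNS = {"B_Hypertension", "Q_CAD", "Q_SkippedMed"}
SYMPTOMS = {"Q_Dyspnea", "Q_Orthopnea", "Q_Edema", "B_WeightGain"}
EDGES = [(s, "CHF") for s in sorted(SIGNS)] + [("CHF", s) for s in sorted(SYMPTOMS)]


def topological_order(edges):
    """Return the nodes in a causal (topological) order; raise if there is a cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


def has_direct_sign_symptom_link(edges):
    """True if any edge joins a sign and a symptom directly, in either direction."""
    return any(
        (a in SIGNS and b in SYMPTOMS) or (a in SYMPTOMS and b in SIGNS)
        for a, b in edges
    )


order = topological_order(EDGES)
print(order)
print(has_direct_sign_symptom_link(EDGES))
```

In this sketch every sign appears before CHF and every symptom after it in the returned order, and adding an edge such as ("Q_CAD", "Q_Dyspnea") would make the second check report a violation of Design Principle 2.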
[Figure 4-1: Bayesian reasoning network for CHF. The diagram shows the target node CongestiveHeartFailure, the question nodes Q_Smoking, Q_Drinking, Q_HighChol, Q_SkippedMed, Q_CAD, Q_Angina, Q_Dyspnea, Q_Orthopnea and Q_Edema, the biometric nodes B_Hypertension and B_WeightGain, and one ten-state evidence node (EQ_ or EB_) per prefixed variable. Priors that remain legible include Q_Smoking (Smoker 24.6%, NonSmoker 75.4%), Q_SkippedMed (Yes 30.0%, No 70.0%), Q_Angina (Presence 3.39%, Absence 96.6%) and Q_Edema (Presence 6.28%, Absence 93.7%).]

[Figure 4-2: Causation Diagram for a Diagnostic Network. Signs link to internal and failure states (diseases), which link to observables (symptoms).]

[Figure 4-3: Causation Diagram for CHF. Predisposition signs (hypertension, medication compliance, coronary artery disease) lead to the internal failure state/disease (CHF), which leads to symptoms (respiratory problem, fluid retention).]

Chapter 5 Question Querying Mechanism

Querying is a process by which a Bayesian reasoning network generates health questions and sends them to the user via a personal digital assistant (PDA). Responses from the user are sent back to the network as evidence, which changes the joint probability distributions in the network, enabling it to generate the next set of questions. Health questions were carefully designed. These questions are ordered by the medical/variable categories they belong to, and by the locations of the user at the time of querying in the test environment.

5.1 Topology of a Querying Process

The querying mechanism is illustrated in a flow diagram in Fig. 5-1. As discussed in Sec. 4.2, evidence entered in the network exerts different levels of impact on the probability of CHF. In a CHF domain, the degree of impact is measured by the percentage change of P(CHF) as we vary the evidence entered.
For example, setting the Skipped Medication category to true increases the value of P(CHF) from 1.96% to 2.77%, whereas setting Q_Angina to absent lowers P(CHF) to 1.7%. The evidence nodes are attached to both Q_- and B_-type variables to softly weight the incoming evidence, such that a single piece of evidence entered will not have a significant impact on the network. Each evidence node contains 10 weighted states. Setting $0\% < \alpha \le 10\%$ in the EQ_HighChol evidence node, for example, changes P(CHF) from 1.96% to 1.98%, reflecting a 1.01% change in P(CHF). This minute change in the numerical distribution of probabilities is necessary and is a vital part of the querying process. When the user answers a health question, the response to the question becomes evidence and subsequently enters into the network. An answer to one question should not make a significant impact on the network. The user might be entering false responses into the network for a variety of reasons. He or she might feel rather frustrated one day and exacerbate the situation by falsely responding to the system. The assumption in this work is that if enough evidence is gathered after a sufficient amount of time, accurate responses will overwhelm the user's noise.

[Figure 5-1: Flow Diagram of a Querying Process. The Bayesian reasoning network generates contextual categories ordered by degree of impact; the system selects a context-specific question for the top category, goes to the next category if none is found, and displays the selected question on the PDA.]

A window period of 7 days is used for CHF detection. This window period was confirmed by several local medical experts in the field [75, 76]. Measurements taken during this window period are compared with average values taken from the week prior to this period. In other words, P(CHF) for an individual is assessed at any given time based on all evidence gathered this week and the week prior to this time. In the following section, we describe the heuristic rules used by the system to arrive at a prognosis of CHF.
5.1.1 Querying Heuristics

The querying heuristic takes the full 7-day window of evidence for a particular medical category (or variable) and determines the severity of the user's medical state pertaining to that particular variable category. $0\% < \alpha \le 10\%$ and EQ_HighChol are an example of the severity of a medical state and the variable category it is in, respectively. Note that the evidence node EQ_HighChol corresponds to the medical variable Q_HighChol. For a particular medical variable, the heuristic rules used can be described in the following steps:

1. Using a window period of 7 days, the system determines a positive rate, $\alpha$, i.e., the number of positive responses divided by all responses entered into the system by the user. A positive response is the user's answering "yes" to a health question, and a negative response is answering "no" to a health question. If no response is entered for a health question, the system inputs a "none" in the database to signify the lack of response at the time. In this system, the total number of responses is calculated by counting the total number of "yes" and "no" responses over the window period. A positive rate, $\alpha$, is therefore the number of "yes" responses in the last week relative to all "yes" and "no" responses. Note that the number of "no" responses is defaulted to 5. That way, when no responses are made in the last 7 days, the denominator is not zero and $\alpha$ remains well defined. In addition, when there is only one "yes" response and it is the only response over the last week, it makes more sense that this "yes" response is softly weighted. In this case, $\alpha = 1/(1+5) \approx 16.7\%$, instead of the 100% that would result if the number of "no" responses were defaulted to 0.

2. The numerical value of $\alpha$ corresponds to one of the ten states, i.e., $0\% < \alpha \le 10\%$ through $90\% < \alpha \le 100\%$, in a given evidence node.
For each Q_-type variable, these ten states are listed as follows:

state 1 = $0\% < \alpha \le 10\%$
state 2 = $10\% < \alpha \le 20\%$
state 3 = $20\% < \alpha \le 30\%$
state 4 = $30\% < \alpha \le 40\%$
state 5 = $40\% < \alpha \le 50\%$
state 6 = $50\% < \alpha \le 60\%$
state 7 = $60\% < \alpha \le 70\%$
state 8 = $70\% < \alpha \le 80\%$
state 9 = $80\% < \alpha \le 90\%$
state 10 = $90\% < \alpha \le 100\%$    (5.1)

3. We now define a biometric change rate, $\beta$, which is the percentage change in the level of a biometric variable ($B$) relative to its averaged value from last week. For instance, say a person has had an average weight of 146 lb over the last week. On Tuesday this week, his weight reads 154 lb. The value of $\beta$ at that point in time is $(154 - 146)/154 \times 100 = 5.19\%$. Mathematically speaking,

$$\beta = \frac{B - R}{B} \times 100 \qquad (5.2)$$

where $R$ signifies an averaged $B$ over the course of a previous window period. As time progresses, the window period also progresses, resulting in dynamic changes in the average biometric values and $\beta$. The heuristic rule used for B_-type variables is the same as the one used for Q_-type variables.

4. The process of setting a state in an evidence node is termed setting observation. After the system determines the states for both $\alpha$ and $\beta$ and sets the states to be true, the effect results in numerical changes in the joint probability distribution of the network. For example, checking the $20\% < \alpha \le 30\%$ state in EQ_Smoking results in a slight change in P(Q_Smoking) from 24.6% to 26%, changing P(CHF) from 1.96% to 2%.

In summary, setting observation in an evidence node sends a small impact to the network. During user querying, this amount of change to the network is desired because we want to avoid situations where responding to a single health question can result in a big change in P(CHF). In this work, the amount of impact corresponds to how many positive responses (relative to total responses) have been entered by the user in the variable category.
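The heuristic steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the system's actual code: the function names are hypothetical, the default of 5 "no" responses is treated as a starting count to which observed "no" answers are added, and rates of zero or below (including a negative biometric change) are assumed to set no state.

```python
import math

DEFAULT_NO = 5  # default count of "no" responses described in step 1


def positive_rate(responses, default_no=DEFAULT_NO):
    """Positive rate (alpha) in percent over a window of "yes"/"no"/"none" entries."""
    yes = sum(1 for r in responses if r == "yes")
    no = sum(1 for r in responses if r == "no")
    # "none" entries are ignored; the default keeps the denominator non-zero.
    return 100.0 * yes / (yes + no + default_no)


def biometric_change_rate(current, previous_average):
    """Biometric change rate (beta) in percent, following Eq. (5.2)."""
    return 100.0 * (current - previous_average) / current


def evidence_state(rate_percent):
    """Map a rate in (0, 100] to state 1..10; return None if no state applies."""
    if rate_percent <= 0 or rate_percent > 100:
        return None  # assumption: nothing is set outside the ten intervals
    return math.ceil(rate_percent / 10.0)


alpha = positive_rate(["yes"])             # a single "yes" in the window
beta = biometric_change_rate(154, 146)     # the weight example from step 3
print(round(alpha, 2), evidence_state(alpha))
print(round(beta, 2), evidence_state(beta))
```

With a single "yes" and no other responses, the rate is 1/(1+5), about 16.7%, which falls in state 2; the weight example gives about 5.19%, which falls in state 1.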
5.1.2 Sensitivity of CHF to Observations

At any given time, a Bayesian network is capable of generating an impact list of all variables in the network, ranking them in order of impact on the target. In a CHF domain, the network picks the variable that has the biggest impact on CHF, and checks a database of questions to see if there are any questions that fall into the variable category. A set of variable-specific questions is then selected based on the user's location. For example, a home occupant might have developed exertional dyspnea due to a predisposed coronary heart disease and high blood pressure in the last few years. He occasionally experiences difficulty breathing, but these symptoms are mild. He gets up one day from the desk and steps into the kitchen. The system detects that the person is right by a kitchen counter, and the variable Q_Dyspnea shows up high on the impact list. There are many temporally relevant questions associated with Q_Dyspnea, but given that the time is noon, a cooking-related question may be more appropriate to ask than ones that relate to other activities. For the category Q_Dyspnea, the location type kitchen, and the time of noon, the system then sends a Q_Dyspnea and kitchen-specific question to him via a PDA: "Do you sometimes have difficulty breathing when you cook?". The challenges in developing suitable health questions lie in the fact that questions must

1. Be medically informative and non-intrusive
2. Be minimally intimidating and annoying to the user
3. Make sense temporally

The process of question generation goes on as the user's location changes, i.e., location-specific and temporally relevant questions are generated when appropriate.

Setting observation in an evidence node changes the probability distribution in the network. In particular, the change of probability for CHF is noted.
This is measured in terms of the percentage difference in P(CHF) before and after the observation is made, as illustrated in Sec. 5.1. When the previous evidence is unset, P(CHF) returns to the original value in the network. We proceed to set evidence on a different evidence node and record the change in P(CHF). This process goes on until an impact list of P(CHF) changes associated with different nodes is obtained. This process is termed a sensitivity test, which is used to measure the sensitivity of the target variable (CHF) as observations vary. Note that the impact list roughly corresponds to the relative clinical importance of key variables indicated in [21]; in the article, both orthopnea and dyspnea are ranked highly in terms of their clinical importance.

Different responses to health questions contribute different degrees of impact to the network, resulting in a varying predicted P(CHF). From time to time, the system gathers evidence from the previous 7-day window period and sets observation in the network. If the user fails to respond to any questions in the last 7 days, the $\alpha$ associated with each node in the network is 0 (see Sec. 5.1.1). Thus, none of the states in the network will be set. P(CHF) can be higher or lower depending on the biometric change rate, or $\beta$, at the time of computation. For example, a $\beta$ of 31.5% for weight change corresponds to $30\% < \alpha \le 40\%$ in EB_WeightGain, and a $\beta$ of 9.1% for blood pressure corresponds to $0\% < \alpha \le 10\%$ in EB_Hypertension. Fig. 5-2 graphically illustrates the probability distribution in the network as various observations are set. Note that P(CHF) = 3.18% in this case.
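The set, measure, and un-set loop of a sensitivity test can be illustrated on a toy model. The sketch below is an assumption-laden simplification and not the network of this work: it treats each symptom as conditionally independent given CHF, uses hard present/absent evidence instead of the ten-state evidence nodes, and all likelihood values are invented for illustration. Only the 1.96% starting value is taken from the text.

```python
PRIOR_CHF = 0.0196  # starting P(CHF) quoted in the text

# Hypothetical likelihoods: (P(present | CHF), P(present | no CHF)).
LIKELIHOODS = {
    "Q_Dyspnea": (0.60, 0.05),
    "Q_Orthopnea": (0.40, 0.02),
    "Q_Edema": (0.50, 0.06),
}


def posterior_chf(evidence):
    """P(CHF | evidence) for a dict mapping node name -> True (present) / False (absent)."""
    p_chf, p_not = PRIOR_CHF, 1.0 - PRIOR_CHF
    for node, present in evidence.items():
        given_chf, given_not = LIKELIHOODS[node]
        p_chf *= given_chf if present else 1.0 - given_chf
        p_not *= given_not if present else 1.0 - given_not
    return p_chf / (p_chf + p_not)


def impact_list(current_evidence):
    """Rank unobserved nodes by the size of the percentage change in P(CHF)."""
    baseline = posterior_chf(current_evidence)
    impacts = []
    for node in LIKELIHOODS:
        if node in current_evidence:
            continue
        trial = dict(current_evidence)
        trial[node] = True  # set the observation on a copy
        change = 100.0 * (posterior_chf(trial) - baseline) / baseline
        impacts.append((node, change))
        # Discarding the copy plays the role of un-setting the evidence.
    return sorted(impacts, key=lambda item: abs(item[1]), reverse=True)


for node, change in impact_list({}):
    print(node, round(change, 1))
```

With these invented numbers the more specific symptom, orthopnea, ranks above dyspnea, which mirrors the qualitative ordering discussed in Sec. 4.2; the actual percentages depend entirely on the assumed likelihoods.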
[Figure 5-2: Setting Observation in a CHF Network. Each evidence node lists its ten states (TYPE ONE through TYPE TEN), with the observed state set to 100.]

5.1.3 Sensitivity of CHF to Setting Observations on Evidence Variables

Whenever a response to a health question is made by the user, the network performs a sensitivity test to compute which of the next evidence states, when set among all current evidence, will result in the biggest change in P(CHF). The system then asks the user a health question from that variable category. In brief, the sensitivity test obeys the following rules:

1. When $0\% < \alpha \le 10\%$ is already set in an evidence node, the system un-sets the evidence, sets $10\% < \alpha \le 20\%$ and measures the change in P(CHF).

2. When $90\% < \alpha \le 100\%$ is already set in an evidence node, the system un-sets the evidence, sets $80\% < \alpha \le 90\%$ and measures the change in P(CHF).

3. When neither $0\% < \alpha \le 10\%$ nor $90\% < \alpha \le 100\%$ is set in an evidence node, the system un-sets the evidence, goes to the states before and after it, and measures the changes in P(CHF).

Thus, for a system with 11 variables (hence, 11 evidence nodes) we can have an impact list that can involve as many as 22 items after the sensitivity test. Using the previous example as illustrated in Fig.
5-2, the result of a sensitivity test gives

    Change in P(CHF) = 67.30%   EQ_Dyspnea = 10% < a < 20%
    Change in P(CHF) = 66.67%   EQ_Orthopnea = 10% < a < 20%
    Change in P(CHF) = 57.70%   EQ_Edema = 10% < a < 20%
    Change in P(CHF) = 18.24%   EQ_Smoking = 10% < a < 20%
    Change in P(CHF) = 17.92%   EQ_HighChol = 10% < a < 20%
    Change in P(CHF) = 14.29%   EQ_Dyspnea = 20% < a < 30%
    Change in P(CHF) = 14.05%   EQ_Orthopnea = 20% < a < 30%
    Change in P(CHF) = 12.40%   EB_WeightGain = 40% < a < 50%
    Change in P(CHF) = 12.26%   EQ_Drinking = 20% < a < 30%
    Change in P(CHF) = 10.92%   EQ_Edema = 30% < a < 40%
    Change in P(CHF) = 10.06%   EB_WeightGain = 40% < a < 50%
    Change in P(CHF) =  9.75%   EQ_SkippedMed = 20% < a < 30%
    Change in P(CHF) =  8.81%   EQ_CAD = 20% < a < 30%
    Change in P(CHF) =  8.18%   EQ_Angina = 20% < a < 30%
    Change in P(CHF) =  1.55%   EQ_CAD = 30% < a < 40%
    Change in P(CHF) =  1.26%   EB_Hypertension = 20% < a < 30%
    Change in P(CHF) =  1.24%   EB_Hypertension = 30% < a < 40%
    Change in P(CHF) =  1.24%   EQ_Smoking = 30% < a < 40%
    Change in P(CHF) =  0.93%   EQ_Drinking = 30% < a < 40%
    Change in P(CHF) =  0.93%   EQ_HighChol = 30% < a < 40%
    Change in P(CHF) =  0.63%   EQ_SkippedMed = 30% < a < 40%
    Change in P(CHF) =  0.31%   EQ_Angina = 30% < a < 40%
    ... etc.

This list shows, for example, that setting an observation in EQ_Dyspnea (in this case, setting 10% < a < 20% to be true) will cause the biggest change in the value of P(CHF). Therefore, the next time the system prompts the user with a question, the question would ideally come from the dyspnea category, since the response to this question provides the most powerful evidence for or against CHF. The next section discusses the sorting mechanism that makes use of this impact list and of information about the user's location.

5.2 Question Sorting Mechanism

In a question sorting process, the system decides which question to ask next based on three attributes:

1. Medical/variable category

2. Location of the user

3. Time at which the querying takes place

The first attribute is selected by the sensitivity test and the second by a location tracker that follows the user's whereabouts in a room. The system then searches an external database of questions for those that match all three attributes. For example, given that EB_Hypertension is the top-most category on the impact list and the user is in the kitchen at 6PM, the system selects an EB_Hypertension question that is kitchen-specific and temporally relevant. The temporal element is the time of day; at 6PM, a relevant context for the question might be cooking. The user's subsequent response to this question updates the network, which gathers all evidence and performs the sensitivity test again. The querying loop continues this way as more and more information about context is absorbed by the system over time.

5.2.1 Question Attributes

Each question is created based on these attributes:

1. Having three possible answers: "yes", "no", and "later". A "later" response allows the user to discard or postpone the question if it does not apply. If the user is not around to answer the question or fails to answer it, an input of "later" is set by default. These brief, qualitative responses simplify the network by reducing the number of states, and hence the number of conditional probabilities, needed to construct the network.

2. Location-specific: each question depends on the user's whereabouts in the room.

3. Temporally relevant: each question involves only context specific to the time at which the question is asked.

4. Clinically significant: questions are carefully screened by a medical expert in the field to make sure that they carry the right clinical value and importance.

Appendix C illustrates how questions are categorized and stored in a remote question database, and gives examples of them.
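The lookup described in Section 5.2 can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the class QuestionSelector, the fields of Question, and the in-memory list standing in for the remote question database are all hypothetical names introduced here. Hours follow the 1-to-24 convention of Appendix C, and treating an hour range as inclusive at both ends is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of the question lookup; every name here is illustrative. */
final class QuestionSelector {

    /** One row of the question database: category, location, hour range, text. */
    static final class Question {
        final String category;
        final String location;
        final int fromHour; // assumed inclusive, 1..24
        final int toHour;   // assumed inclusive, 1..24
        final String text;

        Question(String category, String location, int fromHour, int toHour, String text) {
            this.category = category;
            this.location = location;
            this.fromHour = fromHour;
            this.toHour = toHour;
            this.text = text;
        }

        /** True when the question fits all three attributes of Section 5.2. */
        boolean matches(String wantedCategory, String wantedLocation, int hour) {
            return category.equals(wantedCategory)
                    && location.equals(wantedLocation)
                    && hour >= fromHour
                    && hour <= toHour;
        }
    }

    /**
     * Walks the impact list from the most to the least influential category and
     * returns a randomly chosen question that fits the location and hour.
     * Returns null when no stored question fits any category.
     */
    static Question select(List<String> impactList, List<Question> database,
                           String location, int hour, Random random) {
        for (String category : impactList) {
            List<Question> candidates = new ArrayList<Question>();
            for (Question q : database) {
                if (q.matches(category, location, hour)) {
                    candidates.add(q);
                }
            }
            if (!candidates.isEmpty()) {
                return candidates.get(random.nextInt(candidates.size()));
            }
        }
        return null; // nothing to ask at this place and time
    }
}
```

If, say, EB_Hypertension tops the impact list but the database holds no hypertension question for the kitchen at hour 18, the loop simply moves on to the next category on the list, which mirrors the fallback described in Section 5.2.2.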
5.2.2 Question Display Mechanism and Cycles

This section describes the mechanism by which questions are displayed to the user on a Palm Pilot, and how questions are selected and cycled through in the querying process.

1. When the user enters and moves around the room, the system prompts the user with location-specific questions appropriate to the time. The user can choose to answer them or not. Unanswered questions disappear whenever the user moves to another location, and are later recycled so that the user can respond again.

2. The system determines a health category and randomly picks a question from that category based on the user's location in the room and the time of day at which the question is asked. If no location-specific question can be found in the category, or if location-specific questions exist but none of them is time-specific (for example, to be asked at 8PM), the system selects the second health category from the impact list and searches the database again. The process continues until a question is selected; the question is then displayed to the user.

3. Each response to a question is softly weighted in the network, so that a single response by the user does not significantly change the probability distribution in the network. This allows questions in the same category to be displayed multiple times before the next category rises to the top of the impact list (if the user enters the same location again, the question still applies at that time, and the randomizer selects it again), enabling the user to change his or her mind and respond differently the next time the same question appears.

4. After the system has gathered enough evidence (in terms of the user's responses) for a particular medical category, the system may display the next question from a different category. The querying cycle goes on as before.

5. There will be only one question displayed per location at any given time.
When the user moves to the next location and the category is still the same, a different, temporally-related question will show up because every question in the database is location-specific and temporally different.

Chapter 6

An Early Warning System: A Graphical Demonstration

This chapter describes additional details of the demonstration system. The approach taken in this work incorporates both biometric data and context gathered on an hourly basis. The system processes this simulated data set into a medical history chart, as shown in Fig. 6-2. The real responses from the user are incorporated into this medical chart, which allows P(CHF) to be generated dynamically and enables the system to compute a CHF risk factor.

6.1 Description of the System

1. The biometric dataset was obtained from a local medical expert, who suggested speculative biometric datapoints to reflect what a CHF patient might have had in a real situation [75]. In our test case, we assumed that the home occupant, Mr. Bad Habits, is a 63-year-old male who has been experiencing breathing difficulty and swollen joints for nearly 2 weeks. At any given point in time during a particular week, the system uses datapoints from the previous week as a basis (to calculate average biometric values) to assess a dynamic risk factor for Mr. Bad Habits, who might be developing CHF. At the end of the second week, Mr. Bad Habits was sent to the hospital for emergency treatment of CHF due to an escalating risk factor and accumulating evidence.

Figure 6-1: Question Displayed on a Palm Pilot Vx

2. The user can enter responses into the system via a Palm Pilot device. An example of its display is depicted in Fig. 6-1. Note also that we do not expect the user to have answered all the questions on a Palm Pilot. Some questions might not be appropriate; sometimes the user might miss the questions; sometimes the system may not generate questions due to inappropriate time or location.
Hence, we expect the responses to be sparsely populated throughout the medical chart. Mr. Bad Habits, who turns out to be attentive to his Palm alerts, averages 10-12 responses per day to health questions on his Palm Pilot. We expect this number to be lower or higher depending on the user's habits.

3. Health questions are sent only when the system detects that there is a location- and time-specific health question in the database that needs to be asked. Since not all questions fit every location and time, Mr. Bad Habits will not be asked questions all the time. Note that in a home setting, the system database may not have questions pertaining to all times and locations (for example, the fireplace or hall). Some questions might be used only between 12PM and 12AM and only when the user is in the kitchen. For this reason, questions will be asked of a home occupant sparsely throughout the day. We might expect even fewer responses than questions, since the user might not be around to answer them. In an extreme case, the user might forget his Palm when he leaves the house for the day, and therefore answer no questions.

6.1.1 The Graphical Interface

A graphical interface was developed to display the data used by the system. Biometric data from the input database are depicted second from the top, followed by the question variables from SkippedMed to HighChol. For each question variable the response can be yes, no, or none. The system displays both yes and no responses as bars, shown in Fig. 6-2 for Mr. Bad Habits; an upward bar signifies the response yes and a downward bar the response no. If more than one yes or no response is entered, the system displays them by elongating the bar proportionally. For example, two yes responses entered in an hour will display an upward bar that is twice the length of a regular yes-type bar. At the same time as Fig. 6-2 is displayed, Fig.
6-3 dynamically highlights the evidence node that belongs to the variable category from which a health question is to be displayed on the Palm when the system is set in querying mode. In this case, EQ_Orthopnea is highlighted.

The user can move the position of the 7-day window period by means of a slider bar at the bottom of the interface. As the bar is moved across the display, both simulated contextual and biometric data for Mr. Bad Habits are revealed on the screen. At this time, P(CHF) is dynamically generated based on evidence collected from the 7-day window period, using the set of heuristic rules mentioned in Sec. 5.1.1. This dynamic generation of P(CHF) can be seen in Fig. 6-4.

Figure 6-2: Display of Both Contextual and Biometric Data

Figure 6-3: Dynamic Highlighting of Q_Orthopnea Question Variable in a CHF Network

Figure 6-4: Display of Both Contextual and Biometric Data as a Slider Bar Moves

Note that from Day 9 (D-9 in the figure) onwards, the blood pressure readings drop, indicating a weakened heart that is unable to generate enough pressure to circulate blood throughout the body. This fact is coupled with the evidence that Mr. Bad Habits has lately been experiencing many CHF-related symptoms (difficulty breathing, chest pain, and swollen joints). The results indicate that the P(CHF) threshold level of 50% is reached by the end of Day 10 (D-10); the risk factor computed approaching the end of Day 10 suggests that Mr. Bad Habits is in need of medical attention by then. Prognosis is depicted as P(CHF) in Fig. 6-2. A threshold level was set at 50%.
Depending on how sensitive the system is, which can be tested against real samples of patient data, this threshold level can be adjusted so that it appropriately reflects the onset of a medical emergency.

Chapter 7

Discussion

Telephonic-linked care (TLC) and disease management programs are becoming prevalent. The former is provided through follow-up phone calls by nurses or others, who actively query patients about their health conditions and remind them to take their medications. These calls are easily made and may improve the patient's health and overall adherence to medications. Disease management programs, on the other hand, improve outcomes and are feasible in large practices or HMOs. However, in spite of the reduced costs and enhanced outcomes associated with these preventative approaches in telemedicine, there are still many obstacles that inhibit their expanded use. First of all, few of the commercial innovations involve medical diagnosis using reasoning in artificial intelligence (AI). One reason is medical liability. In a clinical diagnostic situation, one false negative can result in the death of a human being. When that happens, the consequences of a medical malpractice lawsuit can stymie years of active research, resulting in a huge financial setback for those involved in the process. Second, telemedicine has grown slowly because of reimbursement, confidentiality, and health care professionals' reluctance to change [70].

This work takes a first step toward a framework for the development of a home-based early warning system. The proposed system may be seen as a combination of a disease management program, telephonic-linked care, and a medical diagnostic system. By combining both biometric data and relevant medical information about the home occupant over time, the system dynamically predicts the onset of a serious medical condition and alerts the medical authority when an emergency situation arises.
Unlike a commercially available disease management program or a TLC program, the proposed system reaches a diagnosis by applying diagnostic reasoning and combining both biometric data and the ability to query relevant medical information from the user. Relevant medical information is ascertained via the user's periodic responses to health questions.

Several research challenges still face the development of a more advanced home-based system. The first is creating a more optimized Bayesian network that achieves a high level of accuracy in prediction. This could be done with support from both engineering and medical experts. Different networks can be created and tailored to different types of medical conditions, such as arrhythmia and hypertension. There are also more advanced techniques in artificial intelligence, such as pattern recognition and knowledge representation, that may be applied to a Bayesian system. One successful example is the use of both differential analysis and pseudo-Bayesian reasoning in Long's Heart Disease Program [10, 26, 32].

Another challenge is creating a database of medical questions that satisfy the set of attributes described in Sec. 5.2.1. As more responses are entered into the system, more of the noise in the user's responses may be reduced amid the accumulating evidence. Each question is asked based on the user's whereabouts in a room and the time. Many questions or only a few may be asked per day, depending on the time of day and the occupant's location. A goal of the early warning system is to prevent information overload, that is, providing too much information to the user over time. A research study of people's behavioral patterns in a home might provide invaluable insights to researchers when creating medical questions to be stored in the system.

The biggest challenge yet is perhaps finding acceptance for the use of such a system in the medical community.
Lawmakers and medical insurance companies will have to overcome issues concerning reimbursement, confidentiality, and liability to make the practice of telemedicine more widespread and acceptable. Lastly, the proposed early warning system will have to be proven effective through years of case studies before gaining penetration into the community.

Chapter 8

Suggestions for Future Work

8.1 Feedback Control

For the future development and enhancement of this early warning system, the following add-on component is proposed. By adding a feedback control loop to the existing system, such as the one shown in Fig. 8-1, system performance might improve.

Figure 8-1: A Feedback Closed-Loop Design for an EWS (blocks: Monitoring & Analysis; Adjustment & Control; User Activities; Location Tracking System; Bayesian Reasoning Network; User Response; End Point; Medical Emergency; Inner Feed-Back Process; Outer Feed-Forward Process)

The control design illustrated becomes a closed-loop system, where the controlling element (feedback loop) on top monitors, controls, and makes adjustments to the controlled system below until the desired output is produced. The degree of adjustment to the system is measured by errors in the output of the system.

The controlling mechanism should be designed with the ability to control the user interaction. This requires an improved way of acquiring and recording information. Besides knowing where the user currently is and what the user is doing, a comprehensive model of how those data can be analyzed to perform adjustments to the system is also needed. With the advent of tracking and sensor technologies, it is possible that a variety of sensors will be integrated to create a feedback mechanism that monitors a person's health through eating habits, sleeping patterns, and usage activities, to name a few. It is expected, for example, that people's eating patterns can be measured through sensor tags in the house.
This knowledge about the individual can be used to update and adjust the reasoning network, making it more reliable the next time the process is instantiated.

8.2 Factor Analysis

Discovering underlying or latent relationships among variables tells scientists which variables are related and to what extent. The multivariate statistical model used to determine the relative importance of variables and such relationships is called factor analysis [27, 28, 29, 30, 31]. First, using correlation or covariance, the analysis uses a smaller number of unobserved, or latent, variables to explain the observed relationships among a set of observed variables; this smaller number of variables is used to find a meaningful structure in the observed variables. Second, the use of the analysis leads to data reduction, because many observed variables can be represented in terms of a smaller number of latent variables.

In our CHF model, relevant health questions are grouped into their respective categories. For example, questions related to swollen toes are grouped into the "Lower-Extremities" category of the variable node called "Swollen Joints". Factor analysis generates rationales for grouping questions into even more relevant categories, creating nodes for these categories, and making questions children of them.

Appendix A

Probability Theory

Probability theory allows for the assessment of a degree of belief in propositions from real-world problem domains. The theory assigns a numerical degree of belief between 0 and 1 to various propositions. For example, say the chance that a person will have high blood pressure (HTN) is 0.03. This is called the prior probability, and the percept in this case is having high blood pressure.
We therefore say that the distribution for the aforementioned proposition for hypertension is $\{X = 0.03,\ Y = 0.97\}$, which means that the prior probability for hypertension is $P(X = \mathrm{HTN}) = 0.03$, and the complement probability for not having hypertension is $P(Y = \neg\mathrm{HTN}) = 0.97$ (the complement of a variable is indicated by the sign $\neg$). Each proposition possesses all possible outcomes in a so-called probability distribution that sums to 1. Mathematically, a probability distribution is described as $P(X \mid \xi)$, where $X$ is a random query variable that is either discrete or continuous, such that $X \in \{x_1, x_2, \ldots, x_n\}$. On the other hand, $\xi$ represents the background state of information deemed as evidence to $X$. Note that $\sum_{i=1}^{n} P(x_i) = 1$ and $P(x_i) \geq 0$. Moreover, the rules of probability state the following.

Product (chain) rule:

\[ P(X, Y) = P(X \mid Y)\,P(Y) = P(Y \mid X)\,P(X) \tag{A.1} \]

Marginalization rule:

\[ P(Y) = \sum_{i=1}^{n} P(Y, X_i) = P(Y, X) + P(Y, \neg X) \tag{A.2} \]

where the last equality holds when $X$ is binary.

A.1 Probability Distribution

A joint probability distribution gives a specification for all the probability assignments in the domain, i.e., $P(X_1, \ldots, X_n)$. When evidence is obtained for the variables, the effects of setting the evidence in the network propagate throughout the domain, changing its joint probability distribution. In this case, prior probabilities become conditional or posterior probabilities. Expanding the joint probability distribution linearly yields a conditional probability and a smaller conjunctive probability, which can be further reduced to a still smaller conjunction via the chain rule, i.e.,

\[
\begin{aligned}
P(X_1, \ldots, X_n) &= P(X_n \mid X_{n-1}, \ldots, X_1)\,P(X_{n-1}, \ldots, X_1) \\
&= P(X_n \mid X_{n-1}, \ldots, X_1)\,P(X_{n-1} \mid X_{n-2}, \ldots, X_1) \cdots P(X_2 \mid X_1)\,P(X_1) \\
&= \prod_{i=1}^{n} P(X_i \mid X_{i-1}, \ldots, X_1) \\
&= \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))
\end{aligned}
\tag{A.3}
\]

where $\mathrm{Pa}(X_i)$ is the parent set such that $\mathrm{Pa}(X_i) \subseteq \{X_{i-1}, \ldots, X_1\}$.

A.1.1 Bayesian Inference

An inference-based system computes the posterior probability distribution of one or a set of query variables, when values of the evidence variables are known to the system.
A domain agent or the system receives values for the evidence variables from its reasoning tasks, and decides which action to take next based on all concurrent values of non-evidence variables. Bayesian theory allows for inference computing, or calculation of the posterior probability distribution, of any causal variable given all other probabilities in the domain. In medical diagnostic reasoning, the diagnosis of $B$ based on a collection of evidence $A$ can be mathematically described as

\[ P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)} = \frac{P(A, B)}{P(A)} \tag{A.4} \]

where $A$ can also be deemed as symptoms of the disease. The inference computation is possible when causal relationships in the network are well-specified and all conditional probabilities are known.

Given that each node on the graph is associated with a variable $X_i$, each $X_i$ is said to be associated with some parent variable(s), or $\mathrm{Pa}(X_i)$. The probability of the variable $X_i$, conditional on its parent variable(s) $\mathrm{Pa}(X_i)$, is called the conditional probability distribution, otherwise represented as $p(X_i \mid \mathrm{Pa}(X_i))$. In the domain graphically represented by a Bayesian network, a joint probability distribution gives a specification for all the probability assignments representing the domain mathematically, i.e.,

\[ P(X_1, \ldots, X_n) = \prod_{i=1}^{n} p(X_i \mid \mathrm{Pa}(X_i)) \tag{A.5} \]

Comparing Eq. A.3 with Eq. A.5, we see that

\[ P(X_i \mid \xi_i) = P(X_i \mid \mathrm{Pa}(X_i)) \tag{A.6} \]

The variable set $\{\xi_1, \ldots, \xi_n\}$ corresponding to the parent set

\[ \{\mathrm{Pa}(X_1), \ldots, \mathrm{Pa}(X_n)\} \tag{A.7} \]

specifies the connecting arcs in the network. Eq. A.6 implies that in a correctly constructed Bayesian network each variable $X_i$, when given its $\mathrm{Pa}(X_i)$, is conditionally independent of its predecessors in the variable ordering [8].

A.2 Bayesian Network: Attributes

Directed Acyclic Graph

The relationships among variables in a network are represented in a directed acyclic graph (DAG), which describes the domain and contains a collection of prior and conditional probability distributions [8].
A set of directed links or arrows connects the variable nodes acyclically, which is a direct result of the mathematical formalism and prevents inconsistent probability representation in the network [32]. The origin of each arrow is a parent node, which has a direct influence on the child node that the arrow points to. The degree of influence is quantified (given the evidence) in conditional probability tables (CPTs) in the nodes, i.e., the DAG encodes assertions of conditional independence in the CPTs. Table B.1 depicts an example of a conditional probability table (CPT) in the domain of CHF.

Conditional Independence

The network gives a concise specification of joint probability distributions. However, a Bayesian network can become enormously complex because the number of nodes and links determines the number of entries in the CPTs. Fortunately, a complete network specification is made concise by the fact that the probability of an effect depends only on the state of its immediate causes, as seen in Eq. A.6. For binary nodes the probability can be computed using the noisy-OR assumption, a generalization of the logical OR to any number of evidence inputs that are assumed to make only independent impacts on the node [32]. For example, if a child node $Z$ has two parent causes, $X$ and $Y$, with $P(Z \mid X) = x$ and $P(Z \mid Y) = y$, we say that $P(Z \mid X \wedge Y) = 1 - (1 - x)(1 - y) = x + y - xy$.

Appendix B

Construction of a Bayesian Network

In designing a logical causal network using Bayesian reasoning that satisfies conditional independence, we note that some of the variable nodes in this network can have more than two states. For example, "Sudden Gain Per Day" consists of the variable state set {ONE LB, TWO LB, THREE LB}. We identified a list of relevant variables for CHF and added them to the network that describes the domain.
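For the binary nodes, whose probabilities can be computed under the noisy-OR assumption of Appendix A.2, the combination rule generalizes from two causes to any number of causes. The Java sketch below is illustrative only: the class NoisyOr and its method combine are hypothetical names introduced here and are not part of Netica, JavaBayes, or EBayes. It assumes every listed cause is present and acts independently.

```java
/** Hypothetical helper illustrating the noisy-OR rule; not part of any cited package. */
final class NoisyOr {

    /**
     * Probability that the child node is true when every listed cause is present,
     * assuming the causes act independently: 1 - (1 - p1)(1 - p2)...(1 - pn).
     * Each argument is P(child | that cause alone).
     */
    static double combine(double... causeProbabilities) {
        double allCausesFail = 1.0;
        for (double p : causeProbabilities) {
            if (p < 0.0 || p > 1.0) {
                throw new IllegalArgumentException("probability out of range: " + p);
            }
            // The child stays false only if this cause also fails to trigger it.
            allCausesFail *= (1.0 - p);
        }
        return 1.0 - allCausesFail;
    }
}
```

With two causes x = 0.3 and y = 0.5, combine(0.3, 0.5) evaluates 1 - (0.7)(0.5), which agrees with x + y - xy = 0.65. Only one number per parent is needed rather than a full CPT row for every combination of parents.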
Some prior probabilities were obtained based on statistics:

Congestive heart failure
There are about 4,700,000 Americans (2,300,000 males and 2,400,000 females) with CHF [23]. According to the U.S. Bureau of the Census, the resident population of the United States in May 2001 was about 284,130,000 [24]. This gives the probability of a person having CHF in the U.S. to be 0.84%, i.e.,

\[ P(\mathrm{CHF}) = 0.0084, \quad P(\neg\mathrm{CHF}) = 0.9916 \tag{B.1} \]

Hypertension
According to the same source [24], 1 out of 5 Americans has high blood pressure. This makes

\[ P(\mathrm{HTN}) = 0.20, \quad P(\neg\mathrm{HTN}) = 0.80 \tag{B.2} \]

Moreover, 75% of all CHF cases have antecedent hypertension. This makes

\[ P(\mathrm{HTN} \mid \mathrm{CHF}) = 0.75, \quad P(\neg\mathrm{HTN} \mid \mathrm{CHF}) = 0.25 \tag{B.3} \]

Angina Pectoris
About 6,400,000 Americans have chest pain [23]. This makes

\[ P(\mathrm{AP}) = 0.026, \quad P(\neg\mathrm{AP}) = 0.974 \tag{B.4} \]

Dyspnea
Breathing difficulty in cardiac patients results from inadequate oxygen delivery to the tissues; pulmonary dyspnea results from conditions associated with a heightened respiratory drive, altered pulmonary mechanisms, or gas exchange abnormalities. Approximately 14 million Americans suffer from chronic obstructive pulmonary disease (COPD). Another 10 million citizens (approximately five percent of the population) have asthma [37]. Assuming a conservative figure by counting only the COPD and asthma cases, the probability of dyspnea in the U.S. is

\[ P(\mathrm{Dyspnea}) = 0.096, \quad P(\neg\mathrm{Dyspnea}) = 0.904 \tag{B.5} \]

Coronary artery disease
About 7,300,000 Americans are suffering from myocardial infarction (heart attacks), and another 6,400,000 have angina pectoris that is due to CAD [23]. This makes

\[ P(\mathrm{CAD}) = 0.055, \quad P(\neg\mathrm{CAD}) = 0.945 \tag{B.6} \]

Smoking
About 25,870,000 men and 22,830,000 women over the age of 18 in the U.S. are smokers. An estimated 4,100,000 adolescents ages 12-17 are smokers [23].
This makes

\[ P(\text{Smoker}) = 0.247, \quad P(\text{Non-smoker}) = 0.753 \tag{B.7} \]

High cholesterol
It is estimated that 40,600,000 Americans have total blood cholesterol levels of 240 milligrams per deciliter (mg/dL), which is considered to be the at-risk level [23]. This figure makes

\[ P(\text{HighCholesterol}) = 0.162, \quad P(\text{NormalCholesterol}) = 0.838 \tag{B.8} \]

B.1 An Example: Conditional Probability Table

For a simple network that consists of two initial nodes, CHF and HTN (Fig. B-1), we have CHF ∈ {yes, no} and HTN ∈ {yes, no}. Eq. A.1, Eq. A.2, and Bayes' rule in Eq. A.4 were used. For brevity, the resulting probabilities are summarized in the following CPT:

Figure B-1: Two-Node Bayesian Network

Table B.1: Discrete probabilities for CHF-HTN Variables

    Prior probabilities
    P(CHF)     0.0084
    P(¬CHF)    0.9916
    P(HTN)     0.2
    P(¬HTN)    0.8

    P(HTN | CHF)    CHF     ¬CHF
    P(HTN)          0.75    0.1953
    P(¬HTN)         0.25    0.8047

The ¬CHF column follows from the marginalization rule: P(HTN | ¬CHF) = [P(HTN) - P(HTN | CHF) P(CHF)] / P(¬CHF) = (0.2 - 0.75 × 0.0084) / 0.9916 ≈ 0.1953.

B.2 Network Topology

A Bayesian reasoning network consists of nodes and links, which indicate what CPTs must be determined. In constructing a network, nodes are usually added in the order of causality: the "root" causes are added first, then the variables they influence. In general, to maintain the properties of a DAG and conditional independence, the procedure for incremental network construction is as described in [33]:

1. Choose all relevant variables (X_i) that describe the probability domain.

2. Choose the ordering of variables.

3. Add variables to the network according to the ordering, and
   (a) set the parent nodes (Pa(X_i)) to a minimal set of nodes in the net such that the conditional independence assumption holds;
   (b) define the CPT for X_i.

Appendix C

Database

C.1 Output: Database of Questions

We have defined each health question to be a function of both the user's current location and the time (hour 1 to 24) at which the question can be asked. Questions are stored in a tab-delimited file in the following manner, as shown in Fig. C-1.

Figure C-1: Question Format

For example, some questions are

    EQ_HighChol    kitchen  12  22  Have you made yourself any greasy food dishes since this morning?
    EQ_SkippedMed  window   8   24  Did you forget to take your scheduled medication today?
    ... etc.

For example, the EQ_HighChol-specific health question "Have you had any greasy food since this morning?" might be asked if the user is in the kitchen at noon. The querying time frame for this question can be anywhere between 12PM and 10PM (hour 22). The question associates food with the context of cooking in the kitchen, as might happen at 12 noon for some people, and high cholesterol intake with the context of eating or making high-caloric food.

C.2 Input: Database of Contextual and Biometric Data

The data were saved in a comma-delimited ASCII format, in the same way as the questions were saved. The following example depicts the storage format.

    Mr. Bad Habit,Day,20
    Hour,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
    B_WeightGain,158,158,157,158,157,158,157,157,157,157,157,158,158,158,157,158,158,158,158,158,158,158,158,158
    B_Hypertension,124,124,122,122,122,122,121,121,120,120,119,118,118,118,118,118,118,115,115,115,114,113,112,111
    Q_SkippedMed,none,none,none,none,none,none,none,none,no,none,none,none,no,none,none,none,yes,none,none,none,none,none,none,none
    Q_Orthopnea,none,none,none,none,none,none,no,none,no,none,none,none,none,none,none,none,none,none,yes,none,none,yes,none,none
    Q_Dyspnea,none,none,none,none,none,none,none,none,none,yes,none,none,none,none,yes,none,none,none,none,none,yes,none,none,none
    Q_Angina,none,none,none,none,none,none,none,yes,none,none,none,none,none,none,none,none,yes,none,none,none,none,none,none,none
    Q_Edema,none,none,none,none,none,none,none,none,none,none,none,yes,none,none,none,none,none,none,yes,none,none,none,none,none
    Q_Drinking,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none
    Q_Smoking,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none,none
    Q_HighChol,none,none,none,none,none,none,none,none,none,none,none,none,none,yes,none,none,none,none,none,none,none,none,none,none

This example depicts the input history and biometric data obtained for Mr. Bad Habits on the 20th day. In this database, we employed the same format to store 21 days' worth of data, e.g., the data for the 20th day sit between the data for the 19th and 21st days. After the name row is the hour row. There are 24 hours per day, numbered from 1 to 24. The system performs its analysis every hour on the hour. Biometric variables and their data are shown on top; question variables and their data follow.

Appendix D

Software Implementation of a Bayesian Network

D.1 Tools and Methods

The CHF reasoning network is implemented, in Java, within a simulation program that models the functionality of an early warning system. This thesis employed existing software packages for Bayesian inference computation that are either commercially available or in the public domain:

1. We constructed and test-validated a Bayesian network structure for CHF using the sensitivity-to-findings feature in a commercial software package, Netica 1.12 for Windows 95 and NT 4.0, by Norsys Software Corp. Netica is easy to use and provides graphical visualization of priors for variables, as seen in Fig. 4-1.

2. We then used the open-source Java-based application program interface (API) for JavaBayes to create a similar network.
In addition to producing inferences, expectations, and explanations for a Bayesian network, JavaBayes allows for creating and subsequently modifying any arbitrary network graphically, and for saving the result into a Java file, which can then be compiled into an input class file for loading into another Bayesian inference engine. Sec. D.2 gives an example of how the network is described in a Java file. Fig. D-1 depicts a graphical view of the network in a modified JavaBayes application environment.

Figure D-1: Bayesian Reasoning Network for CHF Using JavaBayes Package

Written entirely in the Java programming language, the system was built atop a Java-based inference engine called EBayes (Embedded Bayesian Networks), which is essentially a pared-down version of JavaBayes meant to be deployed in smaller embedded systems. With a view to creating a decentralized and more distributed computing environment in the future, the EBayes software package is a tool worth considering. Lastly, we wrote server and client programs to enable remote communications between a PC server and a personal digital assistant (PDA; in this case, a Palm Pilot Vx equipped with an OmniSky wireless modem). This client-server setup enabled the user to interact directly with the network in an early warning system through the PDA.

D.2 CHF Network Representation in Java

One can construct an arbitrary network using JavaBayes and save it in the corresponding Java file, which can be compiled into a class file readable by EBayes.
Information about a Bayesian reasoning network can be read by EBayes directly from a Java class file, which essentially contains the positions of nodes with associated priors, and links with conditional probability tables. Once compiled, the network can be automatically loaded into the EBayes engine, which runs on all major operating systems that support a Java Virtual Machine. Information about parents and children in a network is defined in discrete probability functions such as

    DiscreteFunction p6 = new DiscreteFunction(
        new DiscreteVariable[] {Q_Angina},
        new DiscreteVariable[] {Q_CAD},
        new double[] {0.99, 0.7, 0.01, 0.3});

In this case, Q_Angina is the sixth node in the ordering of variables and has Q_CAD as its parent. Conditional probabilities for Q_Angina, when conditioned upon Q_CAD, are listed last. The CPT contains 2 x 2 = 4 values, representing binary states in both variables. In terms of CPT, we list the conditional probabilities for the CAD-Angina link in Table D.1.

Table D.1: Conditional Probability Table for CAD-Angina Link

    Q_CAD    Q_Angina    Conditional Probability
    True     True        0.99
    True     False       0.01
    False    True        0.7
    False    False       0.3

Individually speaking, a variable is stored as follows:

    DiscreteVariable CHF = new DiscreteVariable("CHF", DiscreteVariable.CHANCE, new String[] {"False", "True"});

which connotes that the variable CHF has both True and False states, which are random or by "chance".

Appendix E

CHF Diagram
Figure E-1: Dataflow Diagram of Critical Variables for CHF

Appendix F

Glossary of Medical Terms

F.1 Abbreviations

AI     Artificial intelligence
CAD    Coronary artery disease
CHF    Congestive heart failure
HTN    Hypertension
MI     Myocardial infarction

F.2 Glossary

Chronic
    Long term, not sudden
Coronary heart disease
    Also known as coronary artery disease (CAD). This category includes acute myocardial infarction; other acute and subacute forms of ischemic (coronary) heart disease; old myocardial infarction; angina pectoris; and other forms of ischemic heart disease [23]
Orthopnea
    An abnormal condition in which a person must keep the head elevated (sit or stand) to breathe deeply or comfortably
Dyspnea
    Term usually applied to sensations experienced by individuals who complain of unpleasant or uncomfortable respiratory sensations. Difficult, labored, uncomfortable breathing
Lower leg edema
    Abnormal interstitial fluid accumulation in lower legs
Hypertension
    A condition of persistently high arterial blood pressure. In adults, this is identified as a blood pressure reading exceeding 140/90 mm Hg. The condition may be associated with other disorders, known as secondary hypertension, or may have no single identifiable cause (essential or primary hypertension).
    Individuals affected with hypertension are at risk for heart disease, stroke, and kidney failure
Medication Compliance
    Willingness to follow a prescribed course of treatment or adherence to a drug regimen, as in taking medications correctly and on time
Morbidity
    Measures of both incidence and prevalence rates; measures of various effects of disease on a population
Myocardial infarction
    Heart attack
Pathology
    Study of disease processes
Prevalence
    The total number of cases of a disease existing in a population at any given time
Prognosis
    Forecast of the outcome of a disease or therapy
Nocturia
    Excessive urination at night
Angina
    A sharp thoracic pain accompanied by a feeling of suffocation, due to a lack of oxygen in the myocardium. This is typically brought on by exertion, stress, or excitement. Symptoms include choking or constricting pain, especially in the case of angina pectoris

Bibliography

[1] Kathleen MacNaughton. Website observation: http://www.healthcare-informatics.com/issues/1997/02-97/hathleen.htm. Date visited: January, 2001
[2] M. Wolz, et al. "Statement from the National High Blood Pressure Education Program: Prevalence of hypertension." American Journal of Hypertension, 2000 13:103
[3] T. Pickering. "Recommendations for the use of home (self) and ambulatory blood pressure monitoring." American Journal of Hypertension, 1996 9:1-11
[4] P. Cantillon, et al. "Patient's perceptions of changes in their blood pressure." Journal of Human Hypertension, 1997 11:221
[5] I. Enstrom, et al. "Difference in blood pressure, but not in heart rate, between measurements performed at a health center and at a hospital by one and the same physician." Journal of Human Hypertension, 2000 14:355
[6] E. O'Brien, et al. "Use and interpretation of ambulatory blood pressure monitoring: recommendations of the British Hypertension Society." British Medical Journal, 2000 320:1128
[7] J. S. Breese and D. Heckerman. "Decision-theoretic troubleshooting: a framework for repair and experiment." Technical Report MSR-TR-96-06, Microsoft Research, Advanced Technology Division, Microsoft Corporation, Redmond, Washington
[8] D. Heckerman. "A Tutorial on learning with Bayesian networks." Technical Report MSR-TR-95-06, Microsoft Research, Advanced Technology Division, Microsoft Corporation, Redmond, USA
[9] M. Genesereth. "The use of design descriptions in automated diagnosis." Artificial Intelligence, 1984 24:311-319
[10] W. Long. "Temporal reasoning for diagnosis in a causal probabilistic knowledge base." Artificial Intelligence in Medicine (1996) 8:193-215
[11] G. O. Barnett, J. J. Cimino, J. A. Hupp, et al. "An evolving diagnostic decision-support system." Journal of the American Medical Association (1987) 285:67-74
[12] M. B. First and R. A. Miller. "QUICK (QUick Index to Caduceus Knowledge): Using the Internist-1/Caduceus Knowledge Base as an electronic textbook of medicine." Computers and Biomedical Research (1985) 18:137-165
[13] R. A. Miller, H. E. Pople, and J. E. Myers. "INTERNIST-1: An experimental computer-based diagnostic consultant for general internal medicine." New England Journal of Medicine (1982) 307:468-476
[14] T. A. Russ. "Ventricular Arrhythmia management: a knowledge-based approach." Master's thesis, Massachusetts Institute of Technology, May 1993
[15] D. Heckerman and M. P. Wellman. "Bayesian networks." Communications of the ACM (1995) 38(3):27-30
[16] E. Charniak. "Bayesian networks without tears." AI Magazine (1991) 50-63
[17] S. Andreassen, M. Woldbye, et al. "MUNIN - A causal probabilistic network for interpretation of electromyographic findings." Intl. Joint Conf. on Artificial Intelligence (1987) 366-372
[18] C. Kahn, L. M. Roberts, et al. "Preliminary investigation of a Bayesian network for mammographic diagnosis of breast cancer." Am. Med. Informatics Assoc. Conf. (1995) 208-212
[19] D. E. Heckerman. "An empirical comparison of three inference methods." Conf. on Uncertainty in Artificial Intelligence (1988) 158-169
[20] B. N. Nathwani, D. E. Heckerman, et al. "Integrated expert systems and videodisc in surgical pathology: an overview." Hum. Pathol. (1990) 21:11-27
[21] C. M. Ashton, D. H. Kuykendall, et al. "A method of developing and weighting explicit process of care criteria for quality assessment." Medical Care (1994) 8:755-770
[22] M. Bondmass, N. Bolger, et al. "The effect of physiologic home monitoring and telemanagement on chronic heart failure outcomes." The Internet Journal of Advanced Nursing Practice (2000) 3(2): March 27, 2000. Website observation: http://www.ispub.com/journals/IJANP/Vol3N2/chf.htm. Date visited: May, 2001
[23] American Heart Association. "2001 Heart and Stroke Statistical Update." American Heart Association, Dallas, TX 2001. Website observation: http://www.americanheart.org/statistics/. Date visited: June, 2001
[24] "U.S. POPClock Projection." The U.S. Bureau of the Census. Website observation: http://www.census.gov/cgi-bin/popclock. Date visited: May, 2001
[25] National Heart, Lung, and Blood Institute. "Congestive heart failure in the United States: A new epidemic." National Institutes of Health, Public Health Service, Data Fact Sheet. Bethesda, MD: U.S. Department of Health and Human Services, 1996
[26] W. J. Long, S. Naimi, M. G. Criscitiello. "Evaluation of a new method for cardiovascular reasoning." JAMA (1994) 1:127-141
[27] S. J. Press and K. Shigenmasu. "Bayesian factor analysis with priors on factor scores, factor loadings, and the error disturbance covariance matrix." In Contributions to Probability and Statistics: Essays in Honor of Ingram Olkin, L. J. Glesser, et al., editors, Chapter 15. Springer Verlag (1989) New York
[28] S. E. Lee and S. J. Press. "Robustness of Bayesian Factor Analysis Estimates." Communications in Statistics - Theory and Methods, 1998 (27) 8
[29] D. B. Rowe. "Correlated Bayesian Factor Analysis." Ph.D. Thesis, University of California, Riverside, CA 92521, 1998
[30] D. Rubin and D. Thayer. "EM Algorithms for ML Factor Analysis." Psychometrika (1982) 47, 1:69-76
[31] T. W. Anderson and D. Rubin. "Statistical Inference in Factor Analysis." In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, edited by Jerzy Neyman (1956) 111-150
[32] W. J. Long, H. F., and S. Naimi. "Reasoning requirements for diagnosis of heart disease." Artificial Intelligence in Medicine (1997) 10:5-24
[33] S. Russell and P. Norvig. "Artificial Intelligence: A Modern Approach." Prentice Hall, New Jersey, 1995.
[34] D. Heckerman. "Probabilistic Similarity Networks." (1991) MIT Press, Cambridge, Massachusetts.
[35] G. Vouros. "Representing, Adapting and Reasoning with Uncertain, Imprecise, and Vague Information." Expert Systems with Applications (2000) 19:167-192
[36] K. Dracup, D. W. Baker, S. B. Dunbar, et al. "Management of Heart Failure II: Counseling, Education, and Lifestyle Modifications." JAMA (1994) 272:1442-1446
[37] American Thoracic Society. "Dyspnea: Mechanisms, Assessment, and Management: A Consensus Statement." July, 1998. Website observation: http://www.thoracic.org/statements/. Date visited: June, 2001.
[38] U. H. Patel and C. F. Babbs. "A Computer-Based, Automated, Telephonic System to Monitor Patient Progress in the Home Setting." Journal of Medical Systems (1992) 16(312):101-112
[39] M. Katz, P. J. Gill, and R. B. Newman. "Detection of Preterm Labor by Ambulatory Monitoring of Uterine Activity: A Preliminary Report." Obstet. Gynecol. (1986) 68(6):773-778
[40] K. D. Chadda, B. A. Harrington, et al. "The Impact of Transtelephonic Documentation of Arrhythmia on Morbidity and Mortality Rate in Sudden Death Survivors." Am. Heart J. (1986) 112:1159-1165
[41] R. J. Capone, D. Stablein, et al. "The Effects of a Transtelephonic Surveillance and Prehospital Emergency Intervention System on the 1-Year Course Following Acute Myocardial Infarction." Am. Heart J. (1988) 116(6):1606-1615
[42] J. L. Fleg, P. C. Hinton, et al. "Physician Utilization of Laboratory Procedures to Monitor Outpatients with Congestive Heart Failure." Arch. Intern. Med. (1989) 149:393-396
[43] M. W. Rich and K. E. Freedland. "Effect of DRG's on Three-Month Readmission Rate of Geriatric Patients with Congestive Heart Failure." Am. J. Publ. Health (1988) 78(6):680-682
[44] K. J. Isselbacher, E. Braunwald, et al., eds. "Harrison's Principles of Internal Medicine." McGraw-Hill: New York, 1994
[45] P. J. Pearce. "The One Thing Everybody Wants." Speech for Healthy Alternatives 2000: Nova Wellness Institute Jan. 10, 2000. New Spirit Naturals, Inc. Annual Convention. Monte Carlo Hotel, Las Vegas, NV.
[46] M. Moon. "Growth in Medicare Spending: What Will Beneficiaries Pay?" The Commonwealth Fund, May 1999
[47] K. Davis and S. Raetzman. "Meeting Future Health and Long-Term Care Needs of an Aging Population." The Commonwealth Fund, December 1999.
[48] Georgetown University Institute for Health Care Research and Policy Analysis. "National Health Expenditures Data From Health Care Financing Administration." Website observation: http://www.hcfa.gov/stats/nhe-oact/tables/ti7.htm. Date visited: February 26, 1999
[49] "Economic Report of the President." Transmitted to the Congress: January 2001. Washington, D.C.
[50] Protocol Systems, Inc. "FlexNet and Modem Propaq Patient Monitoring System." Website observation: http://www.protocol.com. Date visited: April, 2001
[51] Agilent Technologies: Healthcare Solutions Group. "Interactive Healthcare Services for CHF." Website observation: http://www.agilent.com/healthcare/ihs. Date visited: April, 2001
[52] Health Hero Network. "Health Hero iCare Platform Overview and Technical Brief: White Paper, 2000." Website observation: http://www.healthhero.com/papers/studies/Meta-AnalysisCHF-Outcomes.pdf. Date visited: May, 2001
[53] Health Hero Network. "Health Hero White Papers & Outcomes." Website observation: http://www.healthhero.com/cgi-bin/web30lf9/form-whitepaper.cgi. Date visited: August, 2001
[54] G. C. Fonarow, et al. "Impact of Comprehensive Heart Failure Management Program on Hospital Readmission and Functional Status of Patients with Advanced Heart Failure." J. of Am. College of Cardiology 30:3. September 1997
[55] M. W. Rich. "Heart Failure Disease Management: A Critical Review." J. of Cardiac Failure, 1999 5(1):64-75
[56] T. E. Meyer. "Results reported from Heart Failure Wellness Center, Division of Cardiology, University of Massachusetts Memorial Health Center. Following an 18-month, 29-patient clinical investigation"
[57] R. H. Friedman, J. E. Stollerman, et al. "The Virtual Visit: Using Telecommunications Technology to Take Care of Patients." JAMA, 1997 4(6):413-425
[58] InforMedix, Inc. "Med eMonitor." http://www.informedix.com. Date visited: May, 2001
[59] AvidCare Corporation. "AvidCare Series 1000 Telemonitoring Information Service." http://www.avidcare.com. Date visited: May, 2001
[60] J. Schieszer. "CHF: Computerized Home Monitoring." Internal Medicine World Report, 1996 11(21):9
[61] Mini Mitter, Inc. "Physiological Monitoring Equipment." http://www.minimitter.com. Date visited: May, 2001
[62] M.A.M. Rogers, D. Small, D. A. Buchan, et al. "Home Monitoring Service Improves Mean Arterial Pressure in Patients with Essential Hypertension. A Randomized, Controlled Trial." Annals of Internal Medicine, 2001 1024-1032
[63] I. Dreyfuss. "Exercise May Improve Heart-Protecting Chemicals." Website observation: http://www.tri-cityherald.com/HEALTH/fitness/fit4.html. Date visited: August, 2001
[64] W. Wooten. "Choosing Exercise for Better Health." The Physician and Sportsmedicine (July, 1996) (24):7
[65] B. Lyndon-Griffith. "Exercise parameters for the elderly." Gerontology Manual: School of Occupational Therapy and Physical Therapy, 1996. Website observation: http://otpt.ups.edu/Gerontological-Resources/GerontologyManual/Lyndon-Griffith-B.html. Date visited: July, 2001
[66] S. L. Gray, J. E. Mahoney, and D. K. Blough. "Medication adherence in elderly patients receiving home health services following hospital discharge." The Annals of Pharmacotherapy (May, 2001) 35(5):539-45
[67] "Pills That Aren't Taken Can't Work." Facts of Life: An Issue Briefing for Health Reporters (May, 1998) 2(5). Website observation: http://www.cfah.org/BlueWebsite/website2/fol2-5.htm. Date visited: July, 2001
[68] Harvard Pilgrim Healthcare: Research and Teaching. "Current Research and Training Projects." Website observation: http://www.harvardpilgrim.org/providers/research/hphc-research.cfm. Date visited: August, 2001
[69] B. Starfield, et al. "Primary Care in VA: Primer." Website observation: http://www.va.gov/resdev/prt/pcprim.htm. Date visited: August, 2001
[70] D. M. Angaran. "Telemedicine and telepharmacy: current status and future implications." Am J Health Syst Pharm (1999) 56(14):1405-1426
[71] T. Bodenheimer. "Disease management - promises and pitfalls." N Engl J Med (1999) 340(15):1202-1205
[72] P. Szolovits, et al. "Artificial Intelligence in Medical Diagnosis." Annals of Internal Medicine (1988) 108(1):80-87
[73] W. J. Clancey and R. Letsinger. "Neomycin: reconfiguring a rule-based expert system." Proceedings of the Seventh International Conference on Artificial Intelligence. Los Altos, California (1981) M. Kaufmann Publishers, 829-36
[74] R. Davis. "Expert systems: Where are we? and where do we go from here?" AI Magazine (1982) 3:3-22
[75] "Personal communications (July, 2001): Hung-Tao Chung, M.D." Visitor in Cardiology: Children's Hospital in Boston, Harvard Medical School; Attending Physician, Chong-Gung Children's Hospital, Taipei, Taiwan
[76] "Personal communications (January, 2001): Dan J. Carlin, M.D." CEO, WorldClinic http://www.worldclinic.com