Harvard-MIT Division of Health Sciences and Technology
HST.951J: Medical Decision Support, Fall 2005
Instructors: Professor Lucila Ohno-Machado and Professor Staal Vinterbo

6.873/HST.951 Medical Decision Support, Fall 2005
Biomedical Decision Support
Lucila Ohno-Machado, Staal Vinterbo, Pete Szolovits

Medical decisions
Maximize value:
• Prolong life
• Increase quality of life
• Minimize pain
• Minimize cost
• Match available resources

SMART = Scalable Medical Alert Response Technology
[Figure: SMART system diagram. Labels: HIS; SMART Central (SDM: SpO2, ECG, location; DSM: data analysis; LSM: priorities, directions; pattern recognition); Patient PDA (Cricket listener; sensors: SpO2, ECG; location); Equipment (defibrillator; location via RFID); Caregiver PDA (Cricket listener); Cricket beacons. Figures by MIT OCW.]

Decision Theory
• Game theory
  – Statistics
  – Operations research
• Maximize utility
  – In many domains, this means maximize $$$

Example of a Decision Problem
• College athlete considering knee surgery
• Uncertainties:
  – success in recovering perfect mobility
  – infection in surgery (if so, needs another surgery and may lose more mobility)
  – surviving surgery

Knee Surgery
Surgery
  Death (p = 0.05): value 0
  Survival (p = 0.95)
    Infection (p = 0.05): Surgery II
      Death (p = 0.05): value 0
      Survival (p = 0.95): Wheelchair, value 3
    No infection (p = 0.95)
      Full mobility (p = 0.6): value 10
      Poor mobility (p = 0.4): value 6
No Surgery
  Poor mobility: value 6

Expected Value of Surgery
Same tree as above, with the Surgery branch annotated with its expected value: 7.7 (compared with 6 for No Surgery).

Sensitivity Analysis
• Effect of probabilities in the decision
[Figure: expected values (axis ticks 0, 6, 10) of Surgery and No surgery plotted against P(Death)]

Sensitivity Analysis
• Effect of probabilities in the decision
[Figure: expected values (axis ticks 0, 6, 10) of Surgery and No surgery plotted against P(Full Mobility)]

Knee Surgery
Same tree as above, except that one of the two Poor mobility outcomes (the last one listed on the slide) is valued at 5 instead of 6.

Predictive Models
[Figure: diagram with labels Model Building, System Evaluation, Data, Experiment, Pattern Discovery, Hypothesis]

Objectives
• Build models from existing data
  – Pattern recognition
• Apply model to new data to predict an unknown feature such as:
  – Diagnosis
  – Prognosis (outcome)

[Figures removed due to copyright reasons.]

[Figure: chromosomes, genome, cell, genes, DNA, and proteins. Captions: "Genes contain instructions for making proteins"; "Proteins act alone or in complexes to perform many cellular functions". Figure by MIT OCW.]

What kind of data?
• DNA: Genome
• RNA: Transcriptome
• Protein: Proteome
• Physiology: Physiome
• Clinical Data: Phenome
[Figure: transcription and translation in the cell. Labels: nucleus, cell membrane, gene, DNA, DNA bases, mRNA, ribosome, chain of amino acids, protein. Figure by MIT OCW.]

Coronary Disease 45

ECG Interpretation
[Figure: ECG interpretation. Measurements: QRS amplitude, R-R interval, QRS duration, AVF lead, S-T elevation, P-R interval. Interpretations: SV tachycardia, Ventricular tachycardia, LV hypertrophy, RV hypertrophy, Myocardial infarction.]

Risk Score of Death from Angioplasty
Unadjusted Overall Mortality Rate = 2.1%
[Figure: Number of Cases (left axis, 0 to 3000) and Mortality Risk (right axis, 0% to 60%) by Risk Score Category (0 to 2, 3 to 4, 5 to 6, 7 to 8, 9 to 10, >10). Data labels as extracted, in slide order; their pairing with categories is not recoverable: 53.6%, 62%, 21.5%, 26%, 12.4%, 7.6%, 0.4%, 1.4%, 2.2%, 2.9%, 1.6%, 1.3%.]

Predicting Individual Outcome in Coronary Intervention
Odds ratio and p-value are from the Logistic Regression Model; beta coefficient and risk value are from the Prognostic Risk Score Model.

Variable               Odds Ratio  p-value  beta coefficient  Risk Value
Age > 74yrs            2.51        0.02      0.921             2
B2/C Lesion            2.12        0.05      0.752             1
Acute MI               2.06        0.13      0.724             1
Class 3/4 CHF          8.41        0.00      2.129             4
Left main PCI          5.93        0.03      1.779             3
IIb/IIIa Use           0.57        0.20     -0.554            -1
Stent Use              0.53        0.12     -0.626            -1
Cardiogenic Shock      7.53        0.00      2.019             4
Unstable Angina        1.70        0.17      0.531             1
Tachycardic            2.78        0.04      1.022             2
Chronic Renal Insuf.   2.58        0.06      0.948             2

Informed consent
"Informed consent and good clinical practice require a discussion of these risks and benefits, but there is very little data on the degree to which patients comprehend the specifics of this information,"
The researchers found that, of the patients who received angioplasty, 42 percent could not identify any risks, and 41 percent could not identify any benefits. For the surgery patients, 45 percent could not identify any risks and 22 percent could not identify any benefits. Furthermore, when asked to quantify the risks of the procedure, 78 percent of the angioplasty patients and 57 percent of the surgery patients could not.
Alexander et al., 52nd ACC meeting

Overview of this Course
Individualized prediction for decision support in medical/biological problems
• Theory -- how it works
• Practicality -- when to apply
• Implementation -- how to apply

Pre-Requisites
6.034 -- Intro to AI (Machine Learning)
Basic statistics, including linear regression
If needed, we will consider optional refresher recitations:
• basic linear algebra (mostly notation)
• basic statistical tests
• set theory

Course Structure
• Homeworks, individual (30%)
• Midterm (30%)
• Final Project (40%)
• Presentation and write-up
  – 5 pages plus references, figures, tables on the web
• No final exam
Slides available online. Office hours by arrangement.
Password protection for posting articles: Username and password.
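
Worked example: the expected value of 7.7 shown for the Surgery branch of the knee surgery tree can be reproduced by rolling the tree back from its leaves, taking the probability-weighted average at each chance node. The sketch below is only an illustration: the function names are made up for this example, and it assumes that the same P(Death) applies to both the first and the second surgery (both are 0.05 on the slide), which matters when P(Death) is varied for a sensitivity analysis.

```python
# Leaves are numbers (outcome values); a chance node is a list of
# (probability, subtree) branches. Names here are illustrative only.

def expected_value(node):
    """Roll back a tree: probability-weighted average at each chance node."""
    if isinstance(node, (int, float)):
        return float(node)
    return sum(p * expected_value(child) for p, child in node)


def surgery_tree(p_death=0.05, p_infection=0.05, p_full=0.6):
    """Build the Surgery branch using the probabilities and values on the slide."""
    # Second surgery after infection: death (0) or wheelchair (3).
    second_surgery = [(p_death, 0), (1 - p_death, 3)]
    # No infection: full mobility (10) or poor mobility (6).
    no_infection = [(p_full, 10), (1 - p_full, 6)]
    survived = [(p_infection, second_surgery), (1 - p_infection, no_infection)]
    return [(p_death, 0), (1 - p_death, survived)]


NO_SURGERY_VALUE = 6  # poor mobility, from the first version of the tree

ev_surgery = expected_value(surgery_tree())
print(round(ev_surgery, 1))  # 7.7

# One-way sensitivity analysis: vary P(Death), hold everything else fixed.
for p in (0.0, 0.05, 0.1, 0.2, 0.3):
    ev = expected_value(surgery_tree(p_death=p))
    choice = "Surgery" if ev > NO_SURGERY_VALUE else "No surgery"
    print(f"P(Death)={p:.2f}  EV(Surgery)={ev:.2f}  prefer: {choice}")
```

Passing `p_full` instead of `p_death` in the loop gives the corresponding sweep over P(Full Mobility).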

Intro to Decision Theory and Decision Analysis
– Optimal classification performance of a model
– Cost functions
– Individualized decisions
• Confidence in predictions
• Decision trees
Source: DOE

Simple Models
– Artificial Intelligence
  • Nearest neighbors
  • Association rules
  • Learning from experts
– Statistics
  • Linear regression
  • Linear discriminant analysis

Analysis of Failure Times
• Survival analysis
• Cox model
• Assumptions required for models
• Alternatives

Supervised Methods I
• Logistic Regression
  – interpretation of coefficients
  – limitations
• Classification Trees
  – splitting functions
  – pruning
  – forests
[Figure: example classification tree with splits on asymmetry, color, border, and detail, and leaves labeled "malig" and "benign"]

Supervised Methods II
• Neural networks
  – Regularization
  – Mixture of experts
• Support Vector Machines
  – VC dimension
  – Soft margins
[Figure: small neural network in which weighted inputs (Weights) feed summation units (Σ)]

Supervised Methods III
• Rule-based approaches
  – Rough sets
  – Fuzzy sets
[Figures removed due to copyright reasons.]

Unsupervised Learning
• Clustering
  – Agglomerative/divisive
  – Hierarchical/nonhierarchical
  – K-means, k-medoids
  – Multidimensional scaling
  – Visualization

Dimensionality Reduction
• Pre-processing
  – Discretization algorithms
  – Filtering, cleaning
• Compression
  – Principal components analysis
  – Partial least squares
• Variable/Model Selection
  – Multivariate strategies
  – Interpretation

Stochastic Search
• Approximate solution strategies
  – Greedy
  – Annealing
  – Genetic algorithms
  – Ant colony optimization
  – Other evolutionary approaches

Evaluation
• How good is the prediction?
• Strategies for evaluation when number of cases is small
  – Cross-validation
  – Jackknife
  – Bootstrap
  – Calibration
  – Discrimination
  – Bias and variance
[Figure: ROC curves (Sensitivity vs. 1 - Specificity, both axes 0.00 to 1.00) for LR, Score, and aNN]

Improving Performance
Combining Models/Ensembles
• Boosting
• Bagging
• Stacking

Bioinformatics
• Phylogenetic trees
• Haplotype tagging (SNP patterns)
[Figure: example sequences ACTGCAT, ACGCTAG, ACTCCAA, GTTGCAA, CCTGCTT. Figures by MIT OCW.]

Suggested General Books
• Duda R, Hart P, Stork D. Pattern Classification. Wiley Interscience ($103)
• Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Springer ($67)
Duda, Richard O., Peter E. Hart, and David G. Stork. Pattern Classification. 2nd ed. New York, NY: Wiley, 2001. ISBN: 0471056693.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY: Springer, 2001. ISBN: 0387952845.

Decision Analysis Module
• Chernoff and Moses. Elementary Decision Theory. Dover ($12)
• Hunink et al. Decision Making in Health and Medicine: Integrating Evidence and Values. ($65)
Chernoff, Herman, and Lincoln E. Moses. Elementary Decision Theory. New York, NY: Dover Publications, 1986, c1959. ISBN: 0486652181.
Hunink, M.G. Myriam, et al. Decision Making in Health and Medicine: Integrating Evidence and Values. Cambridge, UK: Cambridge University Press, 2001. ISBN: 0521770297.
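
Worked example: the Evaluation slide lists discrimination and shows ROC curves of sensitivity against 1 - specificity. A minimal sketch of both quantities follows. The scores and labels are made-up toy values, not data from the course, and the function names are hypothetical. The area under the ROC curve is computed here as the fraction of (positive, negative) case pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def roc_point(scores, labels, threshold):
    """Return (sensitivity, 1 - specificity) when score >= threshold means 'positive'."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    return tp / n_pos, fp / n_neg


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy predicted risks and observed outcomes (1 = event), for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

print(roc_point(toy_scores, toy_labels, threshold=0.5))
print(auc(toy_scores, toy_labels))
```

Sweeping `threshold` over the observed scores traces out the points of the ROC curve. An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.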