New Challenges for Modelers of Infectious Diseases of Africa
Fred Roberts, DIMACS

Mathematical modeling of the spread of infectious disease has a long history.
• Bernoulli’s 1760 modeling of smallpox.

Endemic and emerging diseases of Africa provide new and complex challenges for mathematical modeling.
• HIV/AIDS
• Malaria
• Tuberculosis

Major new health threats such as avian influenza present especially complex challenges to modelers in the context of developing countries.

This workshop is aimed at:
• Studying challenges for mathematical models arising from the diseases of Africa.
• Understanding special challenges from diseases in resource-poor countries.
• Bringing together U.S. and African researchers and students to collaborate in solving these problems.
• Laying the groundwork for future collaborations to address problems of public health and disease in Africa.

What are the challenges for mathematical scientists in the defense against disease? This question led DIMACS, the Center for Discrete Mathematics and Theoretical Computer Science, based at Rutgers University, to launch a “special focus” on this topic in Spring 2002.

DIMACS Special Focus on Computational and Mathematical Epidemiology
Special Focus:
• Workshops
• Tutorials
• Research working groups
• Visitor exchanges

One workshop was instrumental in leading to the present one:
“Evolutionary Aspects of Vaccine Use”
DIMACS, June 2005
Organizers: Troy Day, Alison Galvani, Abba Gumel, Claudio Struchiner

The special problems of vaccination strategies in Africa that arose in this workshop were one of the primary motivations for Abba Gumel to propose that DIMACS sponsor a workshop on mathematical modeling of infectious diseases of Africa.

Snowbird Conference
• “Modeling the Dynamics of Human Diseases: Emerging Paradigms and Challenges”
• July 2005, Snowbird, Utah
• Organizers: Carlos Castillo-Chavez, Dominic Clemence, Abba Gumel, Travette Jackson, Ronald Mickens
• One notable feature of the conference: recognition of the central role of developing nations in the emergence of novel pathogens.
• This led meeting participant Simon Levin to suggest that we pursue ways to more directly engage US and African researchers and students.

The Role of Mathematical Modeling
Hundreds of mathematical models since Bernoulli’s work on smallpox have:
• Highlighted concepts like core population in STDs;
• Made explicit concepts such as herd immunity for vaccination policies;
• Led to insights about drug resistance, rate of spread of infection, epidemic trends, and effects of different kinds of treatments.

• In recent years, modeling has had an increasing influence on the theory and practice of disease management and control.
• Modeling has played an important role in shaping public health policy decisions in a number of countries.
  – Gonorrhea, HIV/AIDS, bovine spongiform encephalopathy (BSE), foot-and-mouth disease (FMD), measles, rubella, pertussis (UK, US, Netherlands, Canada)
• Modeling has provided insights leading to “optimal” treatment strategies.
  – Immuno-pathogenesis of HIV/AIDS and use of highly active anti-retroviral therapy
• Modeling has played a role in shaping vaccine design and determining threshold coverage levels for vaccine-preventable diseases:
  – measles, rubella, polio

During the SARS outbreaks in 2003, modelers and public health officials worked hand-in-hand to devise effective control strategies in a number of countries. Earlier, modeling efforts were similarly important in controlling FMD.

The size and overwhelming complexity of modern epidemiological problems call for new approaches.
New methods are needed for dealing with:
• dynamics of multiple interacting strains of viruses through construction and simulation of dynamic models;
• spatial spread of disease through pattern analysis and simulation;
• early detection of emerging diseases or bioterrorist acts through rapidly-responding surveillance systems.

• To maximize the benefit from mathematical models, we need to:
  – specialize them
  – test assumptions in specific contexts and populations
  – gather local data to help define key parameters
• That is one of the motivations for this workshop and the plans we have for follow-ups.

• If scientists from Africa and outside Africa collaborate:
  – Vitally needed access to data can be provided
  – Data can be interpreted with the help of individuals knowledgeable about local conditions
  – Better and more realistic models can be developed.
• It is important for non-African researchers to:
  – Understand effects of government policies in Africa
  – Learn of modeling efforts in Africa
  – Find key contacts knowledgeable about both endemic diseases and deadly emerging diseases

Themes of our Meeting: Current State of Infectious Diseases in Africa
• Current state of different diseases.
• Epidemiological data.
• Recent control initiatives: failures and successes

Themes of our Meeting: Mathematical Modeling of Diseases that Inflict a Significant Burden on Africa
• HIV/AIDS
• TB
• Malaria
• Diseases of Animals
[Photo: AIDS orphans, Zambia]

• HIV/AIDS
  – Modeling/evaluation of preventive and therapeutic strategies
  – Allocation of anti-retroviral drugs
  – Evolution and transmission of drug-resistant strains
  – Interaction with other infections: TB, malaria
• Malaria
  – New methods of control (e.g., insecticide-treated cattle)
  – Climate and disease (e.g., global warming and effect on mosquito populations)
• Diseases of Animals
  – Bovine tuberculosis (in domestic and wild populations)
  – Avian influenza
  – Trypanosomiasis

Themes of our Meeting: Modeling Issues from Threat of Emerging Diseases in Resource-poor Countries
• Special issues arising from:
  – Slow communication
  – Short supplies of vaccines and prophylactics
  – Difficulty of imposing quarantines
• Special emphasis on problems arising from avian or pandemic influenza

Themes of our Meeting: Optimization of Scarce Public Health Resources
• How to handle shortages of drugs and vaccines, physical facilities, and trained personnel.
• Mathematical methods to:
  – Allocate medicines to optimize impact
  – Assign trained personnel to most critical jobs
  – Design efficient transportation plans.
  – Design efficient dispensing plans.
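
As a minimal sketch of the "assign trained personnel to most critical jobs" item above, the following poses it as a small assignment problem and solves it by exhaustive search. The benefit matrix and the function name `best_assignment` are hypothetical, chosen only for illustration, and exhaustive search is practical only for a handful of workers.

```python
from itertools import permutations


def best_assignment(benefit):
    """Return (total, assignment) maximizing total benefit.

    benefit[w][j] is the (hypothetical) benefit of placing worker w in job j.
    Exhaustive search: exact, but only practical for very small instances.
    """
    n = len(benefit)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):  # perm[w] = job given to worker w
        total = sum(benefit[w][perm[w]] for w in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Illustrative numbers only: 3 health workers, 3 posts.
benefit = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
total, assignment = best_assignment(benefit)
print(total, assignment)
```

For realistic problem sizes, a polynomial-time method such as the Hungarian algorithm would take the place of the exhaustive loop; the formulation stays the same.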

Themes of our Meeting: Vaccination Strategies
• Explore protocols for vaccination for major diseases in Africa
• Discuss potential for vaccines for HIV, malaria
• Use of computer simulations to allow comparison of vaccination strategies when field trials are prohibitively expensive
• Identify major modeling challenges unique to Africa: e.g., age-structured, health-status-related models

Themes of our Meeting: Next Steps
• Identify future research challenges for African and non-African scientists in collaboration
• Identify training programs for African and non-African students
• Identify future initiatives

Methods of Mathematical Epidemiology
• Many mathematical tools are used in epidemiological modeling.
• Not so widely known: usefulness of newer tools of discrete mathematics and algorithmic methods of theoretical computer science.

Statistical Methods
• Long used in epidemiology.
• Used to evaluate role of chance and confounding associations.
• Used to ferret out sources of systematic error in observations.
• Role of statistical methods is changing due to the increasingly huge data sets involved, calling for new approaches.

Dynamical Systems
• Used for modeling host-pathogen systems, phase transitions when a disease becomes epidemic, etc.
• Use difference and differential equations.
• Need for new methods to apply today’s powerful computational tools to these dynamical systems.

Probabilistic Methods
• Important role of stochastic processes, random walk models, percolation theory, Markov chain Monte Carlo methods.
• Computational methods for simulating stochastic processes in complex spatial environments or on large networks have started to enable us to simulate more and more complex biological interactions.

Discrete Math. and Theoretical Computer Science
• Many fields of science, in particular molecular biology, have made extensive use of DM broadly defined.

Discrete Math.
and Theoretical Computer Science (continued)
• Especially useful have been those tools that make use of the algorithms, models, and concepts of TCS.
• These tools remain largely unused and unknown in epidemiology and even mathematical epidemiology.
• These tools are made especially relevant to epidemiology because of:
  – Geographic Information Systems
  – Availability of large and disparate computerized databases on subjects relating to disease and the relevance of modern methods of data mining.
  – The increasing importance of an evolutionary point of view in epidemiology and the relevance of DM/TCS methods of phylogenetic tree reconstruction.

Challenges for Discrete Math and Theoretical Computer Science

What are DM and TCS?
DM deals with:
• arrangements
• designs
• codes
• patterns
• schedules
• assignments

TCS deals with the theory of computer algorithms. During the first 30-40 years of the computer age, TCS, aided by powerful mathematical methods, especially DM, probability, and logic, had a direct impact on technology, by developing models, data structures, algorithms, and lower bounds that are now at the core of computing.

DM and TCS have found extensive use in many areas of science and public policy, for example in molecular biology. These tools, which seem especially relevant to problems of epidemiology, are not well known to those working on public health problems.

So How are DM/TCS Relevant to the Fight Against Disease?

Detection/Surveillance
Streaming Data Analysis:
• When you only have one shot at the data
• Widely used to detect trends and sound alarms in applications in telecommunications and finance
• AT&T uses this to detect fraudulent use of credit cards or impending billing defaults
• Columbia has developed methods for detecting fraudulent behavior in financial systems
• Uses algorithms based in TCS
• Needs modification to apply to disease detection
• DIMACS/CDC Adverse Events Detection Group

Research Issues:
• Modify methods of data collection, transmission, processing, and visualization
• Explore use of decision trees, vector-space methods, Bayesian and neural nets
• How are the results of monitoring systems best reported and visualized?
• To what extent can they trigger fast and safe automated responses?
• How are relevant queries best expressed, giving the user sufficient power while implicitly restraining him/her from incurring unwanted computational overhead?

Cluster Analysis
• Used to extract patterns from complex data
• Application of traditional clustering algorithms hindered by extreme heterogeneity of the data
• Newer clustering methods based on TCS for clustering heterogeneous data need to be modified for infectious disease applications.

Visualization
• Large data sets are sometimes best understood by visualizing them.
• Sheer data sizes require new visualization regimes, which require suitable external memory data structures to reorganize tabular data to facilitate access, usage, and analysis.
• Visualization algorithms become harder when data arises from various sources and each source contains only partial information.
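
To make the "one shot at the data" setting under Streaming Data Analysis concrete, here is a minimal sketch of a one-pass alarm on a stream of daily case counts. It uses Welford's running mean and variance, so each value is seen once and then discarded. This is an illustrative stand-in, not the method used by any of the groups named above. The class name `StreamingAlarm`, the threshold, the warm-up length, and the sample counts are all assumptions.

```python
import math


class StreamingAlarm:
    """One-pass flagging of unusually high values (Welford's method)."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # assumed: alarm at this many std devs above the mean
        self.warmup = warmup        # assumed: observations required before any alarm
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def observe(self, x):
        """Process one value; return True if it looks anomalously high."""
        alarm = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            alarm = x > self.mean + self.threshold * std
        # Welford update: constant memory, no need to store past values.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return alarm


# Hypothetical daily case counts with one spike.
daily_cases = [4, 5, 3, 6, 4, 5, 4, 30, 5]
detector = StreamingAlarm()
alarms = [day for day, count in enumerate(daily_cases) if detector.observe(count)]
print(alarms)
```

One limitation of this sketch is that a spike, once absorbed into the running statistics, inflates the variance and makes later alarms less likely. A real surveillance system would need to handle this, along with seasonality and reporting delays.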

Data Cleaning
• Disease detection problem: very “dirty” data due to
  – manual entry
  – lack of uniform standards for content and formats
  – data duplication
  – measurement errors
• TCS-based methods of data cleaning
  – duplicate removal
  – “merge purge”
  – automated detection

Dealing with “Natural Language” Reports
• Devise effective methods for translating natural language input into formats suitable for analysis.
• Develop computationally efficient methods to provide automated responses consisting of follow-up questions.
• Develop semi-automatic systems to generate queries based on dynamically changing data.

Social Networks
• Diseases are often spread through social contact.
• Contact information is often key in controlling an epidemic, man-made or otherwise.
• There is a long history of the use of DM tools in the study of social networks: social networks as graphs.

Research Issues
• Dynamically changing networks
• Making use of other information about networks: semantic graphs
• Handling large-scale networks
• Approximations of parameters (such as infectivity, susceptibility, latent period) that are not well specified
• Making use of analogous lines of research such as spread of opinions through social networks.

Evolution
• Models of evolution might shed light on new strains of infectious agents.
• New methods of phylogenetic tree reconstruction owe a significant amount to modern methods of DM/TCS.
• Phylogenetic analysis might help in identification of the source of an infectious agent.

Some Relevant Tools of DM/TCS
• Information-theoretic bounds on tree reconstruction methods.
• Optimal tree refinement methods.
• Disk-covering methods.
• Maximum parsimony heuristics.
• Nearest-neighbor-joining methods.
• Hybrid methods.
• Methods for finding consensus phylogenies.
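
Maximum parsimony heuristics search over candidate trees and must score each one. As a minimal sketch of that scoring step, the following applies Fitch's algorithm to a single site on a fixed rooted binary tree and counts the fewest state changes needed to explain the leaves. The nested-tuple tree encoding, the isolate names, and the site data are hypothetical, introduced only for illustration.

```python
def fitch_score(tree, leaf_states):
    """Minimum number of state changes for one site on a rooted binary tree.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: dict mapping each leaf name to its observed state.
    """
    def visit(node):
        if isinstance(node, str):            # leaf: the state set is the observation
            return {leaf_states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:                           # children can agree: no extra change
            return common, left_cost + right_cost
        # Children disagree: keep both options and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    _, cost = visit(tree)
    return cost


# Hypothetical isolates and one nucleotide site.
site = {"isolate1": "A", "isolate2": "A", "isolate3": "G",
        "isolate4": "G", "isolate5": "A"}
tree_a = ((("isolate1", "isolate2"), ("isolate3", "isolate4")), "isolate5")
tree_b = ((("isolate1", "isolate3"), ("isolate2", "isolate4")), "isolate5")
score_a = fitch_score(tree_a, site)
score_b = fitch_score(tree_b, site)
print(score_a, score_b)
```

Under the parsimony criterion the tree with the lower score is preferred for this site. A full heuristic would sum scores over all sites and explore many tree topologies.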

New Challenges for DM/TCS
• Tailoring phylogenetic methods to describe the idiosyncrasies of viral evolution, going beyond a binary tree with a small number of contemporaneous species appearing as leaves.
• Dealing with trees of thousands of vertices, many of high degree.
• Making use of data about species at internal vertices (e.g., when data comes from serial sampling of patients).
• Network representations of evolutionary history if recombination has taken place.
• Modeling viral evolution by a collection of trees, to recognize the “quasispecies” nature of viruses.
• Devising fast methods to average the quantities of interest over all likely trees.

Decision Making/Policy Analysis
• DM/TCS have a close historical connection with mathematical modeling for decision making and policy making.
• Mathematical models can help us:
  – understand fundamental processes
  – compare alternative policies and interventions
  – provide a guide for scenario development
  – guide risk assessment
  – aid forensic analysis
  – predict future trends

Consensus
• DM/TCS fundamental to theory of group decision making/consensus
• Based on fundamental ideas in theory of “voting” and “social choice”
• Key problem: combine expert judgments (e.g., rankings of alternatives) to make policy
• Prior application to biology (Bioconsensus):
  – Find common pattern in library of molecular sequences
  – Find consensus phylogeny given alternative phylogenies
• Developing algorithmic view in consensus theory: fast algorithms for finding the consensus policy
• Special challenge for epidemiology: instead of many “decision makers” and few “candidates,” there could be few decision makers and many candidates (lots of different parameters to modify)

Decision Science
• Formalizing utilities and costs/benefits.
• Formalizing uncertainty and risk.
• DM/TCS aid in formalizing optimization problems and solving them: maximizing utility, minimizing pain, …
• Bringing in DM-based theory of meaningful statements and meaningful statistics.
• Some of these ideas virtually unknown in public health applications.
• Challenges are primarily to apply existing tools to new applications.

Game Theory
• History of use in military decision making
• Relevant to conflicts: bioterrorism
• DM/TCS especially relevant to multi-person games
• Of use in allocating scarce resources to different players or different components of a comprehensive policy.
• New algorithmic point of view in game theory: finding efficient procedures for computing the winner or the appropriate resource allocation.

Operations Research
• History of use in wide variety of practical applications.
• Issues of fair allocation of limited resources.
• Transportation schedules
• Inventory planning
• Assignment of workers to jobs/locations
• Finding locations for clinics, hospitals, medicine dispensing stations

Combinatorial Group Testing
• Natural or human-induced epidemics might require us to test samples from large populations at once.
• Combinatorial group testing arose from need for mathematical methods to test millions of WWII draftees for syphilis.
• Identify all positive cases in large population by:
  – dividing items into subsets
  – testing if subset has at least one positive item
  – iterating by dividing into smaller groups.

Dialogues on Mathematical Epidemiology: Infectious Diseases of Africa
• Let us use this meeting to:
  – Survey new methods and discuss new approaches.
  – Open up new lines of communication.
  – Lay the groundwork for future collaborations

THANK YOU