The Biomedical Data Driven Discovery (BD3) Training Program at Northwestern University (NU) is a collaborative proposal that brings together Big Data scientists and educators from the Feinberg School of Medicine (FSM), the McCormick School of Engineering and Applied Science (MEAS), the Weinberg College of Arts and Sciences (WCAS), and the School of Communication. The proposal leverages existing data-intensive doctoral programs: the well-established and nationally recognized program in Data Analytics in MEAS, led by Diego Klabjan, and the innovative and growing programs in Health and Biomedical Informatics (HBMI), led by Justin Starren. Since long before this RFA was proposed, Drs. Klabjan and Starren have been working to align the two training programs more closely and to increase the opportunities for Big Data training in the biomedical domain. This proposal represents the culmination of those extensive discussions. It brings together biomedical Big Data domain expertise and methodological expertise in computation, informatics, statistics, and mathematics from schools and departments across NU. Before long, we believe that nearly every biomedical researcher will need to utilize Big Data tools. BD3 is not targeted at these “tool users.” Rather, the goal of BD3 is to train the next generation of biomedical Big Data tool builders.

Coursework

The goal of BD3 is to create a truly multidisciplinary data science training environment. In doing so, BD3 will encompass multiple departments and degree programs, designed for students coming from a wide variety of backgrounds and undergraduate degrees.
Thus, the role of the BD3 curriculum is fourfold: 1) it provides guidance for students who are considering data science, whether they plan on applying to BD3 or not; 2) it provides guidance for the executive committee on evaluating the preparation of applicants; 3) it provides guidance to trainees and mentors on the design of individual training plans, in particular which courses count toward different requirements; and 4) it provides guidance on areas where new courses are needed.

Success in data science requires mastery of three distinct skill sets: 1) an understanding of the target domain, 2) an understanding of the nature and structure of the data within that domain, and 3) mastery of the computational and statistical techniques for manipulating and analyzing the data. This translates into a number of more specific competencies. These are divided into three broad categories: Core Requirements, those courses that every BD3 student is expected to complete; Selectives, competencies that must be addressed but can be fulfilled by any of a selection of courses; and Electives, courses that may strengthen a student's preparation for a particular project but are not tied to a specific competency requirement. Unlike many training programs, many of the selectives for BD3 are in the first year and the requirements are predominantly in the second year. This structure facilitates the integration of students from both HBMI and DA by allowing them to focus during Year 1 on degree requirements that also satisfy BD3 selective requirements, in preparation for the Year 2 requirements.

Core Requirements

MSIA 420 -- Predictive Analytics. This is one of the three core courses that all BD3 students are expected to take. It includes classifiers, non-parametric regression, time series, neural networks, and Bayesian networks.
MSIA 421 -- Data Mining. This is the second core course.
This includes clustering, association rules, factor analysis, scale development, principal component analysis, and dimension reduction.
MSIA 431 -- Analytics for Big Data. This is the third core course. It focuses on Hadoop, MapReduce, and other methods specifically optimized for Big Data.
Responsible Conduct of Research. Biomedical Big Data research often involves sensitive data and real-world clinical systems.

For example, here is a hypothetical course plan for BD3 as well as a comparative list of coursework for trainees from the three target degree programs:

| BD3 Competency | HSIP Informatics Student | DGP Informatics Student | IEMS Analytics Student |
| --- | --- | --- | --- |
| Predictive Analytics | MSIA 420 Predictive Analytics | MSIA 420 Predictive Analytics | MSIA 420 Predictive Analytics |
| Data Mining | MSIA 421 Data Mining | MSIA 421 Data Mining | MSIA 421 Data Mining |
| Big Data Methods | MSIA 431 Analytics for Big Data | MSIA 431 Analytics for Big Data | MSIA 431 Analytics for Big Data |
| Responsible Conduct of Research | Responsible Conduct of Research | Integrity in Biomedical Research | Responsible Conduct of Research |
| Domain Selective | IGP 410 Molecular Biology & Genetics | IGP 410 Molecular Biology & Genetics | IGP 410 Molecular Biology & Genetics |
| Domain Selective | IBIS 407 Genome Scale Science | IBIS 407 Genome Scale Science | IBIS 407 Genome Scale Science |
| Statistics | EPI_BIO 301 Intro. to Biostatistics | EPI_BIO 301 Intro. to Biostatistics | IEMA 401 Intermediate Statistics |
| Programming – Java | MSIA 490-25 Programming Java | MSIA 490-25 Programming Java | MSIA 490-25 Programming Java |
| Programming – Python | MSIA 490-20 Analytics with Python | MSIA 490-20 Analytics with Python | MSIA 490-20 Analytics with Python |
| Ontologies | HSIP 441 HBMI Methods 1 | HSIP 441 HBMI Methods 1 | HSIP 441 HBMI Methods 1 |
| Databases | HSIP 442 HBMI Methods 2 | HSIP 442 HBMI Methods 2 | MSIA 490 Introduction to Databases |
| Text Analytics | [covered in HSIP 442 above] | [covered in HSIP 442 above] | MSIA 490 Text Analytics |
| Other Degree Requirements | HSIP 440 Intro to Medical Informatics; PH 445 Writing for Publication; HSIP 400 HSIP Colloquium; HSIP 401 Intro to Measurement Science | HSIP 440 Intro to Medical Informatics; IGP 405 Cell Biology; IGP 425 Topics in Drug Discovery | IEMS 450-1 Mathematical Programming I; IEMS 460-1 Stochastic Models; IEMS 480-1 Production/Logistics I; IEMS 480-2 Production/Logistics II; IEMS 435-1 Simulation; IEMS 450-1 Mathematical Programming II; IEMS 460-2 Stochastic Models II |

Program Faculty

The vision of BD3 is to combine deep domain knowledge with methodological expertise. This involved recruiting a multidisciplinary group of primary mentors as well as secondary mentors. Rather than include all Big Data researchers at NU as potential mentors, we have focused on those who have expressed a high degree of interest in the program and who have particular expertise that will be important for a wide variety of trainees. From among the many eligible faculty, we have carefully selected mentors whose expertise spans a wide range of methodological skills and a wide range of biomedical Big Data types, selecting only those with both the interest and the time to actively participate in
BD3.

Primary Mentors

| Mentor | Department(s) |
| --- | --- |
| Luis Amaral | Chemical and Biomedical Engineering, and Medicine |
| Sangeeta Bhorade | Medicine-Pulmonary and Critical Care |
| Larry Birnbaum | Electrical Engineering and Computer Science |
| Rosemary Braun | Preventive Medicine |
| David Cella | Medical Social Sciences and HBMI |
| Rex L. Chisholm | Cell and Molecular Biology and Surgery |
| Nosh Contractor | Behavioral Sciences, IEMS, and School of Communication |
| Ramana Davuluri | Preventive Medicine-HBMI |
| Philip Greenland | Preventive Medicine and Medicine-Cardiology |
| M. Geoffrey Hayes | Medicine-Endocrinology, Metabolism and Molecular Medicine, and Anthropology |
| Jane Holl | Pediatrics and Preventive Medicine |
| Hongmei Jiang | Statistics |
| Neil Jordan | Preventive Medicine, Psychiatry and Behavioral Sciences |
| Aggelos K. Katsaggelos | Electrical Engineering and Mechanical Engineering |
| Neil Kelleher | Chemistry, Molecular Biosciences, and Medicine |
| Abel Kho | Medicine and Preventive Medicine-HBMI |
| Diego Klabjan | IEMS, co-director of BD3 |
| Konrad Kording | Physical Medicine and Rehabilitation, and Physiology |
| David Liebovitz | Medicine and Preventive Medicine-HBMI |
| Lei Liu | Preventive Medicine-Biostatistics |
| Sanjay Mehrotra | IEMS and Medical Social Sciences |
| Lee Miller | Physiology |
| Eric Perreault | Biomedical Engineering and Physical Medicine and Rehabilitation |
| Mark Seagraves | Neurobiology |
| Justin Starren | Preventive Medicine-HBMI, co-director of BD3 |
| D. James Surmeier | Physiology |
| Kristin Swanson | Neurological Surgery |

Secondary Mentors

| Mentor | Department(s) |
| --- | --- |
| Kristi Holmes | Preventive Medicine-HBMI |
| Lifang Hou | Preventive Medicine |
| Siddhartha Jonnalagadda | Preventive Medicine-HBMI |
| Elizabeth McNally | Genomic Medicine |
| Richard Neapolitan | Preventive Medicine-HBMI |
| Nicholas Soulakis | Preventive Medicine-HBMI |