Predictive Analytics in Health Care (MSA 8750-E)
Syllabus

Instructor: Dr. Abhay Nath Mishra
Email: amishra@gsu.edu (the best and quickest way to reach me)
Office: Room No. 809, Robinson College of Business
Address: Institute of Health Administration, 35 Broad Street, Suite 805, Atlanta GA 30303
Phone: (404) 413-7638
Office Hrs: By appointment

Class Schedule: TBD (day of the week; time; classroom location)
Prerequisites: Three core courses in analytics: MSA 8000, MSA 8050 and MSA 8200

Course Description

The health care industry is one of the largest producers of raw data in the United States. Advances in information gathering methods, increasing standardization and the widespread use of information technologies among health care providers, payers and consumers are further increasing the size and variety of the datasets collected. Current regulations and business requirements will continue to push health care organizations to collect and analyze even more data. With large amounts of raw data already available and more being created at different levels (e.g., patient, facility, hospital, health system, physician, disease condition, etc.) and in different varieties (structured and unstructured), health care organizations need tools that allow them to sift effectively through these enormous datasets and extract actionable information and knowledge to make smart business decisions. In this context, it is essential to understand how to model complex business problems faced by the health care industry using advanced analytical methods, and how to solve these problems using available data. Furthermore, it is vitally important to use advanced analytical techniques to make reasoned predictions about future events and to take preemptive actions. Predictive modeling is the process of developing models to better predict future outcomes for an event of interest by exploring its relationships with explanatory variables from historical data.
A large number of methods with roots in statistics, information retrieval and econometrics have been developed to extract knowledge from large data sets. These methods can be applied successfully in diverse areas, such as health care market basket analysis, churn analysis for hospitals and insurance companies, health insurance fraud detection, readmission assessment, personalization of treatment regimens, patient risk management and performance-based payment analysis. The course introduces the techniques of predictive modeling and analytics in a data-rich health care business environment. It covers the process of formulating business objectives and of selecting, preparing and partitioning data in order to design, build, evaluate and implement predictive models for a variety of health care applications. Predictive modeling tools such as classification and decision trees, neural networks, regression, association analysis, cluster analysis, etc. will be discussed in detail and applied to practical health care problems. The focus of this course will be on the rigor of the analytical techniques as well as their implementation. Students will be expected to have a background in statistical and quantitative approaches. We will also spend time on the interpretation of results, but the focus will clearly be on the application of tools and techniques. In other words, the course will focus more on the creation and less on the consumption of analytics.
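As a rough illustration of the partition, build and evaluate steps described above, here is a minimal Python sketch. It rests on several assumptions: Python is not the course software, the data are synthetic (randomly generated, not drawn from HCUP or AHA), a one-variable linear regression stands in for the richer models covered in the course, and the function names partition, fit_simple_regression and rmse are hypothetical.

```python
import math
import random


def partition(rows, train_fraction=0.6, seed=42):
    """Shuffle the rows and split them into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_simple_regression(rows):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(rows, intercept, slope):
    """Root mean squared prediction error on the given rows."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in rows]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic records (assumption, for illustration only):
# (patient age in years, length of stay in days).
rng = random.Random(0)
data = []
for _ in range(200):
    age = rng.uniform(20, 90)
    stay = 1.0 + 0.05 * age + rng.gauss(0, 0.5)
    data.append((age, stay))

# Partition, build the model on the training set, evaluate on held-out data.
train, valid = partition(data)
intercept, slope = fit_simple_regression(train)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("validation RMSE:", round(rmse(valid, intercept, slope), 3))
```

Evaluating on the validation partition rather than on the training rows gives a more honest estimate of how the model would perform on new patients.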
Course Objectives:

Use retrieval and manipulation tools for health information extraction and reporting
Describe different methods of predictive analytics and their application in health care
Rigorously apply analytics techniques to health care problems
Be proficient with analytics software for health care data preparation and analysis
Apply predictive analytics tools to make better operational, financial and clinical decisions
Create analytics for consumption by top-level managers

Course Materials:

Required texts:
Data Mining for Business Intelligence, 2nd Edition, by Galit Shmueli, Nitin R. Patel, and Peter C. Bruce (Wiley: 2010). (A new copy comes with a valid license to XLMiner.) ISBN-10: 0470526823; ISBN-13: 978-0470526828
Data-Driven Healthcare: How Analytics and BI are Transforming the Industry, 1st Edition, by Laura B. Madsen (Wiley and SAS Business Series: 2014). ISBN-10: 1118772210; ISBN-13: 978-1118772218

The texts will be supplemented with additional readings, which will be posted on Desire2Learn. Students are also encouraged to follow interesting developments in health analytics and report them in class. The instructor will provide the HCUP and AHA datasets on which the course will be based.

Course Conduct and Policies:

This course uses a blend of lecture/discussion and assignments. It is absolutely essential for you to come to class fully prepared. All assignments are due at the beginning of class. We will not take attendance, but we expect you to be present in every class. Absenteeism or lack of preparation will adversely affect your grade. If you must miss a class, let us know as soon as possible. As a courtesy to us and your fellow classmates, please don’t browse the web, check your email, Facebook or Twitter, or use your cell phone in any non-academic manner while you are in class. Finally, follow the policies related to academic integrity (more on that later!).
Grading:

Your final grade in this class will depend on three components. The distribution is as follows:

Group Project             30 % (group effort)
Group Assignments (5)     20 % (group effort)
Exams (2)                 50 % (individual effort)

Group Assignments

Students will be expected to complete 5 homework assignments over the course of the semester. These assignments are designed to reinforce your understanding of the topics covered. Assignments must be turned in at or before the beginning of the class period on the due date. No late work is accepted. The instructor will create groups of 2-3 students. Each group is expected to complete assignments independently. Assignments should be submitted via email to Professor Mishra at amishra@gsu.edu. Additionally, you are required to bring a printed copy of the assignment to class.

Group Project

The purpose of the group project is to encourage students to apply (and expand on) their learning in the class in an area that is of special interest to them. I strongly suggest you begin working on this project from the first week and delegate tasks among your group members efficiently. Students will leverage the two datasets, HCUP and AHA, and provide data-driven insights based on the tools we cover in class. The analyses and insights should build on those we cover during regular class sessions. It is important that students be deeply immersed in the two datasets mentioned above, and potentially other datasets, and learn how to combine different datasets and analyze them using analytics tools. Submit your paper via email to Professor Mishra at amishra@gsu.edu. Additionally, you are required to bring a printed copy of the paper to class.

Exam and Exam Policies:

There will be two exams. The exams will cover the materials discussed in the course.

1. You are not allowed to discuss the exams with anyone.
   You (and your collaborator if he/she is a fellow student) will receive a score of 0 if any infraction is noticed and established. In addition, other actions may also be taken.
2. An exam missed due to an excused absence must be made up within one week of returning to class for full credit; otherwise, no credit will be given. An exam missed due to an unexcused absence may not be made up. Documentation proving the excused absence may be required at the time the exam is made up.
3. The first exam will be held in class. The second exam will be held during exam week. Submit your exam electronically to Professor Mishra at amishra@gsu.edu. Exams will be based on datasets that students will have worked on before.

Academic Dishonesty Policy

Cheating on an examination or assignment, or assisting another student in cheating, is not permissible. You may not discuss exam questions or case write-up issues with anyone. If you have any questions, please see the instructor. Further, students are expected to abide by the Georgia State University code. On each exam or assignment you will be asked to write out and sign the following pledge: "I pledge on my honor that I have not given or received any unauthorized assistance on this exam/assignment."

Special Needs

If you have a disability and/or special needs, you should bring this to my attention as soon as possible, but no later than the second week of class.

Course Feedback

Your constructive assessment of this course plays an indispensable role in shaping education at Georgia State. Upon completing the course, please take the time to fill out the online course evaluation.

Final Grades

Georgia State University has implemented a plus and minus grading method. The suggested cutoffs for various grades are:

96-100 = A+; 93-95 = A; 90-92 = A-
87-89 = B+; 83-86 = B; 80-82 = B-
77-79 = C+; 73-76 = C; 70-72 = C-
60-69 = D; <60 = F

The instructor reserves the right to modify these cut-offs.
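To make the grade arithmetic concrete, here is a minimal Python sketch that combines the three component weights from the Grading section (group project 30 %, group assignments 20 %, exams 50 %) with the suggested cutoffs. It is illustrative only and makes several assumptions: each component is scored on a 0-100 scale, each cutoff is treated as a lower bound with no rounding, and the names WEIGHTS, CUTOFFS, weighted_score and letter_grade, as well as the example scores, are hypothetical. The instructor may modify the actual cutoffs.

```python
# Component weights in percent, as listed under Grading.
WEIGHTS = {"project": 30, "assignments": 20, "exams": 50}

# Suggested cutoffs, treated here as lower bounds (assumption).
CUTOFFS = [
    (96, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def weighted_score(scores):
    """Combine component scores (each 0-100) into one score out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a score out of 100 to a letter grade using the suggested cutoffs."""
    for lower_bound, letter in CUTOFFS:
        if score >= lower_bound:
            return letter
    return "F"


# Hypothetical component scores for one student.
example = {"project": 90, "assignments": 85, "exams": 88}
total = weighted_score(example)  # (30*90 + 20*85 + 50*88) / 100 = 88.0
print(total, letter_grade(total))
```

Because the exams carry half of the weight, a one-point change in the exam average moves the total by half a point, compared with 0.3 points for the project and 0.2 points for the assignments.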
Tentative Class Schedule (adjustments may be necessary)

Date: Class 1 through Class 14
Readings refer to the required texts: SPB = Shmueli, Patel, and Bruce; M = Madsen.

Topics and readings, in schedule order, beginning with Class 1:

Topic: Introduction to Predictive Modeling in Health Care, Software Setup
Reading: Competing on Analytics; Chapters 1, 2 from SPB and Chapter 1 from M

Topic: Health Care Data Extraction and Manipulation, SQL, Relational data model and RDBMS
Reading: Online reading material + notes

Topic: Data Extraction and Manipulation of Unstructured Health Care Data
Reading: Online reading material + notes

Topic: OLAP and Multidimensional Analysis; Data Quality; Summarization and Data cubes
Reading: An Introduction to OLAP Multidimensional Terminology and Technology; Chapters 3 and 4 from M

Topic: Data Exploration, Visualization and Dimension Reduction
Reading: Chapters 3, 4 from SPB; Chapter 7 from M

Topic: Association and Health Market-basket Analysis
Reading: Chapter 13 from SPB

Topic: Cluster Analysis
Reading: Chapter 14 from SPB

Topic: Classification and Predictive Modeling
Reading: Chapters 5, 9 from SPB

Topic: Predictive Modeling using Regression
Reading: Chapters 6 and 10 from SPB

Topic: Predictive Modeling using Naïve Bayes and Regression Trees
Reading: Chapters 8 and 9 from SPB

Topic: Time Series and Smoothing
Reading: Chapters 15 and 17 from SPB

Topic: Predictive Modeling of Disease Trends using Neural Networks
Reading: Chapter 11 from SPB

Topic: Text Mining in Health Care
Reading: Online reading material + notes

Assignment Due, in schedule order:

Assignment 1
Assignment 2
Assignment 3
Assignment 4
Assignment 5
Project Presentation
Project report due (3 days after the last class day)
Take-home Final Exam (during the final exam time)

All the best!!