Translational Data Analytics @ Ohio State
Fall Forum
October 8, 2015
2 – 5:30 p.m.
Thompson Library Campus Reading Room

2:05 – 2:20 p.m.
Opening Comments
Philip Payne, PhD, FACMI
The Ohio State University

2:20 – 3 p.m.
Data Science: Machine Learning for the Real World
Tony Jebara, PhD
Columbia University

3:05 – 3:25 p.m.
Advancing Population Health Through TDA@OhioState
Randi Foraker, PhD
The Ohio State University

3:30 – 4:10 p.m.
Hierarchical Materials Informatics
Surya Kalidindi, PhD
Georgia Institute of Technology

4:15 – 4:35 p.m.
Computationally Efficient Genome-Scale Evolutionary Inference: Creating a Sustainable Future for African Smallholder Farmers
Laura Kubatko, PhD
The Ohio State University

4:40 – 5:20 p.m.
From Analysis to Presentation
Robert Kosara, PhD
Tableau Software

5:25 – 5:30 p.m.
Closing Comments
David Manderscheid, PhD
The Ohio State University

5:30 – 7 p.m.
Reception

discovery.osu.edu/TDA
A DISCOVERY THEMES INITIATIVE

Philip Payne, PhD, FACMI
Director, Translational Data Analytics @ Ohio State
Professor and Chair, Department of Biomedical Informatics, College of Medicine
Associate Director for Data Sciences, Center for Clinical and Translational Science
The Ohio State University
go.osu.edu/payne

Opening Comments

Tony Jebara, PhD
Associate Professor, Computer Science
Columbia University
go.osu.edu/jebara

Data Science: Machine Learning for the Real World

Columbia University launched the Data Science Institute (DSI) in early 2012 to develop technologies that unlock the power of global data to help solve some of society’s most challenging problems. Today, it includes well over 100 faculty members. The institute is home to six centers that foster interdisciplinary collaboration and serve as engines of translational research and education in the data sciences. The centers include Health Informatics, Cybersecurity, Financial Analytics, Smart Cities, New Media, and Foundations of Data Science.
The DSI was founded through a grant from the City of New York and is now also supported by federal funding and philanthropic foundations as well as an industrial affiliates program. It offers undergraduate, master’s, and professional degrees in data science and has a thriving entrepreneurship program for seeding startup companies. I will discuss some of the collaborations that have emerged across the university campus thanks to the DSI. In particular, I will focus on a number of vignettes detailing our work in leveraging data science, machine learning, and graphical modeling to analyze financial markets, news streams, neuronal data, power grids, image data, and social media.

Randi Foraker, PhD
Assistant Professor
Department of Epidemiology, College of Public Health
Department of Biomedical Informatics, College of Medicine
The Ohio State University
go.osu.edu/foraker

Advancing Population Health Through TDA@OhioState

Improving population health remains a complex challenge. The Ohio State University is uniquely positioned to find solutions to the environmental, social, biological, and clinical barriers to wellness. Translational Data Analytics @ Ohio State operates in a culture of collaboration, with growing faculty expertise in relevant areas, and with the ability to build and strengthen existing external partnerships. In this talk, I will present SPHERE, an interactive health visualization tool, as an example of innovation that grew out of the data analytics tradition at Ohio State. I will discuss aspects of SPHERE that are complementary to TDA members’ expertise, and suggest future directions and opportunities for new and existing faculty to advance population health.

Surya Kalidindi, PhD
Professor, George W. Woodruff School of Mechanical Engineering
Georgia Institute of Technology
go.osu.edu/kalidindi

Hierarchical Materials Informatics

Hierarchical materials informatics focuses on the development of data science algorithms and computationally efficient protocols capable of mining the essential linkages in large multiscale materials datasets (both experimental and modeling), and building robust knowledge systems that can be readily accessed, searched, and shared by the broader community. Given the nature of the challenges faced in the design and manufacture of new advanced materials, this emerging interdisciplinary field is ideally positioned to produce a major transformation in current practices. The novel data science tools produced by this field promise to significantly accelerate the design and development of new advanced materials through their increased efficacy in gleaning and blending the disparate knowledge and insights hidden in data accumulated from multiple sources (including both experiments and simulations). I will discuss ongoing research about a specific strategy for data science-enabled development of new/improved materials and illustrate key components of the proposed overall framework with examples.

Laura Kubatko, PhD
Professor
Department of Statistics
Department of Evolution, Ecology, and Organismal Biology
College of Arts and Sciences
The Ohio State University
go.osu.edu/kubatko

Computationally Efficient Genome-Scale Evolutionary Inference: Creating a Sustainable Future for African Smallholder Farmers

More than 700 million people globally rely on cassava as a primary food source, including smallholder farmers in several countries in East Africa. However, East African cassava is under severe threat from the whitefly (Bemisia tabaci), a small insect that carries two viruses that infect cassava plants. Together, these viruses can result in nearly 100% crop loss, with an estimated cost of $1.25 billion USD annually.
Crucial questions in the development of tools to combat the whitefly are how many species of whitefly actually exist, and how genetically diverse these species might be. In this talk, I will describe my contributions toward answering these questions as part of the Cassava Whitefly Project funded by the Bill and Melinda Gates Foundation. In particular, I will show how genome-scale data can be efficiently summarized in a manner that leads to accurate inference of evolutionary relationships among species with low computational cost. This methodology will enable an improved understanding of whitefly genomics that can be translated into new approaches to establishing cassava as a sustainable food source in East Africa.

Robert Kosara, PhD
Research Scientist
Tableau Software
go.osu.edu/kosara

From Analysis to Presentation

The academic visualization field focuses on analysis, often overlooking the presentation and communication of data. While this is slowly changing, a lot more needs to be done: we need to understand how people use visualization to get points across, we need to understand the goals and intents, and we need to develop ways of measuring success. Greater understanding will contribute to generating the solutions to real-world problems that Translational Data Analytics @ Ohio State is pursuing, such as promoting healthy communities and feeding the global population. In this talk, I will make the case for more focus on the presentation of data using visualization, and show some work that is pointing in the right direction.

David Manderscheid, PhD
Lead Dean, Translational Data Analytics @ Ohio State
Executive Dean and Vice Provost, College of Arts and Sciences
The Ohio State University
go.osu.edu/deanmanderscheid

Closing Comments

TDA@OhioState: Making data work for you.

Contact

Philip Payne, PhD
Director
614-292-4778
payne.341@osu.edu

David Mongeau, MBA
Program Manager
614-292-1282
mongeau.1@osu.edu

discovery.osu.edu/TDA