Technology, Data, Analytics
New possibilities in our lives -- The important role of tomorrow’s mathematics professionals
Lilian Wu, Worldwide University Programs Executive, IBM
PSM workshop -- October 14, 2011
© 2011 IBM Corporation

Everything is becoming
– INSTRUMENTED: We now have the ability to measure, sense and see the exact condition of practically everything.
– INTERCONNECTED: People, systems and objects can communicate and interact with each other in entirely new ways.
– INTELLIGENT: We can respond to changes quickly and accurately, and get better results by predicting and optimizing for future events.
[Figure: Analytics & Optimization, with labels IT, Manufacturing, Workforce, Supply Chain, Customers, Transportation, Facilities]

Massive amounts of data being captured on natural and man-made engineered structures, processes and systems
Volume of Digital Data
– Every day, 15 petabytes of new information are being generated. This is 8x more than the information in all U.S. libraries.
– By 2010, the codified information base of the world is expected to double every 11 hours.
Importance of Decision Making
– 70% of executives believe that poor decision making has had a degrading impact on their companies’ performance
– Only 9% of CFOs believe they excel at interpreting data for senior management
Analytics, modeling, and visualization of these data can help to run our systems more effectively

Types of Analytics
(listed from highest to lowest degree of complexity)
Prescriptive
– Stochastic Optimization: How can we achieve the best outcome including the effects of variability?
– Optimization: How can we achieve the best outcome?
Predictive
– Predictive modeling: What will happen next if ... ?
– Simulation: What could happen … ?
– Forecasting: What if these trends continue?
Descriptive
– Alerts: What actions are needed?
– Query/drill down: What exactly is the problem?
– Ad hoc reporting: How many, how often, where?
– Standard Reporting: What happened?
Based on: Competing on Analytics, Davenport and Harris, 2007

Analytics Skills Areas that IBM and Clients Need
– Understanding the types of analytics
– Database design
– Data collection & mining (finding, cleansing, normalizing)
– Database systems (design, implementation, on-line analytics of data)
– Rules-based data integration and reduction
– Stream computing and computing for multiple, parallel processing
– Statistical analysis
– Predictive analytics (modeling, simulation, forecasting)
– Prescriptive analytics (optimization)
– Descriptive analytics (score cards, dashboards, alerts)
– Analytics in marketing, text, web, risk, transportation, energy, etc.
– Risks, privacy, security, legal implications
– Project management
– Inference and decision making
– Applying analytics to real world problems
Global University Programs

Vassar Brothers Medical Center
Technology changes how systems and processes work – need mathematical models to better understand the changes and their consequences
– 365-bed regional hospital in Poughkeepsie, NY
– Four Centers of Clinical Excellence
  • The Heart Institute
  • Women’s & Children’s Services
  • The Dyson Center for Cancer Care
  • Center for Advanced Surgery
– Nurses
  • 700
– Physicians
  • 520 privileges
– Campus
  • 515,000 Sq. Ft.
  • Multiple Structures
  • Ranges: 10 to 100 years
– Freestanding Ambulatory Center
  • 130,000 Sq. Ft.
  • 15 miles south

Hospitals are Complex Systems
[Figure labels: Patients, Hospitals, Doctors, Nurses, Staff, Administrators, Family members, Employers, Insurers, Governments]
“Modern medicine is one of those incredible works of reason: an elaborate system of specialized knowledge, technical procedures, and rules of behavior.”
– Paul Starr, author of The Social Transformation of American Medicine

OR techniques -- model and analyze new processes
RFID tags to track IV pumps
– Improve Asset Utilization – IV pumps
  • Nurses typically spent over an hour each day looking for equipment
  • This resulted in pumps being hoarded, so they were not being properly cleaned and not certified to be pumping the correct amounts
– Changed to tagging each pump with an RFID tag to track location
  • Twice-daily equipment census, with unused pumps picked up and collected from central locations
  • Reduce over-buying, lost assets
– Workflow Optimization – Match equipment/people to need
Results
– Reduced time staff spends looking for missing devices
– Pumps cleaned and certified to be pumping the correct amounts
– Planned purchase of $0.5M of new pumps – not necessary

OR -- city planning and management using geographical data
DC Water & Sewer Authority -- Automated scheduling in a user-selected zone of the city
[Figure: user-selected region for scheduling]
Goal:
• Number of Crews = 2
• Shifts: 1 day shift per crew
• Objective: Assign as many work orders (WOs) as possible to each crew, maximizing the sum of the priorities of the assigned WOs while meeting the constraints of shift duration, lunch break & travel time.

Statistical Models -- city operations using historical data
Buildings: Reduce energy use and reduce greenhouse gas emissions
1,400 K-12 Public School Buildings in New York City, 150 million sq ft
– Joint project with CUNY
– Static Data (5 years energy consumption, building characteristics, weather)
– Statistical analysis, monitoring, simulation, optimization of energy use, GHG emissions and retrofit planning with budget constraints
Technical Challenges
– Processing large volumes of historical data from various sources
– Developing physics-based models and statistical models for energy consumption
– Simulation of energy demand, energy supply, and building operations to reduce energy consumption, cost and GHG emissions
Results for March 2011
– Martin Luther King Junior High School reduced its electricity consumption by 35.1% and its CO2 emissions by 216,061 pounds
– The top 10 winning schools collectively saved 327,003 pounds of CO2 and averaged a 16% reduction in electricity consumption

Water -- In the last 100 years global water usage has increased at twice the rate of population growth
Produces table grape, pepper, stone fruit and citrus varieties on 12,000 acres in California
• Analyzed the impact of different irrigation systems (incl. newer drip systems) on crop yields -- decreased water usage by 8.5% since 2006
• Better matching of farming equipment to specific harvesting tasks -- decreased fuel consumption by 20% since 2006

Monitoring and data collection
Natural Water System Management for Galway Bay (Ireland)
Marine research infrastructure of sensors and computational technology interconnected across Galway Bay, collecting and distributing information on:
– coastal conditions
– pollution levels
– marine life
Streaming real-time information to allow better decision support related to:
– Weather threats
– Pollution alerts
– Algal bloom prediction
– Rogue waves, etc.
The monitoring services are delivered via the web and other devices and are used for tourism, fishing, aquaculture and the environment.
Adapted from Smart Bay reference documentation
See video at http://www.youtube.com/watch?v=n2XakurQCgU

Dynamic Real-time Model for Galway Bay
National University of Ireland Galway and IBM collaboration
Develop a model of the water quality of a bay based on the hydrodynamics of chemical diffusion. Sensors measuring the speed at the water surface will gather data, and special streaming software will be used to continuously collect and add new data to recalibrate the model and its predictions.
Two goals:
1. a proof of concept for building a real-time continuous assimilation system (computer system + software) to model situations where real-time data + a model (e.g., traffic, smoke, fire, ...) are important; and
2. the science will inform analysis of the ecological impact of the release of waste water from the County of Donegal waste water treatment plant into its estuary.

Unstructured data / Natural language
– Much of our smart world is built using highly structured data
– But a large portion of information is unstructured
– Much is based on natural language -- highly contextual and full of ambiguity
– The sheer mass of these unstructured data is difficult for unassisted humans to assimilate
– We are beginning to explore what computers can do to assist

Watson
Human Language
• Ambiguous, contextual, imprecise, and implicit
• Contains slang, riddles, idioms, abbreviations, acronyms, …
• Seemingly infinite number of ways to express the same meaning
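
Worked sketch: the crew-scheduling objective from the DC Water slide
The DC Water slide states an optimization goal (2 crews, 1 day shift each, maximize the total priority of assigned work orders subject to shift duration, lunch break and travel time). The Python sketch below is one way to make that statement concrete for a toy instance. It is not the method used in the project: the 8-hour shift, 30-minute lunch, flat 15-minute travel time per job, the work-order data and the function name `schedule` are all assumptions made for illustration, and the brute-force search is only practical for a handful of work orders.

```python
from itertools import product

N_CREWS = 2          # from the slide
SHIFT_MIN = 8 * 60   # assumption: an 8-hour day shift
LUNCH_MIN = 30       # assumption: a 30-minute lunch break


def schedule(work_orders, travel_min=15):
    """Exhaustively search assignments of work orders to crews.

    work_orders: list of (name, priority, duration_min) tuples (hypothetical data).
    travel_min:  assumed flat travel time added to every assigned job.
    Returns (total_priority, {crew_index: [work order names]}).
    """
    capacity = SHIFT_MIN - LUNCH_MIN  # working minutes available per crew
    best_score, best_choice = (-1, -1), None
    # Each work order goes to crew 0, crew 1, or stays unassigned (index N_CREWS).
    for choice in product(range(N_CREWS + 1), repeat=len(work_orders)):
        load = [0] * N_CREWS
        priority = count = 0
        for (_, prio, duration), crew in zip(work_orders, choice):
            if crew < N_CREWS:
                load[crew] += duration + travel_min
                priority += prio
                count += 1
        # Feasible if no crew exceeds its shift; prefer higher total priority,
        # then more assigned work orders as a tie-breaker.
        if max(load) <= capacity and (priority, count) > best_score:
            best_score, best_choice = (priority, count), choice
    plan = {
        crew: [wo[0] for wo, c in zip(work_orders, best_choice) if c == crew]
        for crew in range(N_CREWS)
    }
    return best_score[0], plan


if __name__ == "__main__":
    # Hypothetical work orders: (name, priority, duration in minutes)
    demo = [("A", 5, 300), ("B", 4, 250), ("C", 3, 200), ("D", 1, 150)]
    total, plan = schedule(demo)
    print(total, plan)
```

For realistic numbers of work orders, a problem of this shape would more plausibly be handed to an integer-programming or constraint solver, with travel time depending on the route rather than a flat allowance.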