Analytics in

Technology, Data, Analytics
New possibilities in our lives -- The important role of tomorrow’s mathematics professionals
Lilian Wu,
Worldwide University Programs Executive, IBM
PSM workshop -- October 14, 2011
© 2011 IBM Corporation
Everything is becoming
Analytics & Optimization
We now have the ability
to measure, sense and
see the exact condition
of practically everything.
People, systems and
objects can communicate
and interact with each
other in entirely new ways.
We can respond to changes
quickly and accurately,
and get better results
by predicting and optimizing
for future events.
Massive amounts of data being captured on natural and man-made
engineered structures, processes and systems
Volume of Digital Data
Every day, 15 petabytes of new information are
being generated. This is 8x more than the
information in all U.S. libraries.
By 2010, the codified information base of the
world is expected to double every 11 hours.
Importance of
Decision Making
70% of executives believe that
poor decision making has had a
degrading impact on their
companies’ performance
Only 9% of CFOs believe they
excel at interpreting data for
senior management
Analytics, modeling, and visualization of these data can help to run our
systems more effectively
Types of Analytics
Degree of Complexity
Stochastic Optimization: How can we achieve the best outcome, including the effects of variability?
How can we achieve the best outcome?
Predictive modeling: What will happen next if ?
What could happen … ?
What if these trends continue?
What actions are needed?
Query/drill down: What exactly is the problem?
Ad hoc reporting: How many, how often, where?
Standard Reporting: What happened?
Based on: Competing on Analytics, Davenport and Harris, 2007
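The difference between optimizing for the best outcome and optimizing "including the effects of variability" can be illustrated with a small stocking problem. This is only a sketch: the costs, prices, demand scenarios and function names below are all invented for illustration and do not come from the talk. The deterministic choice plans for average demand only, while the stochastic choice maximizes expected profit over the demand scenarios.

```python
# Hypothetical stocking problem; every number here is an illustrative assumption.
UNIT_COST = 8.0     # cost paid for each unit stocked
UNIT_PRICE = 10.0   # revenue for each unit actually sold

# Demand scenarios as (demand, probability); probabilities sum to 1.
SCENARIOS = [(50, 0.25), (100, 0.50), (150, 0.25)]


def profit(q, demand):
    """Profit when q units are stocked and `demand` units are requested."""
    return UNIT_PRICE * min(q, demand) - UNIT_COST * q


def expected_profit(q):
    """Probability-weighted profit of stocking q units across all scenarios."""
    return sum(p * profit(q, d) for d, p in SCENARIOS)


def deterministic_choice():
    """Best stock level if demand were always equal to its mean."""
    mean_demand = sum(p * d for d, p in SCENARIOS)
    return max(range(0, 201), key=lambda q: profit(q, mean_demand))


def stochastic_choice():
    """Best stock level when the variability of demand is taken into account."""
    return max(range(0, 201), key=expected_profit)


if __name__ == "__main__":
    for name, q in [("deterministic", deterministic_choice()),
                    ("stochastic", stochastic_choice())]:
        print(name, "stock level:", q, "expected profit:", expected_profit(q))
```

With these made-up numbers the plan built on average demand stocks more than the plan that accounts for variability, and earns a lower expected profit once the demand scenarios are considered.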
Analytics Skills Areas that IBM and Clients Need
Understanding the types of analytics
Database design
Data collection & mining (finding, cleansing, normalizing)
Database systems (design, implementation, on-line analytics of data)
Rules-based data integration and reduction
Stream computing and computing for multiple, parallel processing
Statistical analysis
Predictive analytics (modeling, simulation, forecasting)
Prescriptive analytics (optimization)
Descriptive analytics (score cards, dashboards, alerts)
Analytics in -- marketing, text, web, risk, transportation, energy, etc.
Risks, privacy, security, legal Implications
Project management
Inference and decision making
Applying analytics to real world problems
Global University Programs
Vassar Brothers Medical Center
Technology changes how systems and processes work – need mathematical
models to better understand the changes and their consequences
– 365 bed Regional Hospital in Poughkeepsie, NY
– Four Centers of Clinical Excellence
The Heart Institute
Women’s & Children’s Services
The Dyson Center for Cancer Care
Center for Advanced Surgery
– Nurses
• 700
– Physicians
• 520 privileges
– Campus
• 515,000 Sq. Ft.
Multiple Structures
Ranges - 10 to 100 years
– Freestanding Ambulatory Center
• 130,000 Sq. Ft
• 15 miles south
Hospitals are Complex Systems
• Hospitals
Family members
“Modern medicine is one of those incredible works
of reason: an elaborate system of specialized
knowledge, technical procedures, and rules of
Paul Starr, author of The Social Transformation of
American Medicine
OR techniques -- model and analyze new processes
RFID tags to track IV pumps
Improve Asset Utilization
IV pumps – nurses typically spent over an hour each day looking for equipment –
resulted in pumps being hoarded
• Not being properly cleaned
• Not certified to be pumping the correct amounts
Changed to tagging each pump with an RFID to track location.
Twice daily equipment census and pick up unused pumps collected from central
Reduce over-buying, lost assets
Workflow Optimization
Match equipment/people to need
• Results
Reduce time staff spends looking for missing devices
Pumps cleaned and certified to be pumping the correct amounts
Planned purchase of $0.5M of new pumps – not necessary
OR -- city planning and management using geographical data
DC Water & Sewer Authority -- Automated scheduling in a user selected zone of the city
User selected
region for
• Number of Crews = 2
• Shifts: 1 day shift per crew
• Objective: Assign as many work orders (WOs) as possible to each crew
while maximizing the sum of the WOs' priorities, subject to the constraints
of shift duration, lunch break & travel time.
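The scheduling objective above can be sketched as a tiny assignment problem. This is a minimal illustration, not the DC Water system: the shift length, lunch break, flat travel time per work order, and the work-order data are all assumptions, and the search is plain exhaustive enumeration rather than whatever solver the real project used. The sketch ranks schedules by total priority first and by number of work orders second, which is one possible reading of the stated objective.

```python
from itertools import product

# All values below are illustrative assumptions, not figures from the talk.
SHIFT_MIN = 480     # assumed 8-hour day shift, in minutes
LUNCH_MIN = 30      # assumed lunch break
TRAVEL_MIN = 20     # assumed flat travel time per work order
NUM_CREWS = 2       # from the slide: two crews, one day shift each

# Hypothetical work orders: (id, priority, on-site duration in minutes).
WORK_ORDERS = [
    ("WO-1", 5, 200),
    ("WO-2", 4, 180),
    ("WO-3", 3, 150),
    ("WO-4", 3, 120),
    ("WO-5", 2, 90),
    ("WO-6", 1, 60),
]


def crew_minutes(orders):
    """Working time a crew needs: on-site time plus travel for each order."""
    return sum(duration + TRAVEL_MIN for _, _, duration in orders)


def best_schedule(work_orders):
    """Try every assignment and keep the feasible one with the best score."""
    budget = SHIFT_MIN - LUNCH_MIN
    best_key, best_plan = (-1, -1), None
    # choice[i] is the crew index for order i; NUM_CREWS means "unassigned".
    for choice in product(range(NUM_CREWS + 1), repeat=len(work_orders)):
        crews = [[] for _ in range(NUM_CREWS)]
        for order, crew_index in zip(work_orders, choice):
            if crew_index < NUM_CREWS:
                crews[crew_index].append(order)
        if any(crew_minutes(crew) > budget for crew in crews):
            continue  # violates the shift-duration constraint
        assigned = [order for crew in crews for order in crew]
        key = (sum(priority for _, priority, _ in assigned), len(assigned))
        if key > best_key:
            best_key, best_plan = key, crews
    return best_key, best_plan


if __name__ == "__main__":
    (total_priority, count), plan = best_schedule(WORK_ORDERS)
    print("total priority:", total_priority, "orders assigned:", count)
    for i, crew in enumerate(plan):
        print("crew", i, [wo_id for wo_id, _, _ in crew], crew_minutes(crew), "min")
```

Exhaustive search is only workable for a handful of work orders; a real zone with realistic travel times between sites would call for an optimization solver.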
Statistical Models -- city operations using historical data
Buildings: Reduce energy use and reduce greenhouse gas emissions
1,400 K-12 Public School Buildings in New York City
150 million sq ft – Joint project w. CUNY
– Static Data (5 years energy consumption, building characteristics, weather)
– Statistical analysis, monitoring, simulation, optimization of energy use, GHG
emissions and retrofit planning with budget constraints
• Technical Challenges
– Processing large volumes of historical data from various sources
– Developing physics-based models and statistical models for energy
– Simulation of energy demand, energy supply, and building operations to
reduce energy consumption, cost and GHG emissions
• Results for March 2011
– Martin Luther King Junior High School reduced its electricity consumption by
35.1% and its CO2 emissions by 216,061 pounds
– The top 10 winning schools collectively saved 327,003 pounds of CO2 and
averaged a 16% reduction in electricity consumption
Water -- In the last 100 years global water usage has increased at twice the rate of
population growth
Produces table grape, pepper, stone fruit and
citrus varieties on 12,000 acres in California
• Analyzed the impact of different irrigation systems (incl. newer drip
systems) on crop yields -- decreased water
usage by 8.5% since 2006
• Better matching of farming equipment to specific
harvesting tasks -- decreased fuel consumption by
20% since 2006
Monitoring and data collection
Natural Water System Management for Galway Bay (Ireland)
• Marine research infrastructure of sensors and
computational technology interconnected across
Galway Bay collecting and distributing information
– coastal conditions
– pollution levels
– marine life
• Streaming real-time information to allow better
decision-support related to:
– Weather threats
– Pollution alerts
– Algal bloom prediction
– Rogue waves, etc
• The monitoring services, delivered via the web and
other devices – used for tourism, fishing,
aquaculture and environment
Adapted from Smart Bay reference documentation
See video at
Dynamic Real-time Model for Galway Bay
Nat’l U of Ireland Galway and IBM collaboration
Develop a model of the water quality of a bay based on the hydrodynamics of chemical
diffusion. Sensors measuring the speed at the water surface will gather data, and
special streaming software will continuously collect and add new data to
recalibrate the model and its predictions.
Two goals: (1) a proof of concept for building a real-time continuous assimilation system
(computer system + software) for situations where combining real-time data with a model
is important (e.g., traffic, smoke, fire, ...); and (2) science that will inform analysis of
the ecological impact of the release of waste water from the County of Donegal waste
water treatment plant into its estuary.
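The idea of continuously adding new sensor data to recalibrate a model can be sketched with a scalar Kalman-style update. This is an assumption chosen for illustration: the talk does not say which assimilation method the Galway Bay project uses, and the function names, variances and readings below are hypothetical. Each reading pulls the model's estimate toward the observation, weighted by how uncertain the model is relative to the sensor.

```python
def assimilate(estimate, variance, observation, obs_variance):
    """Blend one sensor reading into the current model estimate.

    The gain is near 1 when the model is much less certain than the sensor
    and near 0 when the model is much more certain than the sensor.
    """
    gain = variance / (variance + obs_variance)
    new_estimate = estimate + gain * (observation - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def run_stream(prior, prior_var, readings, obs_var, process_var=0.0):
    """Assimilate a stream of readings one at a time.

    process_var (an assumed parameter) lets model uncertainty grow between
    readings, standing in for the forecast step of a real dynamic model.
    """
    estimate, variance = prior, prior_var
    history = []
    for reading in readings:
        variance += process_var
        estimate, variance = assimilate(estimate, variance, reading, obs_var)
        history.append(estimate)
    return history, variance


if __name__ == "__main__":
    # Hypothetical surface-speed readings; the model starts out believing 0.0.
    history, final_var = run_stream(0.0, 1.0, [2.0, 2.0, 2.0], obs_var=1.0)
    print(history, final_var)
```

A full system would replace the single number with the state of a hydrodynamic model over the whole bay, but the update loop has the same shape: forecast, compare with the incoming data, correct.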
Unstructured data / Natural language
Much of our smart world is built using highly structured data
But a large portion of information is unstructured
Much is based on natural language -- highly contextual and full of
The sheer mass of these unstructured data
Difficult for unassisted humans to assimilate
Beginning to explore what computers can do to assist
Human Language
• Ambiguous, contextual, imprecise, and implicit
• Contains slang, riddles, idioms, abbreviations,
acronyms, …
• Seemingly infinite number of ways to express the
same meaning