Technology, Data, Analytics
New possibilities in our lives - The important role of tomorrow’s mathematics professionals
Lilian Wu, Worldwide University Programs Executive, IBM
PSM workshop -- October 14, 2011
© 2011 IBM Corporation
Everything is becoming INSTRUMENTED, INTERCONNECTED and INTELLIGENT
INSTRUMENTED: We now have the ability to measure, sense and see the exact condition of practically everything.
INTERCONNECTED: People, systems and objects can communicate and interact with each other in entirely new ways.
INTELLIGENT: We can respond to changes quickly and accurately, and get better results by predicting and optimizing for future events.
Analytics & Optimization
IT, MANUFACTURING, WORKFORCE, SUPPLY CHAIN, CUSTOMERS, TRANSPORTATION, FACILITIES
Massive amounts of data being captured on natural and man-made engineered structures, processes and systems
Volume of Digital Data
– Every day, 15 petabytes of new information are being generated. This is 8x more than the information in all U.S. libraries.
– By 2010, the codified information base of the world is expected to double every 11 hours.
Importance of Decision Making
– 70% of executives believe that poor decision making has had a degrading impact on their companies’ performance
– Only 9% of CFOs believe they excel at interpreting data for senior management
Analytics, modeling, and visualization of these data can help to run our systems more effectively
Types of Analytics
Listed in decreasing degree of complexity
Prescriptive
– Stochastic Optimization: How can we achieve the best outcome including the effects of variability?
– Optimization: How can we achieve the best outcome?
Predictive
– Predictive modeling: What will happen next if ?
– Simulation: What could happen … ?
– Forecasting: What if these trends continue?
Descriptive
– Alerts: What actions are needed?
– Query/drill down: What exactly is the problem?
– Ad hoc reporting: How many, how often, where?
– Standard Reporting: What happened?
Based on: Competing on Analytics, Davenport and Harris, 2007
Analytics Skills Areas that IBM and Clients Need
• Understanding the types of analytics
• Database design
• Data collection & mining (finding, cleansing, normalizing)
• Database systems (design, implementation, on-line analytics of data)
• Rules-based data integration and reduction
• Stream computing and computing for multiple, parallel processing
• Statistical analysis
• Predictive analytics (modeling, simulation, forecasting)
• Prescriptive analytics (optimization)
• Descriptive analytics (score cards, dashboards, alerts)
• Analytics in -- marketing, text, web, risk, transportation, energy, etc.
• Risks, privacy, security, legal implications
• Project management
• Inference and decision making
• Applying analytics to real world problems
Global University Programs
Vassar Brothers Medical Center
Technology changes how systems and processes work – need mathematical models to better understand the changes and their consequences
– 365-bed regional hospital in Poughkeepsie, NY
– Four Centers of Clinical Excellence
  • The Heart Institute
  • Women’s & Children’s Services
  • The Dyson Center for Cancer Care
  • Center for Advanced Surgery
– Nurses
  • 700
– Physicians
  • 520 with privileges
– Campus
  • 515,000 sq. ft.
  • Multiple structures, ranging from 10 to 100 years old
– Freestanding Ambulatory Center
  • 130,000 sq. ft.
  • 15 miles south
Hospitals are Complex Systems
Patients
 Hospitals
Doctors
Nurses
Staff
Administrators
Family members
Employers
Insurers
Governments
“Modern medicine is one of those incredible works of reason: an elaborate system of specialized knowledge, technical procedures, and rules of behavior.”
– Paul Starr, author of The Social Transformation of American Medicine
OR techniques -- model and analyze new processes
RFID tags to track IV pumps
– Nurses typically spent over an hour each day looking for IV pumps, which resulted in pumps being hoarded
  • Not being properly cleaned
  • Not certified to be pumping the correct amounts
– Changed to tagging each pump with an RFID tag to track its location
– Twice-daily equipment census, with unused pumps picked up and collected at central locations
Improve Asset Utilization
– Reduce over-buying, lost assets
Workflow Optimization
– Match equipment/people to need
Results
– Reduced time staff spends looking for missing devices
– Pumps cleaned and certified to be pumping the correct amounts
– Planned purchase of $0.5M of new pumps was no longer necessary
OR -- city planning and management using geographical data
DC Water & Sewer Authority -- Automated scheduling in a user-selected zone of the city
[Figure: user-selected region for scheduling]
Goal:
• Number of crews = 2
• Shifts: 1 day shift per crew
• Objective: Assign as many work orders (WOs) as possible to each crew, maximizing the sum of the priorities of the assigned WOs while meeting the constraints of shift duration, lunch break and travel time.
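The scheduling goal above can be illustrated with a minimal sketch. The slides do not describe the solver that was actually used, so the greedy heuristic below is only a stand-in, not the project's method. Every name and number in it is a hypothetical assumption: the `WorkOrder` fields, straight-line travel at a fixed speed, a shared depot that crews start from and return to, and the 480-minute shift with a 30-minute lunch.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class WorkOrder:
    # Hypothetical fields; the slides only mention priority, duration and travel.
    wo_id: str
    priority: int
    duration_min: float  # on-site work time in minutes
    x_km: float
    y_km: float


def travel_min(a, b, speed_kmh=30.0):
    """Straight-line travel time in minutes between two (x, y) points in km."""
    return hypot(a[0] - b[0], a[1] - b[1]) / speed_kmh * 60.0


def greedy_schedule(orders, n_crews=2, shift_min=480.0, lunch_min=30.0,
                    depot=(0.0, 0.0)):
    """Assign work orders to crews, highest priority first (a heuristic)."""
    budget = shift_min - lunch_min  # working minutes left after the lunch break
    crews = [{"pos": depot, "used": 0.0, "orders": []} for _ in range(n_crews)]
    # Consider high-priority orders first; shorter jobs break ties.
    for wo in sorted(orders, key=lambda w: (-w.priority, w.duration_min)):
        site = (wo.x_km, wo.y_km)
        best = None
        for crew in crews:
            go = travel_min(crew["pos"], site)
            back = travel_min(site, depot)  # assumed: crew must return to depot
            fits = crew["used"] + go + wo.duration_min + back <= budget
            # Among crews that can fit the job, prefer the closest one.
            if fits and (best is None or go < best[1]):
                best = (crew, go)
        if best is not None:
            crew, go = best
            crew["used"] += go + wo.duration_min
            crew["pos"] = site
            crew["orders"].append(wo.wo_id)
    return [crew["orders"] for crew in crews]
```

Calling `greedy_schedule` on a list of `WorkOrder` objects returns one list of work-order IDs per crew, in visiting order. A greedy pass like this does not guarantee the maximum total priority; an exact formulation would typically be posed as an integer program or a vehicle-routing problem.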
Statistical Models -- city operations using historical data
Buildings: Reduce energy use and reduce greenhouse gas emissions
1,400 K-12 public school buildings in New York City
150 million sq ft; joint project with CUNY
– Static data (5 years of energy consumption, building characteristics, weather)
– Statistical analysis, monitoring, simulation, and optimization of energy use, GHG emissions and retrofit planning with budget constraints
• Technical Challenges
– Processing large volumes of historical data from various sources
– Developing physics-based models and statistical models for energy consumption
– Simulation of energy demand, energy supply, and building operations to reduce energy consumption, cost and GHG emissions
• Results for March 2011
– Martin Luther King Junior High School reduced its electricity consumption by 35.1% and its CO2 emissions by 216,061 pounds
– The top 10 winning schools collectively saved 327,003 pounds of CO2, with an average reduction of 16% in electricity consumption
Water -- In the last 100 years global water usage has increased at twice the rate of population growth
Produces table grape, pepper, stone fruit and citrus varieties on 12,000 acres in California
• Analyzed the impact of different irrigation systems (including newer drip systems) on crop yields -- decreased water usage by 8.5% since 2006
• Better matching of farming equipment to specific harvesting tasks -- decreased fuel consumption by 20% since 2006
Monitoring and data collection
Natural Water System Management for Galway Bay (Ireland)
• Marine research infrastructure of sensors and computational technology interconnected across Galway Bay, collecting and distributing information on:
– coastal conditions
– pollution levels
– marine life
• Streaming real-time information to allow better decision support related to:
– Weather threats
– Pollution alerts
– Algal bloom prediction
– Rogue waves, etc.
• The monitoring services, delivered via the web and other devices, are used for tourism, fishing, aquaculture and the environment
Adapted from Smart Bay reference documentation
See video at http://www.youtube.com/watch?v=n2XakurQCgU
Dynamic Real-time Model for Galway Bay
National University of Ireland Galway and IBM collaboration
Develop a model of the water quality of a bay based on the hydrodynamics of chemical diffusion. Sensors measuring the speed at the water surface will gather data, and special streaming software will be used to continuously collect and add new data to recalibrate the model and its predictions.
Two goals:
1. A proof of concept for building a real-time continuous assimilation system (computer system + software) to model situations where real-time data plus a model (e.g., traffic, smoke, fire, ...) are important.
2. The science will inform analysis of the ecological impact of the release of waste water from the County Donegal waste water treatment plant into its estuary.
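The idea of continuously adding new sensor data to recalibrate a model can be illustrated with a minimal sketch. The slides do not say which assimilation method the project used; the scalar Kalman filter below is a stand-in chosen for brevity, with a trivial "state stays the same" forecast in place of a hydrodynamic model. The function names, variances and readings are all hypothetical.

```python
def assimilate(estimate, variance, reading, sensor_variance):
    """Blend a model estimate with one sensor reading (scalar Kalman update)."""
    # The gain weights the reading by how uncertain the model is
    # relative to the sensor.
    gain = variance / (variance + sensor_variance)
    new_estimate = estimate + gain * (reading - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


def run_stream(readings, initial_estimate, initial_variance,
               sensor_variance, process_variance):
    """Process readings one at a time, as a streaming system would."""
    estimate, variance = initial_estimate, initial_variance
    history = []
    for reading in readings:
        # Forecast step (assumed persistence model): the estimate is carried
        # forward unchanged and its uncertainty grows between readings.
        variance += process_variance
        # Analysis step: correct the forecast with the new reading.
        estimate, variance = assimilate(estimate, variance,
                                        reading, sensor_variance)
        history.append(estimate)
    return history
```

Each new reading pulls the estimate toward the observed value by an amount that depends on the relative uncertainty of model and sensor. A real system for a bay would replace the persistence forecast with the hydrodynamic model and track a full spatial state rather than a single number.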
Unstructured data / Natural language
Much of our smart world is built using highly structured data
But a large portion of information is unstructured
Much is based on natural language -- highly contextual and full of ambiguity.
The sheer mass of these unstructured data
Difficult for unassisted humans to assimilate
Beginning to explore what computers can do to assist
Watson
Human Language
• Ambiguous, contextual, imprecise, and implicit
• Contains slang, riddles, idioms, abbreviations, acronyms, …
• Seemingly infinite number of ways to express the same meaning