The Ten Habits of Those with the Best Data

Thomas C. Redman, Ph.D.
Navesink Consulting Group
at the Toronto DAMA
May 17, 2012
tomredman@dataqualitysolutions.com
Redman-Toronto-ten habits-May2012
© Navesink Consulting Group, 2000-2012
T.C. Redman, Page 1
Introduction and Summary
Those with the best data:
• Adopt a customer-facing definition of quality.
• Aim to prevent errors at the points of data creation, rather than correcting them downstream.
• Follow “ten habits” that align the entire organization.
• Enjoy rich rewards for their troubles!
Agenda






What “the best data” looks like
Thinking about quality
A “non-delegatable” choice
The ten habits
My (evolving) views on organization
Questions anytime
Market Data Vendor
Background: Financial services companies purchase market data
from companies such as Reuters, Bloomberg, etc.

Lack of trust causes them to purchase basic data from multiple
sources.

Bank request: Far better data, so it could reduce its vendor base
and eliminate downstream costs of bad data.
Work conducted:

Clear statement of customer needs.

Measurement against those needs.

Root causes identified and addressed, one at a time.

Statistical control.

In the course of day-in, day-out work.
Market Data Example: Results
[Figure: control chart of first-time, on-time results. Y-axis: fraction of perfect records (0.5 to 1.0); x-axis: months since program start (0 to 20). Plotted series: accuracy rate, average, lower control limit, upper control limit, and target. Annotated phases: 1. Requirements defined; 2. First measurement; 3. Improvements; 4. Control.]
Each error not made saves an average of $500. The savings quickly reach millions!
The day-in, day-out work of data quality management is conducted at the
work group level
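As a rough illustration of how the $500 average saving per avoided error adds up, the sketch below computes how many avoided errors are needed to reach a given savings figure. Only the $500 average comes from the slide; the function names and the example error count are hypothetical.

```python
import math

SAVINGS_PER_ERROR = 500  # dollars; the average quoted on the slide


def savings(errors_avoided: int) -> int:
    """Total dollars saved for a given number of errors not made."""
    return errors_avoided * SAVINGS_PER_ERROR


def errors_needed(target_dollars: int) -> int:
    """Smallest number of avoided errors that reaches the target savings."""
    return math.ceil(target_dollars / SAVINGS_PER_ERROR)


print(errors_needed(1_000_000))  # 2000
print(savings(10_000))           # 5000000 (10,000 is a hypothetical count)
```

At this rate, avoiding just 2,000 errors is worth $1M.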
Access Financial Assurance at AT&T
Background: AT&T expenditures for “access” about $20B/yr.

Access Financial Assurance aims to ensure integrity of access
bills, through parallel “billing.”
Key Idea: Get the bill right the first time.
Work conducted:

Dissatisfied middle manager, seeking a better way.

Top-down deployment.

Staff group defined series of deliverables, then audited (regional)
compliance.

Supplier and process management.

Customer needs, measurement, improvement, control.
Results: Access Financial Assurance





Data accuracy improved 90%.
Billing errors reduced 98%.
Cycle time (bill period closure) reduced 67%.
AT&T costs (of financial assurance) reduced 73%
($100M/year).
LEC costs (of access billing) reduced 20%.
There is much hidden “non-value-added work” built in to
accommodate bad data.
Enterprise Programme at BT*
(British Telecom)

Revenue: $33 billion/yr

Employees: 95,000

Operates in 170 countries

22 million customers (4 million business)
Enterprise Data Quality Improvement Programme (10-year effort)

Recognized the inherent complexity of people, process, and technology issues
(e.g., data quality problems masquerading as “systems issues”).

Explicit linkage of data (quality improvement) to strategic business
objectives (e.g., business transformation).

Over time, magnitude of DQ problem understood and exposed.

Governance structure starting at the very top.

Consolidated expertise in data quality improvement in IT.

Estimated and delivered benefits vetted by Finance.
*This summary largely courtesy of Nigel Turner, who led BT’s programme and is now at Trillium Software. He has vetted this summary.
BT – Results
Enterprise Data Quality Improvement Programme, cont’d

Dual focus on reducing capital expenditure and the rework that results
from searching for “lost network facilities.”

Problem discovery, measurement, audits, new controls (hold the gains)
enhanced by Trillium DQ tool suite.

Focused on “big improvement projects” (delivered 75 over ten years).

Dual focus on data clean-up and process improvement.
Business Benefits:
• > $1B (verified and conservative)
• Also improved customer satisfaction and regulatory compliance, reversed brand damage and revenue leakage, and contributed to business transformation. These benefits were not quantified.
More than you might think, data permeate everything.
Bad data are “silent killers.”
High Quality in Your Mind’s Eye
Redman’s Favorites:
• Apple products
• Italian tile
• C&D Heating and Cooling
• Disneyland
Common Characteristics:
• Relatively few defects.
• They are corrected in a prompt, friendly manner.
• Easy-to-use
• Make it easier to do the things I want to do.
• Sleek design!
• Trust the company!!
Contrast the iPhone with a Data Model
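[Figure: entity-relationship data model. Entities: PARTY, PERSON, ORGANIZATION, PARTY RELATIONSHIP, and PARTY RELATIONSHIP TYPE (e.g., “Corporate structure,” “Employment,” “Membership,” etc.). Relationship labels: “from,” “to,” “on one side of,” “on the other side of,” “an example of,” and “embodied in.”]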
Or a financial statement!
Data Quality
Data are of high quality if they are fit for their intended uses
(by customers) in operations, decision-making, and
planning (after Juran).
Data that’s fit for use:
Free of defects:
- accessible
- accurate
- timely
- complete
- consistent with other sources
- etc.
Possess desired features:
- relevant
- comprehensive
- proper level of detail
- easy-to-read
- easy-to-interpret
- etc.
Customers are the ultimate arbiters of quality!!
Data Quality - aspirational
“Exactly the right data and information in exactly
the right place at the right time and in the right
format to complete an operation, serve a
customer, make a decision, or set and execute
strategy.”*
*Redman, Data Driven: Profiting from Your Most Important
Business Asset, Harvard Business Press, 2008
Data Quality – day-in, day-out
Meeting the most important needs of the most
important customers.
Data Quality: The Non-delegatable Choice
Unmanaged
To Clean Up The Lake, One Must First
Eliminate The Sources Of Pollution
They recognize that, left alone,
accountability shifts downstream!!!
[Cartoon caption: “Here’s how you do number 3, son”: $\cos^2(x) + \sin^2(x) = 1$]
The (nearly-certain) results
Approach                        | Management Focus | Typical Error Rate             | Cost of Poor Data Quality
Find and Fix (First-Gen)        | The Past         | 1-5% (at the field level)      | 20% of revenue
Prevent Future Errors (Sec-Gen) | The Future       | Two orders of magnitude better | Reduced by two-thirds
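To make the comparison concrete, the short sketch below works through the arithmetic implied by the figures on this slide: a 1-5% field-level error rate improved by two orders of magnitude, and a cost of 20% of revenue reduced by two-thirds. The function names are illustrative only, and the results are simple arithmetic, not measured outcomes.

```python
def second_gen_error_rate(first_gen_rate: float) -> float:
    """Two orders of magnitude better means dividing the error rate by 100."""
    return first_gen_rate / 100


def second_gen_cost(first_gen_cost_share: float) -> float:
    """A reduction of two-thirds leaves one-third of the original cost."""
    return first_gen_cost_share / 3


for rate in (0.01, 0.05):
    print(f"error rate {rate:.0%} -> {second_gen_error_rate(rate):.2%}")

print(f"cost of poor data quality: 20% -> {second_gen_cost(0.20):.1%} of revenue")
```

Under these assumptions, a 1-5% error rate falls to 0.01-0.05%, and the cost of poor data quality falls from 20% to roughly 6.7% of revenue.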
Habit 1: Focus on the most important
needs of the most important customers
Those with the best data adopt a customer-facing definition
of quality.
In doing so, they recognize that:
• Not all data are created equal. Similarly, customers, problems, and business opportunities are not created equal.
• Generally, the most important data are those needed to set and execute the company’s most important business strategies.
And they focus as much of their energies as possible on these customers, strategies, and data.
Said differently, their data quality programs are fully aligned
with business strategy.
Data Doc’s Hierarchy of Needs
Many people and organizations exhibit a “hierarchy of needs”:
5. Keep data safe from harm
4. Understand how data fit with other data
3. Understand meaning
2. Trust that data are correct
1. Acquire the data they need
Habit 2. Process, process, process
They recognize that they create data via their cross-functional business processes.
[Figure: a business process spanning functions A, B, C, and D.]
They recognize that most errors occur “in the white space.”
They think “BIG-P.”
They recognize “the next guy” (serving the customer) as a customer.
Data - Defined
A datum consists of three elements: (entity, attribute, value)
- Entity: the thing of interest in the real world
- Attribute: the particular of interest
- Value: the value assigned to the attribute for the entity
Example: (Jane Doe, Service Record Date = July 1, 1996)
Note that, as defined, data are abstract. “Customers” see them as they are
presented in tables, databases, graphs, etc.
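The (entity, attribute, value) definition can be sketched as a small data structure. This is a minimal illustration, not a model from the talk: the `Datum` class and the `present` function are hypothetical names, and only the Jane Doe example comes from the slide.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Datum:
    """One datum as an (entity, attribute, value) triple."""
    entity: str     # the thing of interest in the real world
    attribute: str  # the particular of interest
    value: Any      # the value assigned to the attribute for the entity


def present(datum: Datum) -> str:
    """One possible presentation of the abstract datum, as customers would see it."""
    return f"{datum.entity}: {datum.attribute} = {datum.value}"


service_date = Datum("Jane Doe", "Service Record Date", "July 1, 1996")
print(present(service_date))  # Jane Doe: Service Record Date = July 1, 1996
```

Separating the triple from its presentation mirrors the note above: the datum itself is abstract, and the same datum could be rendered in a table, a graph, or a report.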
Implications…
Thus even the simplest datum arises from three distinct
sources:
• The model: the (entity, attribute) pair is created within a modeling process, usually by IT or purchased from outside.
• The data value is created (at enormous rates) by the business process.
• The presentation may be created by database tools, application programs, PowerPoint presenters, etc., in an application development process.
All three processes must be managed end-to-end for high-quality data to result.
They use the Customer-Supplier Model to
establish requirements and feedback loops
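[Figure: Customer-Supplier Model. Suppliers provide inputs to “Your Process,” which delivers outputs to Customers. Requirements and feedback channels link “Your Process” with both Suppliers and Customers. The full chain is labeled the BIG-P process.]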
Habit 3: They employ supplier management for
external sources of data
[Figure: Customer-Supplier Model. Suppliers provide inputs to “Your Process,” which delivers outputs to Customers, with requirements and feedback channels on both sides.]
They expect high-quality data from outside, and they invest (time) with their suppliers to get them.
Habit 4: They measure quality at the
source in business terms
They measure continuously.
They define metrics with clear business implications:
• Private Bank’s Customer Data: Percent of statements with an error
• Telecom’s Access Charges: Risk = Overbilling + Underbilling
• Many organizations: Fraction “perfect” records (interpreted as “work” done correctly)
They get good at interpreting results.
They integrate top-line DQ metrics with other business results.
[Figure: “Time-Series Record-Level Accuracy.” Y-axis: fraction of records completely correct (0.2 to 0.9); x-axis: week (1 to 25).]
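A minimal sketch of the “fraction perfect records” metric tracked week by week, together with the telecom risk metric (Risk = Overbilling + Underbilling). The input layout (a week number plus a list of per-field correctness flags) and the function names are assumptions made for illustration; the slides do not prescribe a data format.

```python
from collections import defaultdict


def fraction_perfect_by_week(records):
    """Return {week: fraction of records with every field correct}.

    `records` is an iterable of (week, field_ok_flags) pairs; a record
    counts as perfect only if all of its flags are True.
    """
    totals = defaultdict(int)
    perfect = defaultdict(int)
    for week, field_ok_flags in records:
        totals[week] += 1
        if all(field_ok_flags):
            perfect[week] += 1
    return {week: perfect[week] / totals[week] for week in sorted(totals)}


def billing_risk(overbilling: float, underbilling: float) -> float:
    """Risk = Overbilling + Underbilling, as stated on the slide."""
    return overbilling + underbilling


# Hypothetical sample: two records per week, three fields each.
sample = [
    (1, [True, True, True]),
    (1, [True, False, True]),
    (2, [True, True, True]),
    (2, [True, True, True]),
]
print(fraction_perfect_by_week(sample))  # {1: 0.5, 2: 1.0}
```

Scoring at the record level rather than the field level ties the metric to work done correctly: a single bad field makes the whole record imperfect.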
Habit 5: They employ controls at all levels to halt
simple errors and establish a basis for moving forward
They employ simple edits to stop errors in their tracks:
Ex: (Title = Mrs., Sex = M) cannot be correct
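A simple edit of this kind can be sketched as a lookup of forbidden value combinations applied at the point of data entry. Only the (Title = Mrs., Sex = M) rule comes from the slide; the field names, the `FORBIDDEN_COMBINATIONS` set, and the functions are hypothetical.

```python
# Forbidden (title, sex) pairs. Only the first rule is from the slide;
# further pairs could be added in the same way.
FORBIDDEN_COMBINATIONS = {("Mrs.", "M")}


def violates_edit(record: dict) -> bool:
    """True if the record contains a combination that cannot be correct."""
    return (record.get("title"), record.get("sex")) in FORBIDDEN_COMBINATIONS


def accept(record: dict) -> dict:
    """Stop the error in its tracks: reject the record at the point of entry."""
    if violates_edit(record):
        raise ValueError(f"inconsistent title/sex: {record}")
    return record


print(violates_edit({"title": "Mrs.", "sex": "M"}))  # True
print(violates_edit({"title": "Mrs.", "sex": "F"}))  # False
```

An edit like this catches only logically impossible values; a record can pass every edit and still be wrong, which is why the statistical control described next is also needed.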
They employ statistical control to identify process
issues early and to look forward:
[Figure: control chart, “Time-Series, Record-Level Accuracy.” Y-axis: fraction of records completely correct (0.2 to 0.9), with upper (UCL) and lower (LCL) control limits; x-axis: week (1 to 37).]
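The slides do not say how the control limits are computed. One common choice for a fraction-correct measure is the p-chart, with limits at three standard deviations around the average fraction. The sketch below assumes that technique and an equal sample size every week; the weekly figures are invented for illustration.

```python
import math


def p_chart_limits(fractions, sample_size):
    """Return (LCL, center line, UCL) for a p-chart with 3-sigma limits.

    Assumes every weekly fraction is based on the same `sample_size`.
    """
    p_bar = sum(fractions) / len(fractions)
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    lcl = max(0.0, p_bar - 3 * sigma)
    ucl = min(1.0, p_bar + 3 * sigma)
    return lcl, p_bar, ucl


def out_of_control(fractions, sample_size):
    """Return the 1-based week numbers whose fraction falls outside the limits."""
    lcl, _, ucl = p_chart_limits(fractions, sample_size)
    return [week for week, p in enumerate(fractions, start=1) if p < lcl or p > ucl]


# Hypothetical weekly fractions of completely correct records (100 records/week).
weekly = [0.80, 0.82, 0.78, 0.81, 0.79, 0.55, 0.80]
print(out_of_control(weekly, sample_size=100))  # [6]
```

Points inside the limits reflect ordinary process variation; a point outside them, such as week 6 here, signals a process issue worth investigating early.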
Habit 6: They have a knack for
continuous improvement
They have a way of not just starting, but completing improvement projects, both to:
• eliminate root causes of error
• acquire new data
[Figure: “Time Series, Record-Level Accuracy.” Y-axis: fraction of records completely correct (0.2 to 0.9); x-axis: week (1 to 48).]
Habit 7: Set and achieve aggressive
targets
They set targets like:
• halve the error rate every year
• add two significant new features every year
They focus not just on the level, but also on the rate of improvement.
They decide to position themselves near the front with respect to quality in their industries.
In many respects, for them planning for quality is no different than planning for revenue growth, new product development, etc.
[Figure: “Time-Series, Record-Level Accuracy.” Y-axis: fraction of records completely correct (0.2 to 0.9); x-axis: week (1 to 55).]
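A target of halving the error rate every year compounds quickly. The sketch below shows the projected error rate under that target and how many years of halving it takes to reach a given overall improvement. The 5% starting rate is the upper end of the typical range quoted earlier in the talk; the function names are illustrative, and the projection is arithmetic, not a forecast.

```python
def error_rate_after(initial_rate: float, years: int, annual_factor: float = 0.5) -> float:
    """Error rate after `years` of multiplying the rate by `annual_factor` each year."""
    return initial_rate * annual_factor ** years


def years_to_improve(improvement_factor: float, annual_factor: float = 0.5) -> int:
    """Whole years needed before the error rate has improved by `improvement_factor`."""
    if not 0 < annual_factor < 1:
        raise ValueError("annual_factor must be between 0 and 1")
    years, remaining = 0, 1.0
    while remaining > 1 / improvement_factor:
        remaining *= annual_factor
        years += 1
    return years


for year in range(4):
    print(year, f"{error_rate_after(0.05, year):.3%}")

print(years_to_improve(100))  # 7
```

Under this target, a two-orders-of-magnitude improvement takes about seven years of sustained halving, which is why the rate of improvement matters as much as the current level.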
Habit 8: Formalize management
accountabilities for data
[Cartoon caption: “I’ve told that CIO about these data problems a million times! Why can’t they get them right?”]
They recognize that responsibility for data lies with “the business,” not IT.
Some codify responsibilities in policy. My favorite (adapted for data):
“Don’t take junk data from the guy upstream. And don’t pass junk data on to the next guy!”
Habit 9: A broad, senior group leads the
effort


They know that quality programs go as far and as fast
as the senior person leading the effort demands.
So a broad, committed, senior team leads the effort.
“They thought they could make the right speeches,
establish broad goals, and leave everything else to
subordinates... They didn’t realize that fixing quality
meant fixing whole companies, a task that can’t be
delegated.”
Dr. Juran, 1993
Experience so far is that “data” is even tougher than the
factory floor.
Habit 10: Recognize that the “hard issues
are soft” and actively manage change
They:
• Distinguish “I” from “IT.” They recognize that they can’t automate their way out of a quality issue.
• Start small. Create early wins.
• Actively manage change.
• Avoid unwinnable battles, especially early on.
• Build political capital.
• Over time, they build data quality into:
  - The organization
  - People’s psyche
  - New systems
Who Does the Work?*
Senior Leadership:
8. Formalize management accountabilities for data
9. Broad, informed, demanding leadership
10. Advance a culture that values data and data quality
Middle Management (Command):
2. Manage processes that create data (so they do so correctly)
3. Manage “suppliers” (both inside the Army and out) of data
Work is highly interconnected
Everyone who touches data = Four Basic “Steps”
1. Focus on the most important needs (of customers)
4. Measure quality levels against customer needs
5. Deploy controls, at all levels, to remain error-free
6. Improvement: Find and eliminate root causes of error
7. Set and meet aggressive targets for improvement: top-to-bottom
Taken together, the tasks define an overall “Management System for Data Quality”
*Ten habits of those with the best data from Redman, Data Driven: Profiting from
Your Most Important Business Asset, Harvard Business Press, 2008.
The ten habits reinforce one another*
[Figure: diagram linking the ten habits. Nodes: 9. Broad senior leadership; 8. Data Policy; 10. Manage Data Culture; 3. Supplier Mgmt; 2. Process Management; 1. Customer Needs; 4. Measurement; 5. Control; 6. Improvement; 7. Quality Planning. Arrow labels include “defines accountabilities via,” “must advance,” “supports,” “deployed to,” “responsible for meeting,” “leads to,” “monitor conformance using,” “underlies everything,” “to better meet,” “a platform for,” “identify ‘gaps’ using,” and “set targets for.”]
*This figure adapted from Redman, Data Quality: The Field Guide, Digital Press, Boston, 2001
The Ten Habits apply to all data, in all
industries and government






Market, product, and people (customer and employee)
data. Intelligence, scientific and logistics data. Health
care data.
Data created internally or gathered from external
sources.
Meta-data, master data, enterprise data.
Data to be stored on paper, in operational systems, in
warehouses, enterprise systems.
Client statements, 10-Ks, prospectuses.
Data only seen by computers and data that convince
people to trust industries and companies (or not).
Fundamental Organization Unit for Data Quality
[Figure: organization chart. Boxes: Leadership; Supplier Management; Tech Support Manager; Process Management; Requirements Team; Customer Team; Measurement Team (shown twice); Control Team (shown twice); Improvement Teams*.]
*Quality Improvement facilitator is a permanent role, supporting a series
of project teams, which disband when their projects complete
Current “Best” Overall Organization Structure for Data Quality*
“PROCESS VIEW:” These structures overlaid on current organization chart
Chief Information Office (TECH):
• Technical infrastructure
• Security/privacy impl.
• Build DQ features into new systems
• (One-time) data cleanups
Chief Data Office:
• Day-in, day-out leadership
• Secretary to Council
• Metadata process owners
• Training
• Deep expertise
• Supplier Program Office
Data Council:
• Leadership
• Data Policy
• Define process and supplier structure
• Advance Data Culture
Process A, Process B (metadata): Manage and improve data, following agreed process mgmt methods
Supplier C, Supplier D: Manage and improve data, following agreed supplier mgmt process
Improvement project team: Follow agreed method to complete assigned project
… Other improvement projects …
*This figure adapted from Redman, Data Quality: The Field Guide, Digital Press, Boston, 2001
Federated Org Structure for Data Quality
[Figure: federated organization chart. Boxes: Org Head; Senior Data Board (DQ Policy); Chief Data Office (Policy Deployment, DQ Transparency, Common Methods, Audit); Department Leadership; Dpt DQ Team; Creators; Customers; Audit. Annotation: “Primary responsibility for DQ.”]
© NCG and DBP, 2012
Department Organization for Data Quality
[Figure: department organization chart. Boxes: Department Head; Head of Data Program; Dept Data Committee; Dept Data Team; Training Team; Supplier Mgmt Team; Business Case Team; Facilitation Team; Metrics Team; Metadata/Stds Team; Control Team; Services Team; Org’s Data Board (rqmts). Under Functional Activities: Data Supplier, Data Creation, and Data Customer, connected by input, output, rqmts, and feedback, with the DQ facilitator coordinating. Tech Support: IT (systems, databases, metadata repository, tools).]
Note: The DQ facilitator leads efforts to understand customer needs, conduct improvement projects, etc.
Ideally, the facilitator reports into functional management, with a dotted line to the data team, as pictured.
Final Remarks
Those with the best data:
• Adopt a customer-facing definition of quality.
• Aim to prevent errors at the points of data creation, rather than correcting them downstream.
• Follow “ten habits” that align the entire organization.
• Enjoy rich rewards for their troubles!
What Did He Say?
Questions?
Thomas C. Redman, Ph.D.
+1 732-933-4669
tomredman@dataqualitysolutions.com