The Ten Habits of Those with the Best Data Thomas C. Redman, Ph.D. Navesink Consulting Group at the Toronto DAMA May 17, 2012 tomredman@dataqualitysolutions.com Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T.C. Redman, Page 1 Introduction and Summary Those with the best data: Adopt a customer-facing definition of quality. Aim to prevent errors at the points of data creation, rather than correcting them downstream. Follow “ten habits” that align the entire organization. Enjoy rich rewards for their troubles! Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 2 Agenda What “the best data” looks like Thinking about quality An “non-delegatable” choice The ten habits My (evolving) views on organization Questions anytime Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 3 Market Data Vendor Background: Financial services companies purchase market data from companies such as Reuters, Bloomberg, etc. Lack of trust causes them to purchase basic data from multiple sources. Bank request: Far better data, so it could reduce its vendor base and eliminate downstream costs of bad data. Work conducted: Clear statement of customer needs. Measurement against those needs. Root causes identified and addressed, one at a time. Statistical control. In the course of day-in, day-out work. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 4 Market Data Example: Results 1. Rqmts defined First-time, on-time results Fraction Perfect Records 1 0.9 0.8 2. First Meas 4. Control 0.7 3. Improvements 0.6 0.5 0 5 10 15 20 Month Program start Accuracy Rate ave lower control limit upper control limit target Each error not made saves an average of $500. Quickly millions! The day-in, day-out work of data quality management is conducted at the work group level Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 5 Access Financial Assurance at AT&T Background: AT&T expenditures for “access” about $20B/yr. Access Financial Assurance aims to ensure integrity of access bills, through parallel “billing.” Key Idea: Get the bill right the first time. Work conducted: Dissatisfied middle manager, seeking a better way. Top-down deployment. Staff group defined series of deliverables, then audited (regional) compliance. Supplier and process management. Customer needs, measurement, improvement, control. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 6 Results: Access Financial Assurance Data accuracy improved 90%. Billing errors reduced 98%. Cycle time (bill period closure) reduced 67%. AT&T costs (of financial assurance) reduced 73% ($100M/year). LEC costs (of access billing) reduced 20%. There is much hidden “non-value-added work” built in to accommodate bad data. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 7 Enterprise Programme at BT* (British Telecom) Revenue: $33 billion/yr Employees: 95,000 Operates in 170 countries 22 Million customers (4 Million business) Enterprise Data Quality Improvement Programme (10-year effort) Recognized the inherent complexity of people, process, technology issues (e.g., data quality problems masquerading as “systems issues.”) Explicit linkage of data (quality improvement) to strategic business objectives (e.g., business transformation). Over time, magnitude of DQ problem understood and exposed. Governance structure starting at the very top. Consolidated expertise in data quality improvement in IT. Estimated and delivered benefits vetted by Finance. *This summary largely courtesy of Nigel Turner, who led BT’s programme and is now at Trillium Software. He has vetted this summary. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 8 BT – Results Enterprise Data Quality Improvement Programme, cont’d Dual focus on reducing capital expenditure and the rework that results from searching for “lost network facilities.” Problem discovery, measurement, audits, new controls (hold the gains) enhanced by Trillium DQ tool suite. Focused on “big improvement projects” (delivered 75 over ten years). Dual focus on data clean-up and process improvement. Business Benefits: > $1B (verified and conservative) Also improved customer satisfaction, better regulatory compliance, reversed brand damage and revenue leakage, and contributed to business transformation: These benefits not quantified. More than you might think, data permeate everything. Bad data are “silent killers. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 9 High Quality in Your Mind’s Eye Redman’s Favorites Common Characteristics Apple products Relatively few defects. Italian tile C&D Heating and Cooling They are corrected in a prompt, friendly manner. Easy-to-use Disneyland Make it easier to do the things I want to do. Sleek design! Trust the company!! Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 10 Contrast the I-Phone with a Data Model PARTY RELATIONSHIP PARTY to on the other side of PERSON from on one side of an example of ORGANIZATION embodied in PARTY RELATIONSHIP TYPE E.g., "Corporate structure” "Employment" "Membership" etc. Redman-Toronto-ten habits-May2012 Or a financial statement! © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 11 Data Quality Data are of high quality if they are fit for their intended uses (by customers) in operations, decision-making, and planning (after Juran). Data that’s fit for use free of defects: possess desired features: - accessible - relevant - accurate - comprehensive - timely - proper level of detail - complete - easy-to-read - consistent with other sources - easy-to-interpret - etc. - etc. Customers are the ultimate arbiters of quality!! Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 12 Data Quality - aspirational “Exactly the right data and information in exactly the right place at the right time and in the right format to complete an operation, serve a customer, make a decision, or set and execute strategy.”* *Redman, Data Driven: Profiting from Your Most Important Business Asset, Harvard Business Press, 2008 Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 13 Data Quality – day-in, day-out Meeting the most important needs of the most important customers. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 14 Data Quality: The Non-delegatable Choice Unmanaged To Clean Up The Lake, One Must First Eliminate The Sources Of Pollutant Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 15 They recognize that, left alone, accountability shifts downstream!!! Here’s how you do number 3, son cos2(x) + sin2(x) = 1 Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 16 The (nearly-certain) results Approach Find and Fix (First-Gen) Prevent Future Errors (Sec-Gen) Redman-Toronto-ten habits-May2012 Management Focus Typical Error Rate Cost of Poor Data Quality The Past 1-5% (at the field level) 20% of revenue The Future Two orders of magnitude better Reduced by two-thirds © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 17 Habit 1: Focus on the most important needs of the most important customers Those with the best data adopt a customer-facing definition of quality. In doing so, they recognize that: All data are not created equal. Similarly, customers, problems, and business opportunities are not created equal. Generally, the most important data are those needed to set and execute the company’s most important business strategies. And they focus as much of their energies on these customers, strategies, and data. Said differently, their data quality programs are fully aligned with business strategy. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 18 Data Doc’s Hierarchy of Needs Many people and organizations exhibit a “hierarchy of “needs” 5. Keep data safe from harm 4. Understand how data fit with other data 3. Understand meaning 2. Trust that data are correct 1. Acquire the data they need Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 19 Habit 2. Process, process, process They recognize that they create data via their crossfunctional business processes A B They recognize that most errors occur “in the white space” Redman-Toronto-ten habits-May2012 C D They think “BIG-P” They recognize “the next guy” (serving the customer) as a customer © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 20 Data - Defined A datum consists of three elements: (entity, attribute, value) The value assigned The thing of interest to the attribute for in the real-world the entity The particular of interest Example: (Jane Doe, Service Record Date = July 1, 1996) Note that, as defined, data are abstract. “Customers” see them as they are presented in tables, databases, graphs, etc. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 21 Implications… Thus even the simplest datum arises from three distinct sources: The model (entity, attribute)-pair is created within a modeling process, usually by IT or purchased from outside. The data value is created (at enormous rates) by the business process. The presentation may be created by database tools, application programs, PowerPoint presenters, etc in an application development process. All three processes must be managed end-to-end for highquality data to result. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 22 They use the Customer-Supplier Model to establish requirements and feedback loops requirements requirements inputs outputs “Your Process” Suppliers feedback Customers feedback BIG-P process Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 23 Habit 3: They employ supplier management for external sources of data requirements requirements inputs outputs “Your Process” Suppliers feedback Customers feedback They expect high-quality data from outside. And invest (time) with their suppliers to get them Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 24 Habit 4: They measure quality at the source in business terms They measure continuously Redman-Toronto-ten habits-May2012 Time-Series Record-Level Accuracy fraction records completely correct They define metrics with clear business implications. Private Bank’s Customer Data: Percent of statements with an error Telecom’s Access Charges: Risk = Overbilling + Underbilling Many organizations: Fraction “perfect” records (interpreted as “work” done correctly) 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 1 3 5 7 9 11 13 15 17 19 21 23 25 week They get good at interpreting results They integrate top-line DQ metrics with other business results © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 25 Habit 5: They employ controls at all levels to halt simple errors and establish a basis for moving forward They employ simple edits to stop errors in their tracks: Ex: (Title = Mrs., Sex = M) cannot be correct They employ statistical control to identify process issues early and to look forward: records completely correct Time-Series, Record-Level Accuracy 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 UCL LCL 1 Redman-Toronto-ten habits-May2012 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 week © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 26 Habit 6: They have a knack for continuous improvement records completely correct Time Series, Record-Level Accuracy 0.9 0.8 0.7 0.6 They have a way of not just starting, but completing improvement projects, both to: • eliminate root causes of error 0.5 0.4 0.3 0.2 1 4 7 10 13 16 19 22 25 28 31 34 37 40 42 46 48 • acquire new data week Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 27 Habit 7: Set and achieve aggressive targets They set targets like: • half the error rate every year • add two significant new features every year Time-Series, Record-Level Accuracy records completely correct They focus not just on the level, but also on the rate of improvement 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 1 They decide to position themselves near the front with respect to quality in their industries Redman-Toronto-ten habits-May2012 4 7 10 13 16 19 22 25 28 31 34 37 40 42 46 48 52 55 week In many respects, for them planning for quality is no different than planning for revenue growth, new product development, etc. © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 28 Habit 8: Formalize management accountabilities for data I’ve told that CIO about these data problems a million times! Why can’t they get them right? They recognize that responsibility for data lies with “the business,” not IT. Some codify responsibilities in policy. My favorite (adopted for data): “Don’t take junk data from the guy upstream. And don’t pass junk data on to the next guy!” Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 29 Habit 9: A broad, senior group leads the effort They know that that quality programs go as far and fast as the senior person leading the effort demands. So a broad, committed, senior team leads the effort. “They thought they could make the right speeches, establish broad goals, and leave everything else to subordinates... They didn’t realize that fixing quality meant fixing whole companies, a task that can’t be delegated.” Dr. Juran, 1993 Experience so far is that “data” is even tougher than the factory floor. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 30 Habit 10: Recognize that the “hard issues are soft” and actively manage change They: Distinguish “I” from “IT.” They recognize that they can’t automate their way out of a quality issue. Start small. Create early wins. Actively manage change. Avoid unwinnable battles, especially early on. Build political capital. Over time, they build data quality into: The organization People’s psyche To new systems Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 31 Who Does the Work?* 8. Formalize management accountabilities for data 9. Broad, informed, demanding leadership 10. Advance a culture that values data and data quality Senior Leadership: Middle Management (Command): 2. Manage processes that create data (so they do so correctly) Work is highly interconnected 3. Manage “suppliers” (both inside the Army and out) of data Everyone who touches data = Four Basic “Steps” 1. Focus on the most important needs (of customers) 4. Measure quality levels against customer needs 5. Deploy controls, at all levels, to remain error-free* 6. Improvement: Find and eliminate root causes of error 7. Set and meet aggressive targets for improvement: top-to-bottom Taken together, the tasks define an overall “Management System for Data Quality” *Ten habits of those with the best data from Redman, Data Driven: Profiting from Your Most Important Business Asset, Harvard Business Press, 2008. Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 32 The ten habits reinforce one another* 9. Broad senior leadership Defines Accountabilities via Must Advance 8. Data Policy 10. Manage Data Culture Supports Supports Deployed To Deployed To 3. Supplier Mgmt 2. Process Management Responsible for meeting Leads To Responsible for meeting 1. Customer Needs Monitor Conformance Using 4. Measurement Underlies Everything To Better Meet 5. Control A Platform For Identify "gaps" using 7. Quality Planning 6. Improvement Set Targets For *This figure adapted from Redman, Data Quality: The Field Guide, Digital Press, Boston, 2001 Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 33 The Ten Habits apply to all data, in all industries and government Market, product, and people (customer and employee) data. Intelligence, scientific and logistics data. Health care data. Data created internally or gathered from external sources. Meta-data, master data, enterprise data. Data to be stored on paper, in operational systems, in warehouses, enterprise systems. Client statements, 10-Ks, prospectuses. Data only seen by computers and data that convince people to trust industries and companies (or not). Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 34 Fundamental Organization Unit for Data Quality Leadership Supplier Management Tech Support Manager Process Management Requirements Team Customer Team Measurement Team Measurement Team Control Team Control Team Improvement Teams* *Quality Improvement facilitator is a permanent role, supporting a series of project teams, which disband when their projects complete Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 35 Current “Best” Overall Organization Structure for Data Quality* “PROCESS VIEW:” These structures overlaid on current organization chart Chief Information TECH Office • Technical Infrastructure • Security/Priv acy impl. • Build DQ features into new systems • (One-time) data cleanups Chief Data Office • Day-in, dayout leadership • Secretary to Council • Metadata process owners • Training • Deep expertise • Supplier Program Office Data Council • Leadership • Data Policy • Define process and supplier structure • Advance Data Culture Process A Manage and improve data, following agreed process mgmt methods Improvement project team Follow agreed method to complete assigned project Supplier C Process B (metadata) Manage and improve data, following agreed supplier mgmt process) Supplier D … Other improvement projects … *This figure adapted from Redman, Data Quality: The Field Guide, Digital Press, Boston, 2001 Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 36 Federated Org Structure for Data Quality Org Head Senior Data Board DQ Policy Department Leadership Creators Dpt DQ Team Customers Primary responsibility for DQ Redman-Toronto-ten habits-May2012 Chief Data Office • Policy Deployment • DQ Transparency • Common Methods © NCG and DBP, 2012 - Audit • Audit TCR, Page 37 Department Organization for Data Quality Department Head Head of Data Program Dept Data Training Team Team Dept Data Committee Supplier Business Facilitation Metrics Metadata/ Control Services Mgmt Case Team Team Team Stds team Team Team Team rqmts Functional Activities Org’s Data Board rqmts Data Creation output input coord DQ Data Data facilitator Customer Supplier feedback feedback Support Tech IT: Systems, databases, metadata repository, tools Note: DQ facilitator leads efforts to understand customer needs, conduct improvement projects, etc. Ideally, reports into functional management, with a dotted line to the data team, as pictured Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 TCR, Page 38 Final Remarks Those with the best data: Adopt a customer-facing definition of quality. Aim to prevent errors at the points of data creation, rather than correcting them downstream. Follow “ten habits” that align the entire organization. Enjoy rich rewards for their troubles! Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 39 What Did He Say? Questions? Thomas C. Redman, Ph.D. +1 732-933-4669 tomredman@dataqualitysolutions.com Redman-Toronto-ten habits-May2012 © Navesink Consulting Group, 2000-2012 T. C. Redman, Page 40