Panel_3_2013_IMS_Data_Quality_Panel_0918... 3737KB Feb 10

Data Quality: What You Need to Know to Create and Sustain a Data Quality Program
Panel Members
➢ Daniel Wallace, Manager, Financial Informatics
  Arkansas Blue Cross & Blue Shield
➢ Gayle Bunn, Data Warehouse Analyst, EDW
  Blue Cross and Blue Shield of Idaho
➢ Amit Bhagat, President & Principal Consultant
  Amitech Solutions
Data Quality
Panel Objectives:
To share information and insight on:
▪ The overall organizational approach to creating and sustaining a data quality program
Panel Presentation
Please provide us with a brief overview of the overall approach to creating and sustaining the data quality program in your organization.
What You Need to Know to Create and
Sustain a Data Quality Program
Daniel Wallace
Manager, Financial Informatics
Arkansas Blue Cross & Blue Shield
Contact Info: Phone: 501-396-4090
Email: dpwallace@arkbluecross.com
Agenda
➢ Creating a Data Quality Program
  ▪ The People
  ▪ The Scope
  ▪ The Processes
  ▪ The Tools
➢ Sustaining a Data Quality Program
  ▪ Policy
  ▪ Communication
  ▪ Demonstrate Value
➢ Creating a Data Quality Program
  ▪ The People
    – Knowledge of the Business
    – Multidiscipline Staff
    – Skill Set
      • Ability to handle large and complex datasets
      • Ability to test and verify systems processes to understand causes of data issues
      • Ability to query/profile data using SQL, SAS, Excel
      • Ability to communicate with business areas and management
➢ Creating a Data Quality Program
  ▪ The Scope
    – Importance of Defining
      • Likely to solve a real problem
      • Able to quantify value of DQ program
    – Where to Begin
      • System Level?
      • Process Level?
      • Subject Area Level?
      • Application Level?
      • Project Level?
➢ Creating a Data Quality Program
  ▪ The Processes (Assess, Improve)
  ▪ Assessment
    – Data Profiling
    – Define DQ Rules
    – Define Measure (from DQ rules)
  ▪ Improvement
    – Data Cleansing
    – Improve Processes
    – Measure Quality
    – Monitor Quality
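The assessment steps listed above (profile the data, define DQ rules, derive a measure from the rules) could be sketched roughly as follows. This is only an illustration: the claim records, field names, and the two rules are hypothetical and are not taken from the presentation.

```python
# Hypothetical claim records; field names and values are illustrative only.
records = [
    {"claim_id": "C1", "member_id": "M1", "paid_amount": 120.0},
    {"claim_id": "C2", "member_id": "",   "paid_amount": 75.5},
    {"claim_id": "C3", "member_id": "M2", "paid_amount": -10.0},
    {"claim_id": "C4", "member_id": "M3", "paid_amount": 40.0},
]

def profile(rows, field):
    """Data profiling: basic counts that describe one field."""
    values = [row.get(field) for row in rows]
    missing = sum(1 for v in values if v in (None, ""))
    return {"rows": len(values), "missing": missing, "distinct": len(set(values))}

# DQ rules (assumed for this sketch): a name plus a per-row pass/fail test.
rules = {
    "member_id_present": lambda row: row["member_id"] not in (None, ""),
    "paid_amount_non_negative": lambda row: row["paid_amount"] >= 0,
}

def measure(rows, rule_set):
    """Measure derived from the rules: percent of rows passing each rule."""
    return {
        name: 100.0 * sum(1 for row in rows if test(row)) / len(rows)
        for name, test in rule_set.items()
    }

member_profile = profile(records, "member_id")
scores = measure(records, rules)
print(member_profile)
print(scores)
```

Rerunning the same measure after cleansing or a process fix gives the "Measure Quality" and "Monitor Quality" steps a comparable number over time.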
➢ Creating a Data Quality Program
  ▪ The Tools
    – Purpose/Need
      • Understanding your data
      • Profiling and Rule Discovery
      • Data Standardization
      • Data Cleansing
      • Metadata Management
    – People Manage Data Quality, Not Tools
➢ Sustaining a Data Quality Program
  ▪ The Need for a DQ Policy
  ▪ Policy Guidelines
    – Treat Information as a Product/Asset
    – Focus on the Business Side
    – Define Roles and Responsibilities
    – Resolution Management
    – Proactive Approach
    – Data Standards
➢ Sustaining a Data Quality Program
  ▪ Communication
    – Can make or break your DQ initiatives
    – Stakeholders
      • Their Role in DQ/DG Program
      • Successful DQ program must be done with them
      • Include all functional areas that create or use data
      • Regular meetings needed
➢ Sustaining a Data Quality Program
  ▪ Demonstrate Value & Communicate It
    – Identify DQ Issue to Target
    – Engage Management
    – Select Metrics to Measure, Establish Baseline
    – Implement Solution
  ▪ A DQ program can mitigate inefficiencies, excessive costs associated with poor data, and compliance risks, and can improve customer satisfaction
Gayle Bunn
Blue Cross of Idaho
Biography
➢ Gayle Bunn, MBA, PMP, BSEE
➢ Data Warehouse Analyst
  ▪ Enterprise Data Warehouse (EDW)
➢ Blue Cross of Idaho
➢ Responsible for EDW Data Quality, Support & Maintenance, Training, Customer Service, and Data Governance
➢ Contact Info:
  ▪ Phone: (208) 331-7487
  ▪ Email: gbunn@bcidaho.com
Current Steps at BCI
1. Started small – EDW focus
2. Established data quality workflow
3. Established 1 automated touch point
4. Added initial data quality metrics
   • Timeliness
   • Completeness
5. Socialized timeliness
6. Socialized completeness
7. Data quality evolved into many flavors
8. Established S.M.A.R.T. data quality metrics
9. Performed ongoing process improvement
10. Major milestone occurred!
11. Data governance and MDM emerges
1. Started Small – EDW Focus
Diagram: Data Analyst Community; Enterprise Data Warehouse (EDW): Member, Medical, Dental, Drug; EDW Team.
Callouts: “We need better data quality!”; “We need to work together & discuss issues!”
Outcome: Data Quality Review Team (DQRT) formed.
2. Established Data Quality Workflow
Diagram: Data Analyst Community; Data Quality Review Team (DQRT); EDW Team; Enterprise Data Warehouse (EDW).
Workflow labels: Document data quality issues; Prioritize; Manual Fix; Mark Fixed!
SharePoint List fields: Title; Description; Assigned To; Resolved (yes/no).
Callouts: “Wrong?”; “The data is still wrong!”; “Faster please!”; “Yay!”
3. Established 1 Automated Touch Point
Diagram: Extract; X-form; Load; Enterprise Data Warehouse (EDW): Member, Medical, Dental, Drug. TP = Touch Point.
1 Automated Touch Point: Check for missing data. Stop load if data is not complete!
Callouts: “Some of the data is missing!”; “Yay!”; “Can we have the data faster?”; “Can we have more data?”; “Cool!”; “Hard to please!”; “We need Service Level Agreements (SLAs)!”
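As a rough sketch of the touch point described on this slide (check for missing data, stop the load if the data is not complete), something like the following could sit in front of the load step. The subject areas come from the slide; the row counts, function names, and exception class are assumptions made for illustration, and the actual BCI implementation is not shown in the deck.

```python
class IncompleteDataError(Exception):
    """Raised by the touch point to stop the load (hypothetical name)."""

# Subject areas named on the slide.
REQUIRED_AREAS = ("Member", "Medical", "Dental", "Drug")

def touch_point(extract_counts):
    """Check for missing data; raise to stop the load if any area is empty."""
    missing = [area for area in REQUIRED_AREAS if extract_counts.get(area, 0) == 0]
    if missing:
        raise IncompleteDataError("missing data for: " + ", ".join(missing))

def run_load(extract_counts):
    """The touch point runs before the load, so a failure blocks the load."""
    touch_point(extract_counts)
    return "loaded"

# Made-up extract row counts for two runs.
complete = {"Member": 1000, "Medical": 5200, "Dental": 800, "Drug": 2300}
partial = {"Member": 1000, "Medical": 5200, "Dental": 0}

status_ok = run_load(complete)
try:
    status_bad = run_load(partial)
except IncompleteDataError as err:
    status_bad = "stopped: " + str(err)

print(status_ok)
print(status_bad)
```

Raising an exception rather than logging a warning is a design choice in this sketch: it makes "stop load" the default behavior when the check fails.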
4. Added Initial Data Quality Metrics
Diagram: Extract; X-form; Load; Enterprise Data Warehouse (EDW): Member, Medical, Vision, Grouper, Sales, Dental, Drug, Premium. Touch points (TP) plus a New Touch Point. More Data Delivered.
Fix: Automate Fix for Common Problems.
Metrics:
• Timeliness: Jobs completed on time.
• Completeness: Amount of data without noise. Noise = Missing data in Fact Tables.
Callouts: “Yay!”; “We need to socialize this!”; “What does ‘completeness’ mean?”; “Very cool!”
5. Socializing 1st Metric - Timeliness
EDW SLA - SharePoint
• Manual: Track when weekly/monthly jobs complete.
SQL Server Reporting Services (SSRS)
• Automate: Graph when jobs miss SLA.
Callouts: “I can tell when jobs finish!”; “I can see where to improve!”
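The timeliness metric on this slide (track when jobs complete, flag the ones that miss the SLA) might be computed along these lines. The job names, deadlines, and completion times are invented for the example; the slide itself tracks this in SharePoint and SSRS rather than in Python.

```python
from datetime import datetime

# Hypothetical job log: (job name, SLA deadline, actual completion time).
jobs = [
    ("weekly_member_load", datetime(2013, 9, 2, 6, 0), datetime(2013, 9, 2, 5, 40)),
    ("weekly_claims_load", datetime(2013, 9, 2, 6, 0), datetime(2013, 9, 2, 7, 15)),
    ("monthly_premium_load", datetime(2013, 9, 1, 8, 0), datetime(2013, 9, 1, 8, 0)),
]

def timeliness(job_log):
    """Return (% of jobs completed on time, names of jobs that missed the SLA)."""
    missed = [name for name, deadline, finished in job_log if finished > deadline]
    pct_on_time = 100.0 * (len(job_log) - len(missed)) / len(job_log)
    return pct_on_time, missed

pct_on_time, missed_sla = timeliness(jobs)
print(round(pct_on_time, 1), missed_sla)
```

A job that finishes exactly at the deadline is counted as on time here; that boundary is an assumption, since the slide does not define it.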
6. Socializing 1st Metric - Completeness
SQL Server Integration Services (SSIS) to SharePoint: Track Noise in Fact Tables
• Automate: Track noise in data.

Dimension PK    Value
-1              Not Applicable
-2              Error
-3              Missing
-4              Default

NOISE: Count when dimension data is not available in a Fact record (PK<0).
SQL Server Reporting Services (SSRS)
• Automate: Graph when noise issues occur.
Callouts: “Only 2.19% noise? The data is more complete than I thought!”; “I can see where to improve!”; “What do we mean when we say data quality anyway?”
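The noise count defined on this slide (a fact record whose dimension key is negative) can be illustrated with a short sketch. The negative key codes are the ones shown on the slide; the fact rows and column names are hypothetical.

```python
# Negative dimension keys and their meanings, as listed on the slide.
NOISE_CODES = {-1: "Not Applicable", -2: "Error", -3: "Missing", -4: "Default"}

# Hypothetical fact rows; column names are illustrative only.
fact_rows = [
    {"claim_id": 1, "member_key": 101, "provider_key": 7},
    {"claim_id": 2, "member_key": -3,  "provider_key": 9},
    {"claim_id": 3, "member_key": 102, "provider_key": -1},
    {"claim_id": 4, "member_key": 103, "provider_key": 12},
]
DIMENSION_KEYS = ("member_key", "provider_key")

def noise_report(rows, key_columns):
    """Return (% of fact rows with any key < 0, count of noisy keys by code label)."""
    noisy_rows = 0
    by_code = {}
    for row in rows:
        bad_keys = [row[col] for col in key_columns if row[col] < 0]
        if bad_keys:
            noisy_rows += 1
        for key in bad_keys:
            label = NOISE_CODES.get(key, "Unknown")
            by_code[label] = by_code.get(label, 0) + 1
    return 100.0 * noisy_rows / len(rows), by_code

noise_pct, noise_by_code = noise_report(fact_rows, DIMENSION_KEYS)
print(noise_pct, noise_by_code)
```

Breaking the count down by code is an addition in this sketch; it separates "Missing" or "Error" keys from "Not Applicable" ones, which may call for different follow-up.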
7. Data Quality Evolved into Many Flavors
• Accurate: Reconciles Appropriately
• Complete: No Noise (missing data)
• Valid: Appropriate Data
• Integrity: Correct Business Rules
• Consistent: Matches Source
• Timely: On Time Delivery
Callouts: “I have a data quality problem!”; “You mean opportunity!”; “What flavor?”
Successfully Performed in BCI’s Enterprise Data Warehouse (EDW)
8. Established Data Quality Metrics
Data Quality Metrics
Accuracy (Reconciles)
• % data loads where data reconciles
• # accuracy incidents
Consistency (Match Source)
• % data loads where data matches source
• # consistency incidents
Timeliness (Right Time)
• % data loads delivered on-time
• # timeliness incidents
Integrity (Right Rules)
• % loads with appropriate business rules applied
• # integrity incidents
Validity (Right Data)
• % loads with appropriate date range
• # validity incidents
Completeness (No Noise)
• % records without noise (missing data)
• # noise incidents
Potential Data Quality Metrics
Accessibility
• % of Critical Data Fields provided
Uniqueness
• % total where duplicate records exist
Compliance
• # of regulatory noncompliance data issues with HIPAA, PHI
Efficiency
• Avg. time taken for data quality issues to be resolved
VALUE
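Most of the metrics above share one shape: a percentage of loads that pass a check plus a count of incidents. As a sketch of the accuracy metric ("% data loads where data reconciles"), the example below compares a control total from the source with the total landed in the warehouse. The load log, the use of totals in cents, and the exact-match rule are assumptions; the deck does not say how reconciliation is performed.

```python
# Hypothetical load log with control totals in cents (integers avoid rounding issues).
loads = [
    {"load_id": "L1", "source_total": 1525000, "edw_total": 1525000},
    {"load_id": "L2", "source_total": 980050,  "edw_total": 979050},
    {"load_id": "L3", "source_total": 431275,  "edw_total": 431275},
    {"load_id": "L4", "source_total": 220000,  "edw_total": 220000},
]

def reconciliation_metric(load_log):
    """Return (% of loads whose totals match, ids of loads that did not reconcile)."""
    incidents = [l["load_id"] for l in load_log if l["source_total"] != l["edw_total"]]
    pct = 100.0 * (len(load_log) - len(incidents)) / len(load_log)
    return pct, incidents

pct_reconciled, incident_ids = reconciliation_metric(loads)
print(pct_reconciled, incident_ids)
```

The same two-number pattern (percent passing, incident list) could be reused for the consistency, integrity, and validity metrics by swapping in a different per-load check.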
9. Performed Ongoing Process Improvement
Diagram: Data Sources; Enterprise Service Bus (ESB); Extract; X-form; Load; Enterprise Data Warehouse (EDW). Touch points (TP) and “Fix!” labels appear along the flow.
Validate labels: Accuracy (Reconcile); Completeness (no noise); (matches source).
• Use Data Quality metrics to identify issues other TPs don’t.
• Use Data Quality process to fix source issues.
10. Major Milestone Occurred
Milestone: No Issues!
Diagram: Data Analyst Community; Data Quality Review Team (DQRT); EDW Team; Enterprise Data Warehouse (EDW).
SharePoint List fields: Title; Description; Assigned To; Resolved (yes/no); Data Quality Area.
Callouts: “Finally!”; “Yay!”; “Yes!”; “There’s one in every crowd!”
11. Data Governance & MDM Emerges
Diagram: Master Data Management (MDM); Data Governance; Data Quality (Complete, Valid, Accurate, Consistent, Integrity, Timely).
With Success: The small bird’s chirp of data quality was heard!
• Data Governance is emerging around Data Quality
• MDM is emerging around Data Governance
Critical Success Factors at BCI
1. Gain Steering Committee sponsorship
2. Establish a clear Mission Statement/Purpose
3. Develop Program Goals for the Team
4. Establish cross-functional DQRT representation (including across IS)
5. Create a non-blame, non-judgmental environment
6. Use a divide and conquer approach to issue resolution (broad participation)
7. Establish continuous improvement over time (Rome was not built in a day)
8. Conduct regular meeting schedule, frequency dependent on need
9. Appoint a data quality champion
“Data Quality”: What You Need to Know to Create and Sustain a Data Quality Program
Amit Bhagat
President & Principal Consultant
Amitech Solutions
Contact Info:
Phone: 314-480-6301
Email: Amit.Bhagat@amitechsolutions.com
Agenda
➢ DQ Symptoms
  ▪ Use Case
➢ DQ Myths & Reality
➢ DQ Design
  ▪ Approach
  ▪ Business Need
  ▪ Define
  ▪ Profile
  ▪ Remediate
  ▪ Sustain
DQ Symptoms
➢ “The data is wrong – I will do it myself.”
➢ “We spent $5 million on the ‘claims’ system and it still sends incorrect payments.”
➢ “We get a different member month count depending on whom we ask.”
➢ “We are not sure if our MLR is correct.”
Use Case
➢ Business Problem
  ▪ Ensure accurate risk scoring for membership under the ACA for payment transfer between carriers.
➢ Data Profiling
  ▪ Missing or incorrect diagnosis code in claims data.
➢ Outcomes
  ▪ Pay other plans, potentially 2% or more of loss ratio because we may "appear" healthier than others in our market.
  ▪ Focus on diagnosis code as a critical data element.
DQ Myths & Reality
➢ Myths
  ▪ Quality is solved by technology alone.
  ▪ Quality is an IT problem.
  ▪ Quality is best fixed at the point of entry.
  ▪ Quality is the sole responsibility of the data “owners.”
  ▪ Quality requires all data to be perfect.
➢ Reality
  ▪ Quality requires people, process, culture, and technology to work in concert.
  ▪ Quality is a “fit for purpose” process that delivers the highest data quality over time.
DQ Design: Approach
• Business Need: Member Retention & Growth
• Function: Sales, Marketing, Customer Service
• Data Domain: Membership
• Attributes: Email, Phone Number
DQ Design: 5 Step Process
Cycle diagram centered on “Optimal DQ”:
1. Business need
2. Define
3. Profile
4. Remediate
5. Sustain
1. Business Need
➢ Determine the scope and business relevance of DQ effort.
Steps:
➢ Acquire business goals
➢ Identify levers
➢ Identify components
➢ Identify candidates for DQ
Table row labels: Objective; Business Action; Information use (levers); Data Components; Data Candidates.
Table entries (in slide order; the row each entry belongs to is not recoverable from the extracted text):
• Improve member retention by 5%
• Increase member satisfaction, improve customer service.
• Reduce hold time, improve member portal for self service, provide mobile app for provider directory
• Identify dissatisfied members
• Member Satisfaction Surveys
• Customer Service
• Premium & Claims
• Survey
• Premium & Claims by Product
• Membership, Customer Service call data
2. Define: DQ Objectives & Measures
➢ Identify completion criteria for current DQ iteration:
  ▪ Reduce member duplicates by 10%.
➢ Determine metrics to be developed:
  ▪ What you are measuring (measure).
  ▪ When you are measuring (milestone).
  ▪ Why you are measuring (business impact).
Business Driver: Accurately calculate the number of net new members
Sample Data Quality Metrics:
• Number of duplicate members
• Number of Members with missing SSN
• Number of Members without Member ID
• Number of Members with missing address
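The sample metrics above are all simple counts over the membership data, so a baseline could be taken with something like the sketch below. The member records are fabricated, and the duplicate rule (same normalized name and date of birth) is an assumption; the slide does not say how duplicates are identified.

```python
from collections import Counter

# Fabricated member records for illustration; SSNs are deliberately invalid values.
members = [
    {"member_id": "M001", "ssn": "000-00-0001", "name": "ANN LEE", "dob": "1980-01-05", "address": "1 Main St"},
    {"member_id": "M002", "ssn": "",            "name": "BOB RAY", "dob": "1975-07-20", "address": "9 Oak Ave"},
    {"member_id": "",     "ssn": "000-00-0002", "name": "CAT DOE", "dob": "1990-03-14", "address": ""},
    {"member_id": "M004", "ssn": "000-00-0001", "name": "Ann Lee", "dob": "1980-01-05", "address": "1 Main Street"},
]

def is_blank(value):
    """Treat None and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""

def duplicate_count(rows):
    """Assumed rule: records sharing normalized name + date of birth are duplicates.
    Counts the extra records beyond the first in each matching group."""
    groups = Counter((row["name"].strip().upper(), row["dob"]) for row in rows)
    return sum(n - 1 for n in groups.values() if n > 1)

metrics = {
    "duplicate_members": duplicate_count(members),
    "missing_ssn": sum(1 for row in members if is_blank(row["ssn"])),
    "missing_member_id": sum(1 for row in members if is_blank(row["member_id"])),
    "missing_address": sum(1 for row in members if is_blank(row["address"])),
}
print(metrics)
```

Recording these counts at the start of an iteration gives the baseline against which a completion criterion such as "reduce member duplicates by 10%" can be checked.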
3. Profile
➢ This step determines the exact sources, location, and types of techniques to use to assess DQ:
  ▪ Identify specific tools / techniques to be used.
  ▪ Review initial measures for relevance and accuracy.
  ▪ Verify accuracy of what was intended vs. actual.
    – Analyze data for business rule conformance.
    – Profiling reports are analyzed, and root causes and business impacts are identified and reported.
4. Remediate: Technology & Process
➢ Develop the immediate and ongoing technical architecture and process components required to reduce or eliminate DQ problems.
Process & Standards: Consistent application of process and standards to outline the expectations for data quality across the enterprise.
  ▪ Develop and implement business processes
  ▪ Develop work flows to fix bad data at source
  ▪ Develop and implement data movement controls
Technology: Apply tools to cleanse and standardize data in the ETL process to ensure required levels of quality are met.
  ▪ Use cleansing & standardization tools
  ▪ Develop audit, balance, and control
  ▪ Integrate DQ with Enterprise Information Management program
Diagram labels (data flow): Source Data Files; Extract; Verify; Profile Data; Remediate; Transform; Implement Business Rules; Compliance Calculations & Aggregations; Certify Data; Load; Publish “Certified Data”; Certified Data Store; Reporting Tool.
Diagram labels (supporting): Master & Reference Data; Business Rules; Metadata (Data quality, Data governance, Audit/Balance, Integration, Rationalization, Reconciliation); Category; Sub-Category; Rule; Metric.
5. Sustain
➢ This step covers the culture change, governance, and ongoing support and progress reporting of the DQ effort.
Change Management: Implementation of various culture change management efforts to sustain data quality efforts.
Data Governance: Provides the framework and ongoing oversight to enable effective management.
Figure labels (Users & Risk): Data Sources and Enabling Technology; Transformers; Certified Data Environment; User Population (Data Stewards, Report Generators, Data Users, Problem Resolution, etc.); Market Risk; Operation Risk; Low/High scales.
Figure labels (governance structure): Data Governance; Report & Metrics Governance; Information Quality Governance; Lines of Business; Functional Areas; Subject Areas; CoE; Subject Area Lead (Data Steward); Data Stewards in coordinated team.
Figure labels (ownership matrix): functional areas LOB 1, LOB 2, LOB 3, Finance, Credit, and HR, each with a Data Content Owner; subject areas Customer, Location, Deal, and Invoice; Customer and Deal each show a Data Structure Owner and one Data Steward per functional area; Invoice shows a Data Steward.
Summary
➢ Data quality is a known, “for sure” problem.
➢ Existing processes that create bad data must be addressed. Technology cannot be the only road to a solution.
➢ People:
  ▪ Perceptions of “doing bad things” are inevitable.
  ▪ Manage resistance, politics, priorities.
  ▪ Culture management mandatory.
➢ Technology:
  ▪ Integrate with EIM.
  ▪ Lots of new stuff!
Share Your Experience
Question # 1
How does the data quality program fit into your strategy for information management?
Question # 2
Are you able to produce "one version of the truth" throughout the whole company, or do various versions surface from different areas?
What subject areas are you currently managing in your data quality program?
Question # 3
Are data definitions established at the individual, department, or enterprise level?
Are you leveraging a data governance program for data quality? How?
Question # 4
Describe what impact data quality has on the delivery of business value through analytics and BI.
▪ Tell us how your organization manages data quality and how it responds to data quality issues (as a matter of project work, daily operations, planning, etc.).
▪ Does your organization have ways of measuring or quantifying “poor quality” and the results of poor quality data?
Question # 5
In your organization, how do the various stakeholders around any given data quality project work together?
Question # 6
Have you integrated master data in your DQ program?
What was your approach?
How did it go?
– Successes?
– Lessons learned?
Question # 7
What are your next steps?
New efforts toward data quality?