THE
DATA MANAGEMENT
COOKBOOK
A POCKET GUIDE FOR IMPLEMENTATION
OF DATA MANAGEMENT
Irina Steenbeek
“All roads lead to data.”
Copyright © 2018 Irina Steenbeek. All rights reserved.
Published by Data Crossroads.
www.datacrossroads.com
First edition, 2018.
All rights reserved. This book or any portion thereof may not be reproduced or used in any
manner whatsoever without the express written permission of the publisher except for the
use of brief quotations in a book review.
Book design by Natalia Zhuravska, Dreamalizer.
ISBN: 1984149938
ISBN-13: 978-1984149930
Table of contents
Introduction
Setting up the table
FIND YOUR DRIVER
First course: apéritif
DEFINE YOUR DATA NEEDS
Second course: appetizers
DIVIDE TASKS AND RESPONSIBILITIES
Third course: entree
ORGANIZE YOUR DATA HOUSE
Part 1: Understand your data
Part 2: Locate your data
Part 3: Manage data flow
Part 4: Improve data quality
Fourth course: main course
ASSESS THE GAPS
Fifth course: dessert
KEEP GOING
EXTRA MATERIALS:
Case study: Data Quality
Step-by-step checklist
Introduction
The amount of data available in your company and in the business environment grows exponentially, so keeping that data under control becomes more and more difficult over time.
A lot of companies realize that data is an important asset and has to be managed accordingly. They would also like to get value from data. Everyone wants to be ‘data-driven’ these days. What lies beneath this idea is the wish to make the decision-making process easier and more effective. It means delivering the required data of acceptable quality to the decision makers when and where they need it. In short: a lot of companies understand the vital necessity of proper management of their data. The main question now is: how do you put this into practice?
Knowing the potential of your data and managing it correctly is the key to a successful business. As a result of well-implemented data management, you will be able to reduce risks and costs, increase efficiency, and ensure business continuity and successful growth.
We propose a 5-step system which will guarantee successful implementation of data management.
In this book, we invite you for a five-course dinner. During each course we will explain
the steps of our 5-step system one by one.
The content and the order of the steps are a result of practical experience but, as with any recipe, they are more of a guideline than a list of strict instructions. Tastes (and businesses) differ, so feel free to choose and adjust, make iterations, or even skip steps if that fits your purpose.
A FEW THINGS YOU SHOULD KNOW ABOUT COOKING DATA
1. There are several recognized international guidelines on setting up data management. These are:
• Data Management Body of Knowledge, 2nd edition (DAMA-DMBOK 2), by DAMA International [1];
• Data Management Capability Assessment Model (DCAM), by EDM Council [2];
• The Open Group Architecture Framework, version 9.1 (TOGAF 9.1), by The Open Group [3].
2. The good news is: you do not have to read all of them in detail. Our method contains
the essence of the above-mentioned guidelines. It covers the required feasible minimum of information that you need for an effective implementation of data management in your company.
3. Regardless of their size, most companies are dealing with the same limited list of urgent tasks in the area of data management.
4. You need to define what data management functions are feasible and which match
your company’s profile and requirements. There is no one particular approach that has
been widely accepted by the data management community. You need to find your own
way, with our support, of course!
Good luck, or should we say: ‘bon appétit’!
Setting up the table
FIND YOUR DRIVER
Before diving into the development of a data management function, let us align our
understanding of the basics.
TOGAF 9.1 stipulates that a ‘business function delivers business capabilities closely aligned to the organization’ [4]. It also defines ‘capability’ as ‘an ability that an organization, person, or system possesses. Capabilities…typically…require a combination of organization, people, processes and technology to achieve’ [5].
So the following conclusion derives from these statements: data management is a
business function, and in order for it to operate properly, you need to combine organization, processes, technology, and data all together.
First things first
To activate the growth of your business, you have to understand where exactly you need to focus your energy. The key is to choose the business areas that matter most for overall success.
A data management business function almost certainly already exists in your organization, in some formal or informal shape.
Usually, the idea to set up a formal data management function does not just ‘come up’; it comes from a long process of thinking, analyzing and evaluating the needs of the company. This all leads to a certain point in time when the necessity of setting up or extending a formal implementation of this function cannot be overlooked any longer.
Very often, this realization comes along with some urgent challenges. These challenges are to some extent the business drivers we are talking about.
Choosing the right tableware
Probably you already have a few ideas in mind about what the main drivers for your
company might be, but let us assist you in structuring these thoughts. It is crucial to
have a clear goal before you start ‘cooking’.
1. Create a list of all possible drivers. Think of:
• regulatory compliance (e.g. GDPR requirements);
• data quality issues;
• business intelligence and data warehousing activities;
• big data perspective;
• predictive analytics;
• impact analysis of changes in software and reporting practices;
• business continuity.
2. Investigate the most attractive selling points of your data management plan. These
could be:
• cost reduction;
• risk reduction;
• improvement of organizational efficiency and productivity;
• protection and improvement of the organization’s reputation.
3. Narrow the list down to the one or two most important drivers.
4. Look for ‘sponsors’ for the idea(s) amongst influential stakeholders.
5. Sell your ideas to your management.
After finishing this very first step, you should have a pretty clear idea of the main drivers
and goals for the data management initiative. And hopefully you have already gotten
acquainted with a few main stakeholders, which will come in very handy during the following steps.
First course:
apéritif
DEFINE DATA NEEDS
All business units within your company deal with data in one way or another. They produce data, transform and transport it, and, most importantly, use it for making decisions. In every business, data is the main means of communication.
Ingredients
data stakeholders
data challenges
data requirements
data quality
Preparation
BEFORE YOU START
Every company deals with some issues around data. It
is often difficult to find volunteers who would want to
take on the responsibility for managing this. Uncertainty
in such responsibilities will result in a waste of time and
resources.
INSTRUCTIONS
1. Identify all stakeholders within your company. They
are your main partners in making your data manageable. Usually the main stakeholders are top management,
IT, finance, sales, production, and other business units.
2. Collect and align the requirements of different stakeholders. Do not be surprised if all of you have different
needs and requirements for the data, its quality, the frequency of its delivery and the tools and devices involved
in data delivery.
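A minimal sketch of how collected requirements could be compared to surface the differences mentioned above. The stakeholder names, data items, and delivery frequencies are hypothetical examples, and the structure is an assumption rather than a prescribed format.

```python
from collections import defaultdict

# Hypothetical requirements gathered from stakeholders:
# (stakeholder, data item, requested delivery frequency)
requirements = [
    ("finance", "revenue report", "monthly"),
    ("sales", "revenue report", "weekly"),
    ("top management", "revenue report", "monthly"),
    ("production", "stock levels", "daily"),
]

def conflicting_requirements(requirements):
    """Return the data items for which stakeholders ask for different frequencies."""
    by_item = defaultdict(dict)
    for stakeholder, item, frequency in requirements:
        by_item[item][stakeholder] = frequency
    return {
        item: asks
        for item, asks in by_item.items()
        if len(set(asks.values())) > 1
    }

conflicts = conflicting_requirements(requirements)
for item, asks in conflicts.items():
    print(item, asks)
```

The output is only a starting point for the alignment discussion; deciding which requirement prevails remains a matter for the stakeholders.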
RESULTS
As a result of your efforts you will:
• know who the most influential stakeholders are and how to approach them;
• have improved efficiency of your business partnering communication;
• be able to protect your future business and information needs and requirements.
DELIVERABLES
1. Stakeholders’ map and assessment.
2. Communication approach.
3. Business and data requirements.
It is advisable to revise data requirements at least once a year. There are several reasons for this, but the main one is that almost every year you might receive new requirements from regulators, so your management reports will require regular updates.
You should organize the revision before the start of the budgeting cycle. By this time, you will have better insight into additional investments regarding new technologies, applications, projects, developments of reporting, etc.
Second course:
appetizer
DIVIDE TASKS AND
RESPONSIBILITIES
Data management is a shared responsibility between data professionals and other stakeholders [6]. You need to establish data governance rules, starting with defining the main data management principles.
Ingredients
data management principles
data governance
data owner
shared responsibility
rules, roles, and tasks
Preparation
BEFORE YOU START
The finance department often sees data as their responsibility, although you know now that every other department also has their own data needs. Assigning a data owner and defining a clear set of accountabilities will result in more efficient data processing.
INSTRUCTIONS
1. Set up data management principles. There are a lot of principles which your company can adopt and implement. For data management, you need to choose those which meet your company goals and culture.
2. Agree on governance rules and procedures. This action will allow you to define accountable functions per
specific data management task. Data governance consists of:
• roles and responsibilities;
• data management tasks assigned to specific roles;
• processes and procedures;
• governance bodies.
The step is iterative. During the first stage, when the data management function is not fully designed, you will compile a preliminary list of tasks and allocate responsibilities. Later on you might need to revise this list.
3. Identify the function responsible for data management. Although data management
is a shared responsibility, you still need to decide who will have the ultimate accountability for the coordination of the tasks. There is no agreed vision on the place of the
data management function in the company. Some companies consider it an IT function,
some assign it directly to the CIO (if present), others give the responsibility to the financial business unit. It is your company’s choice.
The most significant challenge for the person in this role is balancing the (often contradicting) interests of different stakeholders and guiding all of them to common success.
4. Identify data-related roles, including the owners’ accountabilities. Aside from ‘data
owners’, you must have heard of ‘system owners’, ‘process owners’, ‘product owners’,
etc. The distinction between such functions is not always clear. It is crucial to align responsibilities of various roles and to assign them to business functions.
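The 'data management tasks vs roles' matrix can be kept as a simple structure and checked for gaps. The sketch below is illustrative only: the tasks, the roles, and the convention of marking one accountable ("A") role per task next to responsible ("R") roles are assumptions, not part of the method itself.

```python
from collections import Counter

# Hypothetical matrix: task -> {role: "A" (accountable) or "R" (responsible)}
task_role_matrix = {
    "define business terms": {"data owner": "A", "data steward": "R"},
    "resolve data quality issues": {"data owner": "A", "system owner": "R"},
    "maintain report catalogue": {"data steward": "R"},
}

def tasks_without_single_accountable(matrix):
    """Return tasks that do not have exactly one accountable ("A") role."""
    flagged = []
    for task, assignments in matrix.items():
        counts = Counter(assignments.values())
        if counts["A"] != 1:
            flagged.append(task)
    return flagged

flagged_tasks = tasks_without_single_accountable(task_role_matrix)
print(flagged_tasks)
```

Tasks flagged this way are the ones to discuss first when you revise the preliminary list of tasks and responsibilities.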
RESULTS
As soon as the data management principles, rules and roles are set up and agreed
upon, you will:
• know who you can approach to discuss your data-related (including data quality)
issues and solutions;
• be aware of all the concerns the main data stakeholders have;
• be sure that all the issues with data will be resolved according to an agreed procedure.
DELIVERABLES
1. Data management principles.
2. Data policy, governance bodies, procedures, roles, responsibilities.
3. Matrix: data management tasks vs roles.
4. Matrix: roles vs functions.
Third course:
entree
ORGANIZE YOUR
DATA HOUSE
Part 1:
UNDERSTAND YOUR DATA
Ingredients
professional language
internal and external communication
business glossary
report flow
Clear communication between your colleagues
and business partners is crucial in any business.
You want to spend less time explaining what
you mean and avoid any unnecessary duplications in your reports. How do you achieve this?
Preparation
BEFORE YOU START
Speaking the same language with your co-workers and
external parties is not as easily achieved as it seems.
Usually there are two issues which you can encounter while communicating on a professional level:
• You use the same term as your colleague does, but
it means something entirely different.
• You use different terms which all mean the same
thing.
This is often the case with internal as well as external communication, and it is also reflected in the quality of the data and reports that are being exchanged. The reports received by the stakeholders might not always be as comprehensible to them as you intend. This, in turn, can have an undesirable effect on their decision-making process.
These issues create problems when reconciling reports and figures from different departments and when building an enterprise data warehouse (DWH) and business intelligence (BI) solution.
INSTRUCTIONS
1. Create a business glossary by analysing:
• company policies;
• regulations relevant for your business;
• main reports.
2. Agree with stakeholders on the definitions of terms.
3. Create a catalogue of main management reports circulating in your company.
4. Create a report flow.
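While building the glossary, the two communication issues described earlier (one term with several meanings, several terms with one meaning) can be detected mechanically once definitions are collected per department. The following is a minimal sketch with hypothetical entries; it compares definitions by exact text only, which is a simplifying assumption, so real definitions would still need human review.

```python
from collections import defaultdict

# Hypothetical glossary entries: (department, term, definition)
entries = [
    ("finance", "customer", "party with a signed contract"),
    ("sales", "customer", "any party that requested a quote"),
    ("sales", "client", "party with a signed contract"),
]

def find_conflicts(entries):
    """Return (same term, different meanings) and (different terms, same meaning)."""
    by_term = defaultdict(set)        # term -> distinct definitions
    by_definition = defaultdict(set)  # definition -> distinct terms
    for _department, term, definition in entries:
        key_term = term.strip().lower()
        key_definition = definition.strip().lower()
        by_term[key_term].add(key_definition)
        by_definition[key_definition].add(key_term)
    same_term = {t: sorted(d) for t, d in by_term.items() if len(d) > 1}
    same_meaning = {d: sorted(t) for d, t in by_definition.items() if len(t) > 1}
    return same_term, same_meaning

homonyms, synonyms = find_conflicts(entries)
print(homonyms)
print(synonyms)
```

Both result lists are input for the agreement with stakeholders on definitions; the code only points to where the discussion is needed.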
RESULTS
By now, you should be:
• aware which and how many reports contain duplicate information;
• able to optimize the number of your reports and remove the duplicate information;
• able to reduce the effort spent on reconciling reports from different departments;
• able to communicate more productively and efficiently with your co-workers and
business partners.
DELIVERABLES
1. Company’s business glossary.
2. Report catalogue.
3. Report flow.
Part 2:
LOCATE YOUR DATA
Ingredients
critical data elements
applications
a ‘golden’ source
Your dream is that pressing one button will
make all required information magically appear on your screen. Unfortunately, such a tool
does not exist yet. It is not about the tooling
though: the key to easy access to data is knowing its exact location.
Preparation
BEFORE YOU START
You have already identified how many reports are circulating in your company. Not all the data in these reports is equally critical to your business. Now it is time to reduce your effort to a feasible minimum by identifying the critical data elements that constitute the reports. It is also time to establish how many systems and applications are in use in your company.
INSTRUCTIONS
The main task is to identify the location of the most important elements within the applications involved in data
processing, and pinpoint the ‘golden’ sources where the
data elements were initially put into processing.
1. Identify critical data elements that have the biggest influence on your work results.
2. Catalogue all of the main applications, services, and interfaces.
3. Map applications and critical data elements.
4. Identify the ‘golden’ sources for critical data elements.
5. Extend the list of data elements and repeat actions 1 to 4.
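Actions 3 and 4 can be sketched as building two small matrices from one catalogue. The application names, data elements, and the `is_entry_point` flag below are hypothetical; the flag marks the application where an element is first put into processing, which is how a ‘golden’ source is described above.

```python
# Hypothetical catalogue: (application, data element, is_entry_point)
catalogue = [
    ("CRM", "customer_name", True),
    ("Billing", "customer_name", False),
    ("Billing", "invoice_amount", True),
    ("DWH", "customer_name", False),
    ("DWH", "invoice_amount", False),
]

def build_matrices(catalogue):
    """Return (applications per element, golden sources per element, unresolved elements)."""
    app_vs_element = {}
    golden_sources = {}
    for app, element, is_entry_point in catalogue:
        app_vs_element.setdefault(element, set()).add(app)
        golden_sources.setdefault(element, set())
        if is_entry_point:
            golden_sources[element].add(app)
    app_vs_element = {e: sorted(a) for e, a in app_vs_element.items()}
    golden_sources = {e: sorted(a) for e, a in golden_sources.items()}
    # Elements with no or several candidate sources need a decision by the data owner.
    unresolved = sorted(e for e, a in golden_sources.items() if len(a) != 1)
    return app_vs_element, golden_sources, unresolved

app_vs_element, golden_sources, unresolved = build_matrices(catalogue)
print(app_vs_element)
print(golden_sources)
print(unresolved)
```

Extending the catalogue with further data elements and rerunning the same step mirrors action 5.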
RESULTS
By now, you should know how to:
• search for the sources of information faster and more efficiently;
• save time on reconciliation of reports as you can acquire data from the ‘golden’
sources;
• decrease the number of processes, reduce duplicate applications, and save money in the process.
DELIVERABLES
1. Catalogues of:
• (critical) data elements;
• applications;
• ‘golden’ sources.
2. Matrices:
• application vs (critical) data elements;
• ‘golden’ sources vs (critical) data elements.
Part 3:
MANAGE THE DATA FLOW
Ingredients
data flow
data transformation
data life cycle
data lineage
Once in a while, your internal audit or external regulators request that you explain how you arrive at certain data in your reports. The same data can lead to different results, depending on the transformations it undergoes along the way. The key to knowing your data is understanding these transformations and the way the data travels from the source to the end user.
Preparation
BEFORE YOU START
You and your colleagues from other departments use
the same data, but sometimes it leads to different outcomes. Who is right? It costs a lot of time and effort to
investigate what causes these issues, but such investigations still happen almost daily. Due to new regulations (e.g., GDPR), the number of such tasks will only grow.
Maintaining data flow information is one of the most complicated data-related tasks that many companies deal with. The bigger the company, the more complex and costly the solution will be. Every company has to find a feasible solution depending on its size and resources.
Success does not lie in the right software solution. The key is to get all staff involved in sharing their information and willing to keep it maintainable.
INSTRUCTIONS
1. Document or find a technical solution for the presentation of the data flow. Data lineage describes the changes that data undergoes from source to its end user. The main
components of data lineage are business processes, application landscape, business
roles, and technical metadata.
2. There are three possible solutions for documenting data lineage:
Solution 1: Data lineage by design.
This requires highly automated software solutions that exist on the market. There are
several providers that offer such solutions.
Solution 2: Descriptive data lineage.
You analyze and describe your business processes, applications, aggregated data, and
controls. The most important challenges are:
• centralization of the information to make it available to all stakeholders;
• involvement of all stakeholders in the process;
• making this process maintainable.
Solution 3: A combination of solutions 1 and 2.
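For descriptive data lineage, even a very small central record of 'what feeds what, and how' makes it possible to answer an auditor's question by tracing a report figure back to its origins. The sketch below assumes a hypothetical record format (target element, source element, description of the transformation); the element names are invented for illustration.

```python
# Hypothetical lineage records: target -> list of (source, transformation description)
lineage = {
    "report.revenue": [("dwh.revenue_eur", "sum per month")],
    "dwh.revenue_eur": [
        ("billing.invoice_amount", "convert to EUR"),
        ("fx.rate", "lookup by invoice date"),
    ],
}

def trace_upstream(element, lineage):
    """Return every (target, source, transformation) step that feeds `element`."""
    steps = []
    seen = set()
    stack = [element]
    while stack:
        target = stack.pop()
        if target in seen:
            continue  # guard against cycles in the documentation
        seen.add(target)
        for source, transformation in lineage.get(target, []):
            steps.append((target, source, transformation))
            stack.append(source)
    return steps

def original_sources(element, lineage):
    """Return the upstream elements that have no documented source of their own."""
    steps = trace_upstream(element, lineage)
    return sorted({source for _target, source, _t in steps if source not in lineage})

steps = trace_upstream("report.revenue", lineage)
origins = original_sources("report.revenue", lineage)
print(steps)
print(origins)
```

The same records could be linked to business processes, roles, and controls; they are left out here to keep the sketch short.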
RESULTS
Making data flow information available will result in:
• saving a lot of time by not having to investigate data quality (and other) issues;
• decreased operational risk due to the decreased number of issues with data quality;
• increased work efficiency of your staff.
DELIVERABLES
The main deliverables will depend on the type of the solution you have chosen. As a
minimum you have to ensure centrally located and well-described documentation that
links the following components of data lineage:
• business processes;
• business roles;
• applications;
• (critical) data elements;
• business rules and controls.
Part 4:
IMPROVE DATA QUALITY
Ingredients
data quality
manual corrections
data quality dimensions
measurement criteria
The saying ‘garbage in, garbage out’ is a precise description of the relationship between
the quality of your data and the information
on which decisions are based. You need to ensure the quality of your data in order to make
your business grow and prosper.
Preparation
BEFORE YOU START
Correcting errors made during data input or processing costs a lot of time and energy: it means extra hours for the staff doing manual adjustments, repeated corrections at least on a monthly basis, etc. You want your staff to focus on the analysis of the data instead of investing their time in endless corrections.
INSTRUCTIONS
1. Define who will be responsible for data quality within business units, unless, of course, you have already done so while setting up data governance (Step 2). If you have, well done: now you can focus on the next points!
2. Define critical data elements for which you will examine and improve data quality.
3. Define the measurement criteria for data quality.
There are several data quality dimensions, such as accuracy, completeness, consistency, timeliness, etc. [7] For the maximum benefit of your company, you need to prioritize the criteria which reflect the needs of your business the most.
4. Define the techniques you wish to use for the root-cause analysis. There can be a lot of
different reasons why your data is of insufficient quality. Some of the common sources
of such issues are:
• data entry process;
• data processing, including manual operations;
• system (mal)functions. [8]
5. Set up a data quality issue log.
6. Start executing internal data quality initiatives.
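Measurement criteria become concrete once each chosen dimension is expressed as a number. The sketch below shows one possible way to score two of the dimensions named above, completeness and consistency, for a single critical data element held in two applications. The records and the exact scoring rules are assumptions for illustration; your own criteria may be defined differently.

```python
# Hypothetical values of one critical data element (customer country) in two applications.
crm = {"C1": "NL", "C2": None, "C3": "DE", "C4": "BE"}
billing = {"C1": "NL", "C2": "FR", "C3": "AT", "C4": "BE"}

def completeness(values):
    """Share of records whose value is present (not None and not empty)."""
    values = list(values)
    if not values:
        return 0.0
    filled = sum(1 for v in values if v not in (None, ""))
    return filled / len(values)

def consistency(left, right):
    """Share of shared keys where both applications hold the same non-missing value."""
    shared = left.keys() & right.keys()
    if not shared:
        return 0.0
    matching = sum(1 for k in shared if left[k] is not None and left[k] == right[k])
    return matching / len(shared)

crm_completeness = completeness(crm.values())
crm_vs_billing = consistency(crm, billing)
print(crm_completeness, crm_vs_billing)
```

Records that fail a check are natural candidates for entries in the data quality issue log.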
RESULTS
Working with data of better quality, you will:
• reduce operational risk;
• improve the decision-making process;
• save resources on reconciliations and fixing reporting issues.
DELIVERABLES
1. A list of data quality responsibilities assigned to business functions.
2. A catalogue of data issues.
3. Data quality issues resolution plan.
4. Data quality issues resolution business process embedded in daily activities.
5. Fine-tuned analysis techniques for data quality issues.
Fourth course:
main course
ASSESS THE GAPS
It is highly probable that some data management functions already exist in your company.
Think, for example, of information security. At
this point you probably have a plan for further
development. Now try to evaluate your current
position. Once you know where you stand, you
can start making changes towards your goal.
Ingredients
business function
data management
capability
gap analysis
roadmap
Preparation
BEFORE YOU START
It is important to be on the same page with your business partners regarding feasibility and necessity of the
changes. Also, you need to know how to achieve required results using minimal resources and effort.
INSTRUCTIONS
You have to make a gap analysis between the current situation and the situation ‘to be’. The gap analysis consists
of the following steps:
1. Define which current business functions in your company are related to data management.
2. Finalize the list of the data management functions
you wish to develop (the revision we were talking about
on p. 13).
3. Find the gaps between ‘now’ and ‘to be’. You can use the checklist at the end of this book (Appendix 2) to assess which data management capabilities are in need of further development.
4. Outline a roadmap to close these gaps.
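The comparison between ‘now’ and ‘to be’ can be sketched as a simple difference between two scored lists. The 0 to 3 maturity scale and the scores below are hypothetical; only the capability names are taken from the checklist in Appendix 2.

```python
# Hypothetical maturity scores per checklist item (0 = absent, 3 = fully operational).
current = {"business glossary": 1, "data quality issue log": 0, "report catalogue": 2}
target = {"business glossary": 3, "data quality issue log": 2, "report catalogue": 2}

def gap_analysis(current, target):
    """Return (capability, current, target, gap) for every capability below target."""
    gaps = []
    for capability, wanted in target.items():
        have = current.get(capability, 0)
        if wanted > have:
            gaps.append((capability, have, wanted, wanted - have))
    # Largest gap first; ties are ordered by name for a stable result.
    return sorted(gaps, key=lambda g: (-g[3], g[0]))

gaps = gap_analysis(current, target)
for capability, have, wanted, gap in gaps:
    print(f"{capability}: {have} -> {wanted} (gap {gap})")
```

The ordered list is one possible input for the roadmap; effort and cost per gap still have to be estimated separately.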
RESULTS
As a result, you and your company’s management will have a clear strategic vision on:
• which data management capabilities your company needs;
• how long it will take to reach the desired situation and what it will cost;
• what your company will gain as the result of this effort.
DELIVERABLES
1. An overview of the required data management functions within your company.
2. A gap analysis between the current and desired future situation.
3. A roadmap.
4. A filled-in checklist for data capabilities (see Appendix 2).
Fifth course:
dessert
KEEP GOING
You already have an approved roadmap and an
idea of what your company can achieve. It is
time to put theory into practice!
Ingredients
maintenance and
development
Preparation
BEFORE YOU START
All the preliminary preparation is done: stop dreaming and start doing.
INSTRUCTIONS
There are various approaches you can choose from. You can work on a project basis, or you can go Agile-style. It
does not matter which approach you will choose, as
long as you keep the following key success factors in
mind:
• Data management needs to be set up as a business function.
• Data management is a shared responsibility: the
staff from different departments need to be involved on a daily basis.
• Top management has to be the main sponsor and
supporter of the implementation.
• Once set up, it requires permanent maintenance
and development.
• Some tasks of data management (e.g. data quality, data flow) are ongoing processes.
• There are not too many widely experienced data management specialists on the
market. It would be wise to train and develop your own staff.
• You need to concentrate on small deliverables that can immediately improve your
current processes and deliver results.
RESULTS
As a result of this last step, you will:
• get a clear vision of the tasks to be done over different time horizons;
• know which deliverables and results you have to request from your staff;
• know when you can expect improvements in your daily work, which investments
and which reduction of costs you need to plan.
DELIVERABLES
The main deliverable is a clear operational model for data management in your company.
Notes
1. DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, 2nd
edition. Technics Publications, 2017.
2. EDM Council. “Data Management Capability Assessment Model, DCAM 1.2.2. (Assessor’s Guide)” EDM Council, 23 Jan. 2018, www.edmcouncil.org/dcam.
3. The Open Group. “TOGAF Version 9.1”, The Open Group Standard no. G116, 2011.
4. TOGAF 9.1, 23.
5. TOGAF 9.1, 23.
6. DAMA-DMBOK 1, 5.
7. DAMA-DMBOK2, 465.
8. DAMA-DMBOK2, 467-469.
Works cited
DAMA International. DAMA Guide to the Data Management Body of Knowledge, 1st
edition. Technics Publications, 2010.
DAMA International. DAMA-DMBOK: Data Management Body of Knowledge, 2nd
edition. Technics Publications, 2017.
EDM Council. “Data Management Capability Assessment Model, DCAM 1.2.2. (Assessor’s Guide)” EDM Council, 23 Jan. 2018, www.edmcouncil.org/dcam.
The Open Group. “TOGAF Version 9.1”, The Open Group Standard no. G116, 2011.
Extra
materials
CASE STUDY: DATA QUALITY
STEP-BY-STEP CHECKLIST
Appendix 1
Case study
DRIVER: DATA QUALITY
Define data needs
1. Define main stakeholders that have concerns regarding data quality (DQ) issues.
2. Identify communication strategies with the stakeholders.
3. Document stakeholders’ data-related needs and requirements.
Divide tasks and responsibilities
4. Define the scope of your company to be involved.
5. Capture data (management) principles as the basis for DQ governance.
6. Make self-assessment of current data management capabilities.
7. Identify required DQ management processes, procedures, tasks, and roles.
8. Develop or adjust data management roadmap, strategy, policy with regard to
DQ tasks.
Organize your data house
9. List reports that should be in scope.
10. Identify critical data domains and elements.
11. List reports involved in the scope.
12. Create a DQ issues log.
13. Specify definitions of critical data elements by putting them into the business glossary and/or by creating conceptual or logical data models.
14. Prepare an action plan for DQ issues resolution according to the DQ governance
procedures.
15. In order to execute root-cause analysis, you need to document data lineage for the
critical data elements. This process can be broken down into the following steps:
a. identifying related business processes;
b. creating a catalogue of applications;
c. identifying location of data elements in applications by documenting database
metadata;
d. documenting business rules;
e. analyzing existing business processes and data quality controls.
16. Identify data quality requirements of different data users.
17. Align your activities with main stakeholders, especially IT.
18. Clean historical data if needed.
19. Develop data quality checks and controls, based on DQ requirements.
20. Adjust existing data processing flows.
Assess the gaps
21. Re-assess the initial plans through gap analysis.
22. Based on gap analysis, verify the feasibility of the chosen approach.
Take action
23. Realize your strategy for fit-for-purpose data delivery with the required level of
quality.
Appendix 2
Checklist
DRIVER
Regulatory compliance, e.g. GDPR
Data quality issues
Implementation of advanced analytics techniques
Improvement of business and financial planning and forecasting
Other,
DATA NEEDS
Stakeholder map
Stakeholder assessment
Stakeholder communication approach
Business needs and requirements
Data needs and requirements
TASKS & RESPONSIBILITIES
Data management principles
Data policy
Data governance roles and responsibilities
Data governance procedures
Data management tasks vs roles
Data roles vs organizational functions
DATA HOUSE
Company business glossary
Report catalogue
Report flow
Catalogue of (critical) data elements
Data models (conceptual and logical)
Catalogue of applications
Data dictionary (physical data model)
Catalogue of ‘golden’ sources
Matrix application vs (critical) data element
Matrix ‘golden’ source vs (critical) data element
Descriptive data lineage (including business processes, roles, applications, data elements, business rules and controls)
List of data quality responsibilities assigned to business functions
Catalogue of data issues
Data quality resolution plan
Data quality issues resolution process
Fine-tuned data quality issues analysis techniques
THE GAPS
Gap analysis between existing and desired (future) data management
functions and tasks
Roadmap on data management development
FINAL PRODUCT
Implemented operational data management function
A lot of companies realize that data is an invaluable asset and has to be managed accordingly. They would also like to get value from data. Everyone wants to be ‘data-driven’ these days. What lies beneath this idea is the wish to make the decision-making process easier and more effective. It means delivering the required data of acceptable quality to the relevant decision makers when and where they need it. In short: a lot of companies need to manage their data properly. The main question is: how do you put this into practice?
Knowing the potential of your data and managing it correctly is the key to an effective and successful business. As a result of well-implemented data management, you will be able to reduce risks and costs, increase efficiency, and ensure business continuity and successful growth.
In this book, we invite you for a five-course dinner. During each course we will explain the steps of our 5-step programme which guarantees successful implementation of data management.
ABOUT THE AUTHOR
Dr. Irina Steenbeek is a dedicated and tenacious Senior Data Management, Finance and
IT Professional with 15+ years of extensive
experience. Her areas of expertise are data
management, software implementation, financial and business control, project management, business process re-engineering,
and management consulting and training.
Throughout the years, she has worked for various medium and large multinational organizations, among which The World Bank, ABN
AMRO Bank, Amsterdam Trade Bank, and International Card Services (ICS).
In 2016 she founded Data Crossroads, a consulting agency in the area of data
management and predictive analytics. She has developed several models for implementation of data management which are based on industry reference guidelines
and are universal for every business. Her approach is highly customizable, and ensures effective results in any type of organization. Data Crossroads connects experts within various industries in order to ensure the highest quality of consultation
to all clients.