Big Data: Impacts and Benefits
About ISACA®
With more than 100,000 constituents in 180 countries, ISACA® (www.isaca.org) is a leading global provider of knowledge,
certifications, community, advocacy and education on information systems (IS) assurance and security, enterprise
governance and management of IT, and IT-related risk and compliance. Founded in 1969, the nonprofit, independent
ISACA hosts international conferences, publishes the ISACA® Journal, and develops international IS auditing and control
standards, which help its constituents ensure trust in, and value from, information systems. It also advances and attests IT
skills and knowledge through the globally respected Certified Information Systems Auditor® (CISA®), Certified Information
Security Manager® (CISM®), Certified in the Governance of Enterprise IT® (CGEIT®) and Certified in Risk and Information
Systems Control™ (CRISC™) designations.
ISACA continually updates and expands the practical guidance and product family based on the COBIT® framework.
COBIT helps IT professionals and enterprise leaders fulfill their IT governance and management responsibilities,
particularly in the areas of assurance, security, risk and control, and deliver value to the business.
Disclaimer
ISACA has designed and created Big Data: Impacts and Benefits (the “Work”) primarily as an educational resource for
governance and assurance professionals. ISACA makes no claim that use of any of the Work will assure a successful
outcome. The Work should not be considered inclusive of all proper information, procedures and tests or exclusive of other
information, procedures and tests that are reasonably directed to obtaining the same results. In determining the propriety
of any specific information, procedure or test, governance and assurance professionals should apply their own professional
judgment to the specific circumstances presented by the particular systems or information technology environment.
Reservation of Rights
© 2013 ISACA. All rights reserved. No part of this publication may be used, copied, reproduced, modified, distributed,
displayed, stored in a retrieval system or transmitted in any form by any means (electronic, mechanical, photocopying,
recording or otherwise) without the prior written authorization of ISACA. Reproduction and use of all or portions of this
publication are permitted solely for academic, internal and noncommercial use and for consulting/advisory engagements,
and must include full attribution of the material’s source. No other right or permission is granted with respect to this work.
ISACA
3701 Algonquin Road, Suite 1010
Rolling Meadows, IL 60008 USA
Phone: +1.847.253.1545
Fax: +1.847.253.1443
Email: info@isaca.org
Web site: www.isaca.org
Provide feedback: www.isaca.org/Big-Data-WP
Participate in the ISACA Knowledge Center: www.isaca.org/knowledge-center
Follow ISACA on Twitter: https://twitter.com/ISACANews
Join ISACA on LinkedIn: ISACA (Official), http://linkd.in/ISACAOfficial
Like ISACA on Facebook: www.facebook.com/ISACAHQ
Acknowledgments
ISACA wishes to recognize:
Project Development Team
Richard Chew, CISA, CISM, CGEIT, Emerald Management Group, USA
Keith Genicola, KPMG LLP, USA
Brian Li, Ernst & Young LLP, CFE, CMCON, USA
Jothi Philip, CISA, ACA, CISSP, Bank of England, UK
Tichaona Zororo, CISA, CISM, CGEIT, CRISC, CIA, EGIT | Enterprise Governance of IT (PTY) Ltd., South Africa
Expert Reviewers
Joanne De Vito De Palma, BCMM, The Ardent Group LLC, USA
Russell Fairchild, CISA, CRISC, CISSP, PMP, SecureIsle, USA
Rammiya Perumal, CISA, CISM, CRISC, Sumitomo Mitsui Bank, USA
Lily M. Shue, CISA, CISM, CGEIT, CRISC, CCP, LMS Associates LLP, USA
ISACA Board of Directors
Gregory T. Grocholski, CISA, The Dow Chemical Co., USA, International President
Allan Boardman, CISA, CISM, CGEIT, CRISC, ACA, CA (SA), CISSP, Morgan Stanley, UK, Vice President
Juan Luis Carselle, CISA, CGEIT, CRISC, Wal-Mart, Mexico, Vice President
Christos K. Dimitriadis, Ph.D., CISA, CISM, CRISC, INTRALOT S.A., Greece, Vice President
Ramses Gallego, CISM, CGEIT, CCSK, CISSP, SCPM, Six Sigma Black Belt, Dell, Spain, Vice President
Tony Hayes, CGEIT, AFCHSE, CHE, FACS, FCPA, FIIA, Queensland Government, Australia, Vice President
Jeff Spivey, CRISC, CPP, PSP, Security Risk Management Inc., USA, Vice President
Marc Vael, Ph.D., CISA, CISM, CGEIT, CRISC, CISSP, Valuendo, Belgium, Vice President
Kenneth L. Vander Wal, CISA, CPA, Ernst & Young LLP (retired), USA, Past International President
Emil D’Angelo, CISA, CISM, Bank of Tokyo-Mitsubishi UFJ Ltd. (retired), USA, Past International President
John Ho Chi, CISA, CISM, CRISC, CBCP, CFE, Ernst & Young LLP, Singapore, Director
Krysten McCabe, CISA, The Home Depot, USA, Director
Jo Stewart-Rattray, CISA, CISM, CGEIT, CRISC, CSEPS, BRM Holdich, Australia, Director
Knowledge Board
Marc Vael, Ph.D., CISA, CISM, CGEIT, CRISC, CISSP, Valuendo, Belgium, Chairman
Rosemary M. Amato, CISA, CMA, CPA, Deloitte Touche Tohmatsu Ltd., The Netherlands
Steven A. Babb, CGEIT, CRISC, Betfair, UK
Thomas E. Borton, CISA, CISM, CRISC, CISSP, Cost Plus, USA
Phil J. Lageschulte, CGEIT, CPA, KPMG LLP, USA
Jamie Pasfield, CGEIT, ITIL V3, MSP, PRINCE2, Pfizer, UK
Salomon Rico, CISA, CISM, CGEIT, Deloitte LLP, Mexico
Guidance and Practices Committee
Phil J. Lageschulte, CGEIT, CPA, KPMG LLP, USA, Chairman
Dan Haley, CISA, CGEIT, CRISC, MCP, Johnson & Johnson, USA
Yves Marcel Le Roux, CISM, CISSP, CA Technologies, France
Aureo Monteiro Tavares Da Silva, CISM, CGEIT, Vista Point, Brazil
Jotham Nyamari, CISA, Deloitte, USA
Connie Lynn Spinelli, CISA, CRISC, CFE, CGMA, CIA, CISSP, CMA, CPA, BKD LLP, USA
Siang Jun Julia Yeo, CISA, CPA (Australia), Visa Worldwide Pte. Limited, Singapore
Nikolaos Zacharopoulos, CISA, DeutschePost–DHL, Germany
ISACA and IT Governance Institute® (ITGI®) Affiliates and Sponsors
Information Security Forum
Institute of Management Accountants Inc.
ISACA chapters
ITGI France
ITGI Japan
Norwich University
Socitum Performance Management Group
Solvay Brussels School of Economics and Management
Strategic Technology Management Institute (STMI) of the National University of Singapore
University of Antwerp Management School
ASIS International
Hewlett-Packard
IBM
Symantec Corp.
Introduction
Big data is both a marketing and a technical term referring to a valuable enterprise asset—information. Big data
represents a trend in technology that is leading the way to a new approach in understanding the world and making
business decisions. These decisions are made based on very large amounts of structured, unstructured and complex
data (e.g., tweets, videos, commercial transactions) which have become difficult to process using basic database and
warehouse management tools. Managing and processing the ever-increasing data set requires running specialized
software on multiple servers. For some enterprises, big data is counted in hundreds of gigabytes; for others, it is in
terabytes or even petabytes, with a frequent and rapid rate of growth and change (in some cases, almost in real time).
In essence, big data refers to data sets that are too large or too fast-changing to be captured, managed, processed and
analyzed within a reasonable elapsed time using traditional relational or multidimensional database techniques or
commonly used software tools.
According to COBIT® 5, information is effective if it meets the needs of the information consumer (who is considered
a stakeholder). In the case of big data, the enterprise is the stakeholder, and one of its primary stakes is information
quality. The stakes can be related to information goals in the COBIT 5 enabler model, which divides them into three
subdimensions of quality, described later in this white paper. The better the quality of the data, the better the decisions
based on the data—ultimately creating value for the enterprise. Therefore, big data management must ensure the quality
of the data throughout the data life cycle.
Data are collected to be analyzed to find patterns and correlations that may not be initially apparent, but may be useful
in making business decisions. This process is called big data analytics. These data are often personal data that are useful
from a marketing perspective in understanding the likes and dislikes of potential buyers and in analyzing and predicting
their buying behavior. Personal data can be categorized as:
• Volunteered data—Created and explicitly shared by individuals (e.g., social network profiles)
• Observed data—Captured by recording the actions of individuals (e.g., location data when using cell phones)
• Inferred data—Data about individuals based on analysis of volunteered or observed information (e.g., credit scores)
The primary objective of analyzing big data is to support enterprises in
making better business decisions. Data scientists and other users analyze large
amounts of transaction data as well as other data sources that may be ignored
by traditional business intelligence software, such as web server logs, social
media activity reports, cell phone records and data obtained via sensors. Data
analytics can enable a targeted marketing approach that gives the enterprise
a better understanding of its customers—an understanding that will influence
internal processes and, ultimately, increase profit, which provides the
competitive edge most enterprises are seeking.
This white paper provides an overview of the impact that big data collection and analytics can have on an enterprise.
It identifies potential business benefits, challenges, risk, governance and risk management practices, and provides an
overview of relevant assurance considerations related to big data analytics.
Impact of Big Data on the Enterprise
Big data can impact current and future process models in many ways. Beyond a business impact, the aggregation of data
can affect governance and management over planning, utilization, assurance and privacy:
• Governance—What data should be included and how should governance of big data be defined and delivered?
(These topics are explored later in this white paper.)
• Planning—Planning involves the process of collecting and organizing outcomes to:
– Justify process adjustments or improvements that, until recently, could be identified only by using specialized research
techniques such as predictive modeling.
– Design a trading program predicated on certain conditions that trigger events.
– Encourage target purchase patterns while a buyer is researching products and services.
– Use location-based information in combination with other collected data to guide customer loyalty, route traffic,
identify new product demands, etc.
– Manage just-in-time (JIT) inventory based on seasonal or demand changes. For example, a manufacturing enterprise
may adjust production levels for a particular item after the part number is not ordered for two consecutive days.
– Manage operations of logistics and transportation firms based on real-time performance.
– Manage unplanned IT infrastructure and policy changes that disrupt the direction of IT support.
• Utilization—Use of big data can vary from one enterprise to another depending on the enterprise’s culture and
maturity. A small enterprise may be slower to adopt big data because it may not have the necessary infrastructure to
support the new processes involved. Companies such as IBM®, Hewlett-Packard Company (HP) and Amazon.com®,
on the other hand, have changed direction over the last few years from selling products to providing services and using
information to guide business decisions. Companies that have embraced big data have made the necessary investments
to become information mavens capable of identifying new product and service demands using data mining—
information that they then turn into a competitive advantage by being the first to market.
Infrastructures built to support big data are also cross-marketed to support cloud computing services, in a way making
customers business partners (causing the rise of phrases such as “frenemies” and “coopetition”). In other words, big
data customers may be competitors in one geometric plane and cooperative partners in another, as with Netflix using
the Amazon.com cloud infrastructure to support its media streaming.
• Assurance—Experience leads enterprises to develop better assurance practices. Once leadership develops a strategy
that leverages big data, the enterprise can focus on defining an assurance framework to control and protect big
data. The main concern for the assurance organization is data quality, addressed by topics such as normalization,
harmonization and rationalization. (These topics are technical and pertinent to publications on tools and techniques, and
are not covered in this white paper.)
• Privacy—Privacy protection has always been handled differently by
geographic regions, governments and enterprises. Laws protect the privacy of individuals and any information collected
about them, even if people share confidential information inappropriately, for example, posting nonpublic or private
information (e.g., pictures of credit cards, birthdays, phone numbers, personal preferences) in social media outlets.
Regardless of the authenticity of information collected from social media, its collection requires protection from
nefarious users as well as over-controlling governments.
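The just-in-time inventory example under Planning (adjusting production after a part number goes unordered for two consecutive days) can be expressed as a simple rule. The following Python fragment is a minimal sketch only; the function name, the part numbers and the shape of the order data are hypothetical and are not taken from any particular inventory system.

```python
from datetime import date, timedelta


def parts_to_review(order_dates, parts, today, idle_days=2):
    """Return the parts that received no orders on any of the last `idle_days` days.

    order_dates: dict mapping a part number to the set of dates on which it was ordered
                 (hypothetical structure, assumed for illustration).
    parts:       list of part numbers currently in production.
    today:       the day on which the check runs; earlier days are inspected.
    """
    recent_days = {today - timedelta(days=n) for n in range(1, idle_days + 1)}
    flagged = []
    for part in parts:
        ordered_on = order_dates.get(part, set())
        # A part is flagged only if none of the recent days saw an order.
        if not (ordered_on & recent_days):
            flagged.append(part)
    return flagged


# Illustrative data only.
orders = {
    "P-100": {date(2013, 3, 4), date(2013, 3, 5)},
    "P-200": {date(2013, 3, 1)},
}
flagged = parts_to_review(orders, ["P-100", "P-200", "P-300"], today=date(2013, 3, 6))
print(flagged)  # ['P-200', 'P-300']
```

In practice the two-day window would be a tunable parameter, and a flagged part would likely trigger a review of production levels rather than an automatic change.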
Business Benefits of Big Data
Big data opportunities are significant, as are the challenges. Enterprises that
master the emerging discipline of big data management can reap significant
rewards and differentiate themselves from their competitors. Indeed, research
conducted by Erik Brynjolfsson, an economist at the Sloan School of
Management at the Massachusetts Institute of Technology (USA), shows that
companies that use “data-directed decision making” enjoy a five to six percent
boost in productivity.1 Proper use of big data goes beyond collecting and
analyzing large quantities of data; it also requires understanding how and
when to use the data in making crucial decisions.
Competitive advantage can be greatly improved by leveraging the right data. According to a research report by McKinsey,2
the potential value from data in the US health care sector could be more than US $300 billion every year,
two-thirds of which would be in the form of reducing national health care expenditures by approximately eight percent.
Financial benefits can be realized when data management processes are aligned with the enterprise’s strategy, which may
require top management involvement to set direction and oversee major decisions.
Big data analytics can positively impact:
• Product development
• Market development
• Operational efficiency
• Customer experience and loyalty
• Market demand predictions
The process for accessing organization-specific commercial insights from big data is shown in figure 1.
Figure 1—Addressing Organization–specific Commercial Insights
Figure elements: data sources (Any Data; Exhaust Data; social media, enterprise records, Data as a Service [DaaS],
competitor data; vast amounts of information collected from every imaginable source); process labels (Acquire, Organize,
Analyze, Discover, Predict, Plan); business benefits (Better Decisions, Faster Action, Greater Innovation, Stronger
Competitive Advantage)
1 Swalwell, John; “Big Data and Intelligent Image Capture Platforms,” Technology First, USA, August 2012
2 Manyika, James; Michael Chui; Brad Brown; Jacques Bughin; Richard Dobbs; Charles Roxburgh; Angela Hung Byers; “Big data: The next frontier for innovation,
competition, and productivity,” McKinsey Global Institute, McKinsey & Company, USA, May 2011
Should the enterprise pursue big data wholeheartedly or start small with target opportunities? Buy or outsource? These
are strategic choices that should be made based on the strategic goals and existing capabilities of each enterprise. For
enterprises ready to turn big data from a revenue-hemorrhaging liability into a revenue-enhancing asset, a four-tier plan
is proposed:
1. Take time to strategize—Work with key stakeholders and business units to understand their data needs. Incorporate
their feedback to improve processes across the business.
2. Think analytically—Improve the analytical support team and ensure that managers have the applications and access
they need to examine business-critical information firsthand.
3. Ask for what is needed—Leverage industry-specific applications and software, where available. If needs are not
being met, alert the management team and/or industry suppliers.
4. Invest to improve—Arm the enterprise with the appropriate technology, staff and systems/processes needed to
optimize information for true business intelligence.
Risks and Concerns With Big Data
Enterprises are investing considerable capital to develop and deploy big data analytics and measurement to obtain an early
competitive advantage. Although big data can supply a competitive advantage and other benefits, it also carries significant
risk. Now that enterprises have huge amounts of structured and unstructured data available, management should be asking:
• Where should we store the data?
• How are we going to protect the data?
• How are we going to utilize the data safely and lawfully?
In the following section, risk and concerns associated with big data are highlighted.
The concept of big data risk management is still at the infancy stage for many enterprises, and security policies and
procedures are still developing in many areas. Numerous business executives might not recognize that the faster and
easier it is to access big data, the greater the risk to all of that valuable information. For the data to be utilized
productively, executives must pay special attention to corporate data life cycle processes; big data insights are only as
good as the data themselves. According to the COBIT 5 information enabler, the full life cycle of information needs to be
considered and different approaches may be necessary, depending on the phase within the life cycle. The COBIT 5
information enabler identifies four different phases (i.e., plan, design, build/acquire and use/operate).
Inaccurate, incomplete or fraudulently manipulated data pose an increasing risk as enterprises become more dependent
on the data to drive decision making and assess results.
The need to manage data risk within the enterprise may not be clearly communicated and understood at all management
levels. It is essential to point out that addressing big data risk and concerns cannot be seen exclusively as an information
technology exercise. Participation from the entire enterprise, including legal, finance, compliance, internal audit and
other business departments, allows everyone to focus on the business goals in the planning stage. Enterprises can then
focus on both the technical and business aspects of big data.
At times enterprises may resist periodic reviews of big data strategies and security policies and procedures because
top management believes that the current practice is “sufficient” and is reluctant to spend more if it is not “necessary.”
This philosophy, however, is inaccurate. Security and privacy play an increasingly important role in big data, and all
stakeholders should be aware of the implications of storing and cross-analyzing large amounts of sensitive, disparate
data. Furthermore, it is imperative to understand that some data should be considered “toxic” in the sense that loss of
control over these data could be damaging to the enterprise. Examples of potentially “toxic” data are:
• Private or custodial information such as credit card numbers, personally identifiable information such as Social Security
numbers, and personal health information
• Strategic information such as intellectual property, business plans and product designs
• Information such as key performance indicators, sales figures, financial metrics and production metrics used to make
critical decisions
Data vulnerabilities are especially acute for enterprises that rely on personal data that are generated or can be modified
by the public. For instance, social media data can be a highly valuable source for assessing customer sentiment, tracking
the effectiveness of marketing campaigns and learning more about consumers. However, utilizing this type of personal
data will require addressing current uncertainties and points of tension:
• Privacy—Individual needs for privacy vary. Policy makers face a complex challenge while developing legislation
and regulations.
• Global governance—There is a lack of global legal interoperability, with each country evolving its own legal and
regulatory frameworks.
• Personal data ownership—The concept of property rights is not easily extended to data, creating challenges in
establishing usage rights.
• Transparency—Too much transparency too soon poses as much risk of destabilizing the personal data
ecosystem as too little transparency.
• Value distribution—Even before value can be shared more equitably, more clarity is required on what truly constitutes
value for each stakeholder.
To minimize the potential for damages resulting from inaccurate or fraudulent
data, enterprises should take inventory of all the data sources they are pulling
into their analyses and assess each source for vulnerabilities. Are the data
publicly generated? Who has access to the data at any point before they
enter the analysis? Are there incentives to manipulate the data? In the
case of vulnerable data sources, classification techniques can be employed
to detect potentially fraudulent data points and remove them prior to
further dissemination.
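The source inventory described above can be kept as a small structured record per data source, with one field for each of the three questions. The Python fragment below is a minimal sketch under stated assumptions: the class name, field names, example sources and the threshold of three parties are all hypothetical, chosen only to illustrate the idea, and it does not implement the classification techniques mentioned for detecting individual fraudulent data points.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    publicly_generated: bool       # Are the data publicly generated?
    pre_analysis_parties: int      # How many parties can access the data before analysis?
    manipulation_incentive: bool   # Are there incentives to manipulate the data?


def vulnerability_flags(source, max_parties=3):
    """Return the reasons a source deserves closer scrutiny.

    `max_parties` is an assumed policy threshold, not a recommended value.
    """
    flags = []
    if source.publicly_generated:
        flags.append("publicly generated")
    if source.pre_analysis_parties > max_parties:
        flags.append("broad pre-analysis access")
    if source.manipulation_incentive:
        flags.append("incentive to manipulate")
    return flags


# Illustrative inventory only.
inventory = [
    DataSource("social_media_feed", True, 10, True),
    DataSource("erp_transactions", False, 2, False),
]
report = {source.name: vulnerability_flags(source) for source in inventory}
for name, flags in report.items():
    print(name, "->", flags or ["no flags raised"])
```

Sources that raise one or more flags would be the natural candidates for additional screening before their data enter the analysis.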
Strategies for Addressing Big Data Risk
The main strategy for addressing risk is aligning the technology solution to business needs. The COBIT 5 framework
addresses this in the goals cascade by aligning stakeholder drivers and stakeholder needs. These needs cascade to the
enterprise goals, then to the IT-related goals, and ultimately to the enabler goals. There are seven enablers that should be
applied to assist the enterprise in addressing risk and improving its ability to meet its business objectives and create value
for its stakeholders.
When new initiatives, such as adoption of big data, are properly aligned to the business, existing governance structures
can be easily adjusted to address security, assurance and a general approach to embracing new technologies. These steps
should include building the talent base, invoking alignment of information security concerns related to big data, and
starting pilot programs to determine whether the need is to build internally or consume benefits of prior big data wisdom.
The COBIT 5 people, skills and competencies enabler, which suggests that the enterprise should know what its current
skill base is and plan what it needs to be, will be helpful in building the talent base.
Building the talent base internally is a cornerstone of best practice. Who can understand enterprise culture,
processes and the behavior of enterprise data better than staff? Power users and their tools are an excellent start to:
• Determine what internal resources and capacities are available to digest existing information.
• Determine what tools are needed to enhance the information acquisition and digestion process.
• Address how information will be used to achieve both tactical and strategic goals, if the determination is made that new
and/or different information is needed.
• Develop or obtain training programs for the team.
• Determine whether a data scientist is needed.
• Establish realistic expectations and create a tactical plan.
Integrating big data analytics into business risk management and security operations is not an easy task. While big data
in general has transformed competitive dynamics in an enterprise, it has also transformed the enterprise’s information
security programs, including how the security programs are developed and executed. It is prudent to set expectations with
stakeholders at every step of the journey. This helps mitigate the risk of losing focus on the “shared vision” of strategic
business alignment.
Risk can also be mitigated by ensuring the quality of the data. The COBIT 5 information enabler guides the enterprise
through the information cycle by suggesting that business processes generate and process data, converting them into
information and knowledge, and ultimately producing value for the enterprise by delivering quality data. The information
enabler also lays out the approach by suggesting that the first step is to identify stakeholders as well as their stakes
(i.e., why they care or are interested in the information). The stakes can be related to information goals. Goals of
information are divided into three subdimensions of quality (figure 2).
Figure 2—Data Quality Subdimensions
Intrinsic Quality
• Accuracy
• Objectivity
• Believability
• Reputation
Contextual and Representational Quality
• Relevancy
• Completeness
• Currency
• Appropriate amount of information
• Concise representation
• Consistent representation
• Interpretability
• Understandability
• Ease of manipulation
Security/Accessibility Quality
• Availability/timeliness
• Restricted access
Choosing a partner is a major step toward deciding what processes are to
be embraced eventually. It is the “make or buy” decision of every facet of
the journey from training and information protection, to pilot project and
intellectual property transfer. The immediate adoption of outsourcing denies
an enterprise the intellectual property it needs to partner, manage and control
the big data journey. Every enterprise should, at a minimum, experience some
facets of big data to gain knowledge and expertise for future reference. Big data
may change the way enterprises do business, and it will affect their business,
culture and processes. It should also be a catalyst for how the enterprise selects
and changes partners.
Selection is a critical first step and can incorporate several strategies in addition to selection of the big data vendor:
• It can result in a strategic alliance with one or more big data technology providers.
• It can ensure that training classes are taught by practitioners, not by those who cannot answer fundamental questions,
and that training infrastructures supporting hands-on interaction are used.
• It can ensure that course information is shared with, and thoroughly reviewed by, the big data team.
• The pilot project can encompass the instructor and the company big data team, in recognition that the project is really a
work in progress.
• Third-party processes, project management and goals can be aligned to enterprise goals and expertise.
• Stakeholders in business and risk management can be involved to ensure that appropriate controls are in place with the
third-party vendor/partner.
Once a company knows what it wants, it must determine how to obtain the information it needs. A data broker is a
possible source. Some companies already in the business of brokering information about enterprises include Bloomberg,
Thomson Reuters, Simmons Market Research and The Nielsen Company.
If the enterprise elects to build, it must decide:
• Whether to use a broker
• Whether to use a training partner for the project
• Whether to take small steps or giant leaps of faith as it acquires terabytes
• Which partners are available
• What the project deliverables should be
Project documentation should be a deliverable to:
• Prevent vendor/partner lock-in.
• Demonstrate ownership of intellectual property.
Governance for Big Data
Governance ensures that stakeholders’ needs,
conditions and options are evaluated to determine
balanced, agreed-on enterprise objectives to be
achieved. It further supports setting direction through
prioritization and decision making, and monitoring
performance and compliance against agreed-on
direction and objectives. The scope of an enterprise’s
governance, risk and compliance would most likely be
expanded to create a unified system to consolidate silos
and business functions to enable access of all the data.
Figure 3—End-to-end Governance
Figure elements: Governance Objective: Value Creation (Benefits Realisation, Risk Optimisation, Resource Optimisation);
Governance Enablers; Governance Scope; Roles, Activities and Relationships
Source: COBIT 5, ISACA, USA, 2012, figure 8
The end-to-end governance approach that is at the foundation of COBIT 5 is depicted in figure 3,
showing the key components of a governance system.
Without a proper data governance process, big data projects can unleash
a lot of trouble, including misleading data and unexpected costs. The role
of data governance in keeping the big data house in order is just starting to
be understood given the relatively recent emergence of the technology and
its allocation to the IT department. Consequently, governance of big data
environments is at an early stage of maturity and there are few widespread
prescriptions for how to do it effectively. One fundamental problem is that
pools of big data are oriented more to data exploration and discovery than they
are to conventional business intelligence reporting and analysis.
Data governance programs provide a framework for setting data-usage policies
and implementing controls designed to ensure that information remains
accurate, consistent and accessible. Clearly, a significant challenge in the
process of governing big data is categorizing, modeling and mapping the
data as they are captured and stored, particularly because of the unstructured
nature of much of the information. Data often come from external sources,
and accuracy cannot always be easily validated; also, the meaning and
context of text data are not necessarily self-evident. For many enterprises,
big data involves a collective learning curve for all concerned: IT managers,
programmers, data architects, data modelers and data governance professionals.
To help ensure that the data are mapped properly, the task should be assigned to a senior data architect whose experience
and IT background will prove invaluable in this complex activity.
During the exploratory phase of big data projects, which defines expected business value and leads to formal initiatives,
enterprises should consider the fundamental questions (as articulated by IBM) within information management:
• Do we fully recognize the responsibilities associated with handling big data?
• How does big data change the traditional concept of information as a corporate asset?
• What are the emerging requirements around privacy?
• How do the big data technologies relate to our current IT infrastructure?
The discussion surrounding big data may raise more questions for the chief information officer (CIO) than he/she is
prepared to answer. Many enterprises justify the lack of adequate governance policies by claiming that big data
is somehow “different,” which sidesteps the issue. Simply stated, as big data technologies become operational
(as opposed to exploratory), they need the same governance disciplines that are applied to traditional approaches to data
management.
When implementing an information governance program, the current (as-is) state should be assessed and the future
(to-be) state should be developed. COBIT 5 can help the enterprise address this task and others inherent in governing
big data, ultimately guiding the enterprise’s efforts to create value by striking a balance between realizing benefits and
maintaining risk at an acceptable level.
Assurance Considerations for Big Data
Controls around big data can be grouped into four categories:
• Approach and understanding
• Quality
• Confidentiality and privacy
• Availability
Approach and Understanding
This category addresses demonstrating the right tone at the top of the
enterprise. A critical facet in this effort is the establishment and implementation
of a data policy. The policy (and associated procedures) should define the data
in scope; establish a system of governance and assurance over data quality;
and identify qualitative and quantitative criteria to assess accuracy, reliability,
completeness and timeliness of data. Taking an inventory of all data sources,
assessing vulnerabilities, and implementing policies and procedures will most
certainly cost the enterprise time and money. Such costs are necessary when
managing risk and should be considered the cost of doing business.
The assurance process should begin by creating an inventory of the data.
After the inventory, the data should be classified for sensitivity and relevance
and a data flow created. A process should then be developed to identify
vulnerabilities in the data flow, an activity that begins with creation of a
multidimensional data flow diagram supported by a data dictionary³ that maps
the data landscape across the enterprise. This process should capture internal
and external sources of data, the various automated and manual processes
(e.g., transformation, aggregation) performed on each data set, and their
ultimate destination and use. Each vulnerability identified should be entered
into an established data deficiency governance process for analysis of impact
and probability, escalation to senior management where necessary,
and a strategic or tactical resolution. In addition, each vulnerability needs
an owner: someone who is responsible for the data.
Materiality criteria should be established that will enable those responsible for data governance to identify the most
relevant data sets and items on which to focus their efforts. This process will also help create an escalation path for data
deficiency management.
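As an illustration only, the analysis of impact and probability and the materiality-based escalation path described above could be sketched as a simple register. The 1-to-5 scales, the threshold value, and the names Vulnerability, ESCALATION_THRESHOLD and triage are assumptions made for this example; they are not prescribed by this paper or by COBIT 5.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    description: str
    owner: str        # person responsible for the affected data
    impact: int       # assumed scale: 1 (minor) to 5 (severe)
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)

    @property
    def score(self) -> int:
        # Simple impact x probability rating; other schemes are possible.
        return self.impact * self.probability


ESCALATION_THRESHOLD = 15  # hypothetical materiality cut-off


def triage(register):
    """Split the register into (escalate, handle_locally), highest score first."""
    ranked = sorted(register, key=lambda v: v.score, reverse=True)
    escalate = [v for v in ranked if v.score >= ESCALATION_THRESHOLD]
    local = [v for v in ranked if v.score < ESCALATION_THRESHOLD]
    return escalate, local


register = [
    Vulnerability("Unvalidated third-party feed", "data architect", 4, 4),
    Vulnerability("Manual spreadsheet aggregation", "finance analyst", 3, 2),
]
escalate, local = triage(register)
for v in escalate:
    print(f"Escalate to senior management: {v.description} (owner: {v.owner})")
```

In practice the scales and the threshold would come from the enterprise's own materiality criteria rather than from fixed constants.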
Data Quality
Controls should be established and implemented across the data flow to assess data against the accuracy, reliability,
completeness and timeliness criteria defined in the data policy and associated standards.
Where data are being sourced from a third party, the enterprise should establish a contractually bound process to gain
confidence over the quality of the data. This could be through an independent validation of data quality controls at the
third party or through having independent checks on any material data received.
Ownership and responsibilities associated with each material data set should
be assigned. Appropriate training should be rolled out to all relevant personnel
to make them aware of their data-related responsibilities. For example, two
roles that could be defined are data producer and data consumer. A data
producer provides data to the data consumer according to predefined quality
requirements. The consumer must define and communicate the expected
quality requirements for the data and validate against them when the data are
received. The roles change as data move across the data flow.
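A minimal sketch of how a data consumer might validate incoming records against predefined quality requirements is shown here. It covers only two of the criteria named above, completeness and timeliness. The field names, the 24-hour limit and the function validate_record are hypothetical and would be replaced by whatever the producer and consumer have agreed.

```python
from datetime import datetime, timedelta

# Hypothetical requirements agreed between producer and consumer.
REQUIRED_FIELDS = ("customer_id", "amount", "recorded_at")
MAX_AGE = timedelta(hours=24)


def validate_record(record, now):
    """Return a list of quality issues found in one record (empty if none)."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing {field}")
    # Timeliness: the record must not be older than the agreed maximum age.
    recorded_at = record.get("recorded_at")
    if isinstance(recorded_at, datetime) and now - recorded_at > MAX_AGE:
        issues.append("stale record")
    return issues


now = datetime(2013, 1, 2, 12, 0)
good = {"customer_id": "C1", "amount": 10.0,
        "recorded_at": datetime(2013, 1, 2, 9, 0)}
bad = {"customer_id": "", "amount": 5.0,
       "recorded_at": datetime(2012, 12, 30, 9, 0)}
print(validate_record(good, now))
print(validate_record(bad, now))
```

Accuracy and reliability checks usually need a reference source to compare against, so they are left out of this sketch.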
3 The data dictionary should also document all material data items and their relationship with each other, their source, and their usage, so that a consistent understanding can be established throughout the enterprise.
Data Confidentiality/Privacy
Through the data risk management process, all sensitive data should be identified and appropriate controls put in place.
The nature of the sensitive information could vary from personal information to competitive secrets. A number of rules
and regulations, such as the 1998 UK Data Protection Act and the Payment Card Industry Data Security Standard
(PCI DSS), govern how sensitive data should be secured in storage and transit.
Logical and physical access security controls are needed to prevent unauthorized access to sensitive data. This includes
classic Information Technology General Controls (ITGC) such as password settings, masking or partially masking
sensitive data, periodic user access review, firewalls, server room door security, server access logs, administrative access
privileges and screen saver lockout.
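Partial masking, one of the controls listed above, can be illustrated with a short sketch. The function name mask_value and the default of four visible characters are assumptions for this example; the number of characters that may remain visible depends on the applicable policy or standard.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last `visible` characters of a sensitive string."""
    # Values too short to mask partially are masked in full.
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


print(mask_value("4111111111111111"))  # test card number, not a real account
print(mask_value("123"))
```

Masking of this kind protects data when they are displayed; it does not replace encryption of the stored value.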
Encryption technologies must be used to store and transfer highly sensitive information within and outside the enterprise.
Data Availability
Reliable (i.e., tested) disaster recovery arrangements should be in place to ensure that data are available in accordance
with the data recovery point objective (RPO) and recovery time objective (RTO) criteria defined in a business
impact analysis.
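The following sketch shows how the results of a disaster recovery test might be compared with RPO and RTO criteria. The function names and the 4-hour and 8-hour objectives are hypothetical; actual values come from the business impact analysis.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, failure_time, rpo):
    """True if the data lost since the last backup fall within the RPO."""
    return failure_time - last_backup <= rpo


def meets_rto(failure_time, restored_time, rto):
    """True if service was restored within the RTO."""
    return restored_time - failure_time <= rto


# Hypothetical objectives and test timings.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)
last_backup = datetime(2013, 1, 2, 2, 0)
failure_time = datetime(2013, 1, 2, 5, 0)
restored_time = datetime(2013, 1, 2, 14, 0)

print("RPO met:", meets_rpo(last_backup, failure_time, rpo))
print("RTO met:", meets_rto(failure_time, restored_time, rto))
```

In this example three hours of data are at risk, which is within the assumed RPO, but recovery takes nine hours and so misses the assumed RTO.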
Conclusion
Constant change and innovation are challenges that the enterprise and data science team must manage. Innovation
threatens the traditional “comfort zone” of stability and longevity. Accountability is also a fine line to manage.
The enterprise culture, which either fights or embraces innovation, requires
a big data leader who understands his/her role in innovation or enterprise
direction. In addition, the leader must:
• Manage expectations
• Reward behaviors rather than results
• Shield data scientists from the detailed scrutiny of management and investors
• Manage projects
• Communicate well to span the enterprise channels
It is not unusual for various levels of leadership to disagree. Navigating conflicts within the enterprise and among the
big data team members requires soft skills that keep attention on shared goals and a desire to avoid failure, rather than on the
disagreement itself.
Additional Resources and Feedback
Visit www.isaca.org/Big-Data-WP for additional resources and use the feedback function to provide your comments
and suggestions on this document. Your feedback is a very important element in the development of ISACA guidance
for its constituents and is greatly appreciated.