10 Questions To Align Technology with Business

WHITE PAPER
10 Questions To Align Technology with Business Requirements for Your Big Data Project
By Neil Raden
Datameer
Most companies are looking at big data
analytics as a must-have capability to
remain competitive, but the industry is
moving so quickly, it is difficult to find
best practices or good guidance about
how to proceed.
10 QUESTIONS TO ANSWER BEFORE STARTING A BIG DATA ANALYTICS PROJECT
Why Alignment Matters
Big data projects have the potential to turn the existing
relationships between technology teams and knowledge
workers upside-down. Lines of business are taking the lead
with both initiating and directing these big data projects, with
IT providing support but not, for the most part, specifications
and design.
Previously in the world of Business Intelligence and Enterprise Data Warehouses, this
process was reversed: IT assisted with modeling the data based on business requirements,
building the databases and populating the data according to the model schema.
Today, data architectures are built on the fly as potentially interesting data is sourced and
models are essentially driven from the data. Not surprisingly, this role-reversal and general
lack of alignment has introduced challenges -- in the big data solution buying process,
in the implementation, and even the proving of ROI. This change in roles, and alignment
on how a new big data solution furthers specific business initiatives, needs to be
understood by all the parties before investing in a big data solution.
Following are 10 questions to ask your counterpart in business or IT to help align your
big data initiatives with the goals of the organization.
1. Are there important questions, or ongoing analyses, that we have not
been able to address either with current tools and infrastructure, or from a
lack of resources?
Until recently, most enterprise analytics initiatives followed a similar structure: data was
gathered from various, mostly internal sources, integrated into a common logical model
and housed in a series of relational databases. Traditional systems have often taken a long
time to deploy, but once in place are highly efficient at answering fairly standard business
questions like “How much of Product X are we selling by region? By month? By channel?”
What traditional BI processes are not as good at answering are questions like “How can
we predict which customers are most likely to churn?” or “How can we understand which
customer interactions are most valuable in helping to turn a prospective customer into a
paying customer?” If these more sophisticated questions are among the most important
for your business and cannot be answered by your existing systems, now is the time to
investigate big data.
Today’s computing economics and self-service analytic platforms make it possible to
discover the answers to problems that were not conceivable even a few years ago. Even
the end-to-end process is vastly simplified with better tools and methodologies.
Think about how you’ll address resource constraints if you have them. You won’t be
able to do more with new technology until you can put resources behind it. Thankfully,
modern approaches and platforms can offload some of what used to be the “IT backlog,”
but business users and analysts are busy too, so have a plan in place.
2. Do we know what others in our industry are doing? Will we begin with
fairly standard industry metrics or models, or attempt to break new
ground?
Unless your business is truly unique, it is likely that others in your industry
have already addressed some of the really important industry problems successfully.
Assistance from third parties can accelerate the first project. For example, there are
canonical applications in many industries: telcos routinely develop models for churn,
financial services firms for fraud and retail operations for customer intimacy. Though each
may have its own distinguishing characteristics, the models, methods, data and analysis
can be similar to a large extent and can inform your project with what is already known.
Excellent sources of insight are available online in professional discussion groups.
However, if you are “breaking new ground,” it will take some skill and work to estimate the
level of effort you will need.
3. What was our experience with BI? How can that experience provide
guidance toward organizing our big data analytics project?
BI implementations provide many useful benefits, but they have historically not been
adopted by large numbers of knowledge workers in organizations. It is important to
understand why this happens and to ensure that your big data analytics will be widely
used by many types of workers.
While BI projects can be highly successful, when examining the reasons behind projects
that don’t achieve their intended benefits, recurring themes emerge that you can use to
improve your likelihood of success with big data analytics. Items such as sustainable
executive sponsorship, direction from the business community, focusing on “quick wins”
or “low hanging fruit” to draw some positive attention, and a commitment to improving
everyone’s skills are attributes of highly successful BI projects.
BI use is stratified. To be effective, it requires a contribution from both IT and those
types of workers who straddle the line between IT and business analysis. This creates
a hierarchy of use, with a small number of “power users” at the top: people who
understand the data and models, have mastered the advanced capabilities
of the tools and know how to get more capability from IT when necessary. But those
requests take time. Less informed analysts (with respect to the BI technology) rely on
the power users to get their requests handled and utilize only a fraction of the software
capability. Still others consume the output of the second group and transfer it to
presentations and spreadsheets.
Big data analytics can avoid this stratification with better software and a flow-through
architecture, from data ingestion to analysis, with far less dependence on IT. Your
project should see that your staff is ready to gain the expertise they need to make the
investment valuable, but some diligence is required to ensure that old habits of 3-tier BI
do not persist.
Some organizations are so traditional in their work, reporting relationships, time-to-value,
etc., that there is little motivation to gain new skills. It’s a good idea to take the
temperature of the organization to see if they are ready to act on new insights gained from
new processes.
4. What type of “data problem” do we really have? Do we know what data
we need, where it is located and what format it is in?
Big data does not always imply great volumes. It may simply be that the number of data
sources, or their structure, do not fit the old data warehouse model. Questions you need
to ask are:
• Can you reasonably predict the volume of the data?
• Is there data complexity: will you only need fairly structured data, or are there other
diverse types that you will need to learn to work with?
• Will you use external sources such as syndicated data or data from partners that you
haven’t used previously?
Using Hadoop-native big data analytic platforms, data can flow from ingestion all the
way through to end-user analytics without creating an architecture of many intermediate
repositories and proprietary tools.
The core difference between an EDW architecture and a carefully designed Hadoop
analytics approach is the ability to “model on the fly” (sometimes called “schema on
read”), which provides tremendous flexibility and the ability to quickly iterate through
the insight discovery process. This is especially true if the data is in unstructured
and variable formats: the Hadoop-native analytic platform can work out the data’s
structure for analytic purposes without requiring the IT team to get involved.
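To make the “schema on read” idea concrete, here is a minimal Python sketch. It is an illustration only and does not represent how any Hadoop-native platform is implemented; the sample records, their field names and the `infer_schema` helper are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical raw events, stored exactly as they arrived.
# No table or schema was declared before the data was written.
RAW_LINES = [
    '{"customer_id": 1, "channel": "web", "amount": 19.99}',
    '{"customer_id": 2, "channel": "store"}',
    '{"customer_id": 3, "amount": 5, "coupon": "SPRING"}',
]


def infer_schema(lines):
    """Discover field names and observed value types while reading."""
    observed = defaultdict(set)
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            observed[field].add(type(value).__name__)
    # Sort the type names so the result is stable from run to run.
    return {field: sorted(types) for field, types in observed.items()}


schema = infer_schema(RAW_LINES)
for field in sorted(schema):
    print(field, schema[field])
```

The point of the sketch is that a new field (here, `coupon`) or an inconsistent type (here, `amount` arriving as both a whole number and a decimal) is discovered at read time rather than rejected at load time, which is what allows data sources to be added without a modeling cycle up front.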
For a first project, this type of investigation need not be comprehensive, but spending
some time thinking about it and investigating the skill and effort that might be needed
later would be time well spent. Remember, unlike traditional data warehousing and BI,
you are not punished for augmenting your data sources over time. In a capable big data
analytics architecture, it is assumed that data and analyses are dynamic.
5. What will we do with the analytic results? What do we expect to gain
from this? Will our culture be able to absorb results from advanced
analytics and machine learning?
Where will the value come from? Process efficiency? Customer retention or up-sell?
Marketing productivity? Higher customer conversion? More revenue from existing
customers? Increased sales productivity? Better customer experience across all
channels? Improving service productivity and time to resolution? What approach do you
plan to take, and do you have the skills in place to execute on it, at least on a pro forma
basis?
Looking at financial measures like ROI or IRR is always a little informal because
analytics, the process of informing people or processes, does not on its own always
show up in the balance sheet. It takes other processes that benefit from the analytics
to show true return. In fact, even measuring costs and benefits of an analytics solution is
pretty murky.
You also need to ask: how will you assist your organization to adopt new processes
born of learning from big data analytics? Developing spreadsheets and presentations
for meetings and discussion and agreement based on rows and columns of numbers is
quite a bit different from basing decisions on statistical models, probability, credibility
and causation.
New cultural norms are sometimes necessary as new insights are discovered. Cultural
attributes that can undermine success with analytics include:
• Using analytics to assign blame, “hold people accountable,” or apply pressure. In that
model, the organization will tend to resist broader adoption and deeper investigation of
analytics.
• “Explaining away” the findings of the analysis. Environmental factors and
organizational evolution can sometimes create “false positives” in analytics. That said,
if most analytical insights are regularly challenged, your organization likely has either
dirty data, or a culture that prefers managing from gut feel and guesswork rather than
being data-driven.
6. Will we be able to control confirmation bias?
With lots of diverse data, it is fairly easy to prove almost any hypothesis. So you need
to ask yourself: will we be able to implement a good process that helps us discover the
answers the data shows, not simply confirm a theory?
Rather than starting with a hypothesis and trying to gather data to support it, try to
“listen to the data” with an open mind. Beyond that, if others in the organization perceive
that analysis has been provided with an “agenda” behind it, they will tend to reject the
data and any conversation or idea predicated on the “suspect” analysis.
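The claim that diverse data can “prove” almost any hypothesis can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not an analysis of real data: it generates a random target and many random, unrelated “features” (the row count, feature count and seed are arbitrary choices), then reports the strongest correlation found purely by chance.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def strongest_chance_correlation(n_rows=20, n_features=200, seed=0):
    """Search pure-noise features for the one that best 'explains' a noise target."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best


best = strongest_chance_correlation()
print(f"strongest |correlation| among pure-noise features: {best:.2f}")
```

None of the features has any real relationship to the target, yet searching across enough of them will typically surface one that looks meaningful. Deciding on the question and the test before looking through the data, and checking any finding against data that was held back, are common ways to guard against this.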
7. Will we deliver the application using a cloud, or on-premise?
Decisions about cloud versus on-premise are important with respect to capacity,
flexibility, cost, reliability, experience and a host of other factors. Cloud solutions will
tend to be more attractive when organizations are lacking:
• Capital budget to procure hardware and license software, but can fund the project from
operational expense (OPEX)
• Technical skill sets required to deploy, integrate, and manage various components of
the system
• Time for a full procurement and deployment process, giving the organization a “short
cut” to value
If big data analytics is potentially your first serious initiative in the cloud, these issues
require careful consideration:
• Security – how secure is the cloud solution architecture and infrastructure, and do you
have the appropriate procedures in place internally to maintain ongoing security?
• Compliance – certain countries have regulations on securing specific data sets and
maintaining the data within that same country (Safe Harbor laws, etc.). Does your
cloud solution provider support this?
• Ownership and administration – if the system is in the cloud, will the business users
maintain it or will IT still have to manage it? The expected approach should be agreed
upon before embarking on a cloud deployment.
8. How will we know that it worked? If management asks in 6 months, “Was
it successful?” how will we answer?
Your planning will surely include some financial metrics (decommissioning existing
proprietary solutions, or at least avoiding costly upgrades, measurable improved
performance in any number of ways) but think about non-financial improvements as well.
One life insurance company saw 50% annual turnover in their actuarial department drop
to almost zero when the actuaries were finally able to do creative work instead of chasing
data from legacy systems and trying to reconcile it. A charitable organization saw its
fundraising costs drop from 22% to less than 10% by discovering the most effective
programs and donors.
Management loves numbers, so be sure to use some metrics. But we’d also suggest
some mention of engagement of the staff, increase in analytical skills and output and
especially cycle time for new analyses.
What did the organization learn that it didn’t know before? How have processes changed
given new insights? What hard ROI can be demonstrated? What would happen from a
business and user perspective if the system were decommissioned tomorrow? Is there a
next phase planned? Why or why not?
9. If we don’t yet have ongoing funding, do we understand where that
funding will need to come from and what will be required to justify it?
After all, even the most successful and wealthy companies regard investment as an
exercise in scarce resources, so don’t end up stranded and unable to meet your goals
for lack of funds. Of course, implementing in the cloud bypasses, to some extent, the
capital budget merry-go-round and may be a good alternative if capital funding is not a
possibility.
It can be easier to get organizational funding for a pilot or proof-of-concept. However,
it can take meaningful time and effort to deliver a pilot. It’s important to document what
the pilot is supposed to achieve, with a clear and specific use case, and “ball park” ROI.
There’s no point in spending time on the pilot if it will ultimately be shut down for lack
of ongoing funding, even when it demonstrates value. Understand what organization(s)
would be expected to fund the project post-pilot, and get clear agreement on the
expectations of the pilot, and the proof-points that will be required for ongoing funding.
10. Have we spent enough time understanding the level of effort to gather
and integrate data we’ve never seen before?
Big data, whether of great scale or diversity, or both, poses problems not seen in existing
analytical environments. ETL tools used for data warehousing are largely process
scripting and monitoring tools and weak on data that isn’t neatly structured. But with
big data, you will need skills and tools for data that is, frankly, a little messy. That may
require some machine learning (ML) as well as orchestration of the ingestion process,
whether streaming or in batch. Use a Proof of Concept to get a sense of what you are
facing in terms of time, skill and expense.
ABOUT THE AUTHOR
Neil Raden is an author, consultant and industry analyst, a featured figure internationally
and the founder and Principal Analyst at Hired Brains Research, an industry analyst and
consulting firm specializing in the application of data management and analytics. Hired
Brains focuses on the needs of organizations and capabilities of technology by providing
context to the often-bewildering choices facing organizations. Rather than producing market
studies ranking companies, Hired Brains guides organizations through a process that stresses
value and realistic planning.
He began his career as a property and casualty actuary with AIG in New York before
forming Hired Brains in 1985 to deliver predictive analytics services, software engineering,
and systems integration. Along the way, he gathered experience in delivering environments
for decision making in fields as diverse as health care, nuclear waste management and
cosmetics marketing, and many others in between.
With a mixture of research and advisory work infused with experience working with clients
on real projects to provide context to the industry, Hired Brains assists both providers and
consumers of technology. He welcomes your comments at nraden@hiredbrains.com.
FREE TRIAL
datameer.com/free-trial
TWITTER
@Datameer
LINKEDIN
linkedin.com/company/datameer
©2016 Datameer, Inc. All rights reserved. Datameer is a trademark of Datameer, Inc. Hadoop and
the Hadoop elephant logo are trademarks of the Apache Software Foundation. Other names may be
trademarks of their respective owners.
SAN FRANCISCO
1550 Bryant Street, Suite 490
San Francisco, CA 94103 USA
Tel: +1 415 817 9558
Fax: +1 415 814 1243

NEW YORK
9 East 19th Street, 5th floor
New York, NY 10003 USA
Tel: +1 646 586 5526

HALLE
Datameer GmbH
Große Ulrichstraße 7 – 9
06108 Halle (Saale), Germany
Tel: +49 345 2795030