OVERCOMING BARRIERS TO DATA VIRTUALIZATION ADOPTION
A Cross-Talk Research Paper
Lindy Ryan, Research Director

This document contains Confidential, Proprietary and Trade Secret Information (“Confidential Information”) of Radiant Advisors. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose other than the purchaser’s personal use without the written permission of Radiant Advisors. While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Radiant Advisors does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice. All brands and their products are trademarks or registered trademarks of their respective holders and should be noted as such.

This edition published September 2014.

Table of Contents
Introduction
Cross-Talk Research Methodology
Barrier #1: Choosing Data Virtualization
Barrier #2: Identifying the Lynchpin Use Case
Barrier #3: Assigning Roles and Responsibilities
Barrier #4: Selecting a Vendor
Conclusion

Introduction

The challenges of data management are getting exponentially harder. With the ever-increasing quantities, sources, and structures of data – as well as the emergence of new tools and techniques for storing, analyzing, and deriving deeper insights from this information – data-driven companies continue to evaluate and explore data management technologies that better integrate, consolidate, and unify data in a way that offers tangible business value.
Data virtualization is compelling because it addresses business demands for data unification and supports high iteration and fast response times, all while enabling self-service user access and data navigability. However, adopting data virtualization is not without its own set of barriers. Primarily, these relate to building a business case that can articulate the value of data virtualization in terms of speed of integration alongside the ability to manage ever-growing amounts of data in a timely, cost-efficient way.

Supported by Cisco Data Virtualization, Radiant Advisors had the opportunity to explore and understand the barriers experienced by companies considering data virtualization adoption, and then to pose these questions to companies that have already adopted data virtualization to glean their insights, best practices, and lessons learned. Together, the two halves of this research facilitate a practicable, independent, and unscripted “cross-talk” to fill information gaps and assist companies in overcoming barriers to data virtualization adoption.

Cross-Talk Research Methodology

The first portion of the research independently solicited the participation of many companies within the Radiant Advisory Network. The goal was to capture a wide perspective across many industries and organizations, and to identify common concerns and questions regarding barriers to data virtualization from a broad pre-adopter community. The companies included were very interested in adopting data virtualization and were either not currently using it or were evaluating the technology for potential adoption. Methods for gathering this perspective ranged from structured discussions at industry events to formal interviews with BI team leads. As a result, we identified areas of concern and isolated critical questions in need of response from the post-adoption community.
These top-of-mind concerns were then put to companies that have already adopted and implemented data virtualization within their organization. To facilitate this exchange of information, customers that are currently successful with data virtualization were solicited from Cisco’s existing customer network. Representatives from each company were invited to independently and anonymously share insights, best practices, and lessons learned in overcoming barriers to data virtualization adoption.

Following the completion of all interviews, detailed research notes from both cohorts of companies were synthesized with Radiant Advisors’ expertise and thought-leadership to distill key insights that filled the information gaps identified by pre-adopter companies. Ultimately, this research and the following report are intended to guide potential adopters to overcome perceived barriers to data virtualization adoption by facilitating an independent, unscripted “cross-talk” between potential adopters and those who have already moved forward.

Barrier #1: Choosing Data Virtualization

With so many available technologies and approaches to data management, the first question that potential adopters consider is the rationale for adopting data virtualization as opposed to other approaches (e.g., data federation, master data management (MDM), the data lake, or extract-transform-load (ETL)). Some companies have expressed that MDM provides a more manageable and intuitive approach, rather than consolidating multiple copies of data. Others wonder if data virtualization will eventually take a back seat in data management as the concept of the data lake grows in popularity and acceptance.
Finally, there is lingering ambiguity about how much data virtualization is needed for batch analytics, real-time insight, and interactive queries – and whether data virtualization can become oversaturated with too much data from the same source.

“When you move really fast, customers are really happy – every bit of slowness directly impacts customer happiness.”
- A commercial, web-based services company

Then, after data virtualization has been nominated as the preferred approach, the next “biggest hurdle” has been expressed as how to effectively communicate what data virtualization is – both as a data management construct and as a value proposition – to C-level executives who are critical for buy-in and championship at the business level. IT teams have noted that business executives tend to “get lost in the sauce” trying to navigate the differences between data integration technologies, and that such distinctions aren’t meaningful at the C-level. Frankly, this makes it difficult to gain executive buy-in and support. Companies evaluating data virtualization would like to better understand how to take data virtualization “from a concept to a framework” in a way that is communicable and meaningful to executive stakeholders.

Overcoming the Barrier

Current data virtualization users note that sometimes you can’t be traditional – you have to do something different. That said, the value proposition is simple: data virtualization avoids longer, riskier data integration development lifecycles that struggle to meet fast-paced business needs, is faster to market and deploy, and supports the changes that happen on the fly in the organization.
Companies that use data virtualization have described it as a seamless and quick way to “insulate and adapt,” by using an abstraction layer to “snap together” data, change views, present the same data from multiple places, and change the logic behind it. The reality is that (business) users want unfettered access to unaltered data, and they want it fast. Using data virtualization avoids depleting already-scarce IT time to build a model that ends up not meeting expectations – inevitably frustrating the business – and instead offers a quick scanning tool to build in a way that responds immediately to business changes and needs.

Data virtualization has also been seen as a way to expand IT’s potential – described as a “capacity increasing initiative,” if you will. As a layer of abstraction, data virtualization lives between data sources and their consumers to establish unified semantic context and data access without actually impacting the data source structure. It also provides a mechanism to bring IT closer to the business, moving away from a very tech-heavy and IT-driven function to one that can be translated into business language and moved closer to the business user for richer enterprise-wide collaboration.

Finally, data virtualization has been implemented as a vehicle for polyglot persistence (persisting data in the data store best suited for its purposes). With the influx of new databases, the ability to store data in more performant ways than traditional databases allow becomes increasingly imperative. A move to an abstraction layer has proven both empowering and agile, helping teams understand what data exists where and what the access patterns around that data are, and enabling the ingestion of more on-premises and cloud data.
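The abstraction-layer idea described above can be illustrated with a minimal sketch. It is not drawn from any vendor’s product: the `VirtualLayer` class, the `sqlite_source` helper, and the customer and order data are all hypothetical names and values chosen for illustration. The sketch assumes two independent sources that are joined at query time, with no persisted copy, and then swaps one back end without changing the consumer-facing view.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Source = Callable[[], Iterable[Row]]


def sqlite_source(conn: sqlite3.Connection, query: str) -> Source:
    """Wrap a SQL query as a source that is read only when it is asked for."""
    def fetch() -> Iterable[Row]:
        cursor = conn.execute(query)
        columns = [d[0] for d in cursor.description]
        return [dict(zip(columns, r)) for r in cursor.fetchall()]
    return fetch


class VirtualLayer:
    """Hypothetical abstraction layer: named sources resolved at query time."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        # Re-registering a name swaps the back end; consumers are unaffected.
        self._sources[name] = source

    def customer_orders(self) -> List[Row]:
        """Virtual view: join customers to orders without persisting a copy."""
        customers = {c["customer_id"]: c for c in self._sources["customers"]()}
        return [
            {"customer": customers[o["customer_id"]]["name"], "total": o["total"]}
            for o in self._sources["orders"]()
            if o["customer_id"] in customers
        ]


# Source 1 (illustrative): a relational customer store.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
crm.execute("INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex')")

layer = VirtualLayer()
layer.register(
    "customers",
    sqlite_source(crm, "SELECT customer_id, name FROM customers ORDER BY customer_id"),
)

# Source 2 (illustrative): orders served from an in-process feed.
layer.register(
    "orders",
    lambda: [{"customer_id": 1, "total": 250.0}, {"customer_id": 2, "total": 75.5}],
)
before = layer.customer_orders()

# Swap the orders back end for a relational store; the view is unchanged.
erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
erp.execute("INSERT INTO orders VALUES (1, 250.0), (2, 75.5)")
layer.register(
    "orders",
    sqlite_source(erp, "SELECT customer_id, total FROM orders ORDER BY customer_id"),
)
after = layer.customer_orders()
```

In this sketch the consumer only ever calls `customer_orders()`, so replacing the orders feed with a database is invisible to it. A production data virtualization platform would add query optimization, caching, and security on top of this basic pattern.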
Ultimately, from an executive management perspective, the value of data virtualization aligns with the ability to provide enterprise data back out for analytics and visualization, and moves away from the use of “black box” enterprise systems that can make it difficult to acquire data in real time.

One interesting note is that the reasons for choosing data virtualization as a core technology and the reasons for keeping it are not always one and the same. For example, one Colorado-based non-profit explained that they chose data virtualization to provide a layer of abstraction between data stores and users of those stores, with the idea that IT would easily be able to switch out the back end with no changes required to the front end. However, while the use case remains valid, at this particular organization it turned out to be an internal fallacy, as the company’s user groups all had power users who demanded access to unaltered source system data and resisted the approach. In the end, this company chose to keep data virtualization while other investments were cancelled (including MDM, ETL, and rapid DW development) because of its success in using data virtualization to enable both logical data warehousing and real-time data provisioning for services.

Barrier #2: Identifying the Lynchpin Use Case

Potential data virtualization adopters are eager to understand (and learn from) the experiences of post-adopters regarding how to “get started” using a data virtualization layer – including what elements informed that decision, the resulting lessons learned, and how that initial experience has been applied to an overall data virtualization growth strategy.
“You can’t rebuild data every day, so you need the ability to get what you want on the fly. In a [traditional] data warehouse you build all or you build nothing because the data is persisted there – but with data virtualization you can build and rebuild on the fly with data as needed.”
- A cable telecommunications company

With so many possible “lynchpin” use cases, many potential data virtualization adopters have expressed concern that data virtualization can seem “too conceptual” to home in on a single opportunity that simultaneously takes advantage of agility, is speedy to start, and illustrates a real business value justification. Specific questions from potential adopters asked for further exploration into whether any use cases have proven impossible – as opposed to merely more difficult – without data virtualization, whether any specific implementation challenges (or “gotchas”) were uncovered during a data virtualization implementation, and how these were addressed.

Moving forward, once the initial hurdle of launching data virtualization has been overcome, potential adopters’ thoughts turn to how data virtualization is expanded within the enterprise as part of a larger growth strategy, and they seek insights to guide phased adoption and implementation. This line of thought also includes which implementation strategies to avoid with data virtualization, to ensure that it is deployed efficiently, effectively, and properly throughout the organization.

Overcoming the Barrier

Post-adoption companies validate the importance of the proper lynchpin use case to prove the value of data virtualization to the organization and earn executive buy-in.
The following represent a sampling of data virtualization lynchpin use cases that successfully introduced data virtualization to the organization:

• One media company had invested nearly $500k in the development of a physical customer data mart that, in the end, could not be built within the established timeline and allocated budget. Because the business had already invested significant resources in the project, IT was unable to ask for additional funding and, instead, had to deliver on the requirements document without additional cost or time. With data virtualization, they were able to build virtually and deploy within three months with only supplemental professional services assistance. For this company, the lynchpin case began as an opportunity for creative problem solving.

• A commercial, web-based services company approached data virtualization adoption as a way to overcome a broken data environment. Previously at this company, some databases weren’t well designed and were “bulled together” with brute force, without consideration for architecture and business rules. Because of this, additional BI tools couldn’t be added on top of the existing infrastructure, and technologies already in place couldn’t be used as intended. The initial hurdle was trying to run poorly performing queries with handwritten SQL code; initiatives in ETL helped to properly structure data for analytics but still left other operational and real-time data unable to be seamlessly interlinked. Then, when the company’s contact management system began a redesign of ownership hierarchies that rendered hundreds of reports obsolete, IT was left with no time to rebuild a fragmented data warehouse. Instead, they turned to data virtualization to insulate and adapt the business as the data environment changed underneath it too fast to approach in traditional ways.
• A multinational agrochemical and biotechnology corporation tucked data virtualization adoption discreetly within a $1 million business transformation project aimed at enabling a “new start” in its business information platform. As the project rolled out, data virtualization was simultaneously deployed across the organization. The reasoning was straightforward: within any organization there is a relatively small set of data (say, 15% by this company’s estimate) that is used over and over again. This company wanted to take this highly reusable data and make it self-serving, thereby partnering business and IT to collaborate directly on the business’s desire to see their own data in their own language, vet definitions, and then translate that into myriad data virtualization-enabled reporting tools as a single source of extraction and value.

• Lastly, a children’s non-profit organization used data virtualization to spur a project that had been languishing for over a year because modeling it in a traditional EDW process (staging, ETL, business model data warehousing, data view (“or worse yet, a cube”), report) had proven far too cumbersome and interdependent. After hearing other industry success stories on data virtualization, this company was emboldened to dust off the old project and set about retiring its traditional data warehouse. This was replaced with a logical data warehouse that could meet organizational real-time data needs; however, even this approach was still unfamiliar territory fraught with struggle. Management stepped in and helped foster organizational agreement on key terms and definitions.
Ultimately, the inclusion of a data virtualization layer allowed this company to provide both operational and analytical reporting out of the same set of views, maximizing the company’s operations and business value to its charitable endeavors.

Barrier #3: Assigning Roles and Responsibilities

As potential adopters begin to roadmap data virtualization as an integral part of data and organizational architectures, many see the identification and designation of resources and roles within the BI team as an obstacle due to their inherently competitive natures (i.e., who has the “rights” to work within the data virtualization tool, and is this person a data modeler or an ETL developer? Moreover, how do they interact with the DBA?). There is the perception that creating a data virtualization-specific team drives agility by consolidating work in a tight area with limited resources; however, there is a lack of clarity as to whether these roles should be isolated to IT, isolated within the business, set up as an IT brokerage team, or made part of a cross-functional team.

“Even with those basic (and significant) flaws, we saw tremendous success and acceptance [from using data virtualization] and soon started to suffer from the semi-mythological ‘perils of success.’”
- A non-profit organization

Separately from ownership, under the heading of roles and responsibilities, there is additional concern regarding the handling of security and governance within a data virtualization environment. With the need to drive access and visibility for discovery, many companies wonder how to enable a governed data discovery environment that leverages data virtualization while still controlling individual access and database security.

Overcoming the Barrier

Adopting companies consistently noted that they retain small, interdisciplinary, and agile teams to fully deploy and manage data virtualization.
Many companies noted a centralized team of five to seven people that consists of both business and IT roles (two to three IT FTEs, with the others part of the organizational matrix). One investment-banking firm noted that sharing responsibility among multiple departments has enabled data virtualization to grow in a well-governed, consistent fashion across the organization. Another non-profit noted that these roles should, to some extent, overlap, with each member having some capacity to assist in other areas while maintaining specializations that align to the core functions of the team.

Specific core roles and responsibilities

• Application Administrator: main responsibility is to ensure the server remains available and accessible with no downtime to correctly meet incoming demands, honor business service-level agreements, and anticipate future demands
• Governance Architect: takes ownership of deciding what should or should not be built as virtual objects – and then, how to build it – within the abstracted environment to sustain a sharable, consistent model that provides proper business semantics across the organization
• Business Information Owner: works in tandem with the governance architect to avoid a purely technical approach and retain the business value of data virtualization
• Developer: acts as the single point of authority to vet and implement proposed changes within the data virtualization environment, and builds, optimizes, and tunes the virtual objects

Most data virtualization teams also have a non-technical data virtualization Champion who actively communicates with executive management to champion data virtualization’s value and inclusion as a core technology within other areas of the business.
As a measuring point for data virtualization adoption within the organization, some companies have also launched an Integration Competency Center to leverage the small data virtualization team for development and training across the larger organization. Communication, community, and awareness are essential, especially in governing the abstraction layer itself, which remains a moving target as both the role of data virtualization and the business process of discovery continue to mature in the organization.

Barrier #4: Selecting a Vendor

Finally, potential data virtualization adopters seek to cut through vendor marketing and understand the key differentiators between data virtualization vendors in the space today. Additionally, potential adopters hope to navigate beyond the vendor marketplace and benefit from the lessons learned and best practices of adopters that have already evaluated competing vendors and gained hands-on experience with their chosen tools. Further, they would like insight on how the evaluation and selection of a data virtualization vendor relates to the total tool and vendor landscape and technology architecture, how the chosen product has performed since its adoption, and how this affects the ongoing roadmap of the organization.

“We went with the [data virtualization] approach because the environment was changing out from underneath way too fast to approach in traditional ways.”
- A commercial, web-based services company

Beyond product capabilities, a distinct area of interest as a vendor differentiator has been how vendors support ongoing training and learning events for users, including the performance and responsiveness of vendor technical support. Additionally, because potential adopters recognize that there will be a learning curve inherent in adopting a new technology, they are interested in the level of training and support services available, and in what has proven most useful for companies – and their technical and non-technical users – that are new to data virtualization.

Overcoming the Barrier

Key differentiators when evaluating data virtualization vendors, according to those companies who have already undertaken this journey and made a buying decision, can be distilled into three areas: vendor/product maturity, industry impact, and roadmap; product technical capabilities; and level of vendor support and services.

Vendor/Product Maturity and Impact Across Industries

Time and time again, adopting customers remarked on the importance of seeing a vendor client base that resonates with their respective industries as a measuring stick by which to weigh the vendor’s strength and permeation amongst its competitors. Another indicator of vendor/product maturity is its time in the market and longevity with proven customer use cases. Further, adopting customers looked for synergies in use cases that showcase product maturity and illustrate how vendors have handled challenges trending in the market.

Ongoing transparency in where the vendor has been and where it is going proves just as important to adopting customers, who noted that the vendor roadmap is critically important to the ongoing success of data virtualization. Adopting customers also desired a vendor that allows users a voice in the product development roadmap, and one that actively looks to incorporate emerging technologies (such as the cloud) to expand and refine existing capabilities.
Product Technical Capabilities

Of course, how the technology works and what it is capable of is a huge differentiator during data virtualization vendor comparisons. In addition to the distilled list of key differentiators below, price and total cost of ownership are always important factors in the buying decision.

• Ease of use (user interface; allows development using SQL)
• Availability of source system connectors (databases, Web APIs, RESTful APIs)
• Ease of ingesting new types of data sources
• Ease of accessing data output (ODBC, JDBC)
• Support for complex caching needs
• Availability of clustering for redundancy
• Ease of deploying changes to production environment
• Visibility into execution plans
• Scalability (a big concern in large enterprises – DV strategic architecture)
• Extensibility (ease of calling into and out of the data virtualization layer, or DVL)
• Impact analysis, or ease of identifying lineage both backward and forward for any entity or attribute at any level (i.e., sources, source abstraction layer, middle/conforming layers, demand layer, and publications)
• Performance monitoring
• Object-level security
• Optimization engine (ability to create high-performance execution plans)
• Pass-down optimizations
• Audit logs for reviewing user access for governance and compliance

Level of Vendor Support and Services

Lastly, adopting customers chose a vendor that not only has proven successful and has the right capabilities and roadmap to help customers continue to be successful, but also takes the initiative to help its customers actively succeed. This is measured in terms of both responsive technical support in helping to investigate and resolve issues, and the level of vendor training provided to customers, which minimizes the learning curve and helps to leverage existing internal knowledge.
Finally, customers noted that professional services offered by the vendor would be helpful as they work to develop the right data virtualization environment, particularly at the onset of implementation. This is primarily because customers noted that they still struggle with what goes into a classic layer and what goes into a data virtualization layer, and how these complement each other. One customer called this balance between access and ownership a “political turf war” and said that governed data virtualization requires a large cultural change, as it is difficult to get true ownership, especially pertaining to semantic definitions.

Conclusion

Data virtualization is a mature technology that has evolved with the data management industry. However, while nearly every major research and consulting firm delivers architectural blueprints that include a logical business semantic layer, only some companies have implemented that component – and even fewer have done so enterprise-wide – leaving data virtualization the often-missed piece of an overall unification strategy that includes incumbent, traditional ETL and data replication. As a technology, data virtualization in itself isn’t a silver bullet, but when implemented with governance and a rich methodology, it solves the bigger problem of managing data volatility and addresses many pain points of business and IT.

The reality today is that, without a robust consolidation strategy, business and IT simply cannot keep up with the volatile landscape of surging data. More and more, integration is becoming a lackluster blanket integration strategy: even if you manage to do it, it invariably won’t last.
Instead, a data unification strategy that adopts data abstraction through data virtualization as a key integration strategy solves data volatility, offers immediate benefits to alleviate the pain points of the organization, and maximizes the value of integrations whether physical or abstracted.

To achieve the benefits of data virtualization, companies need to take the leap. So, while barriers exist, they can be overcome – many have done so already and are achieving these benefits today. This research is intended as a guidebook: a living conversation from both sides of the adoption table that assists potential adopters in overcoming barriers by leveraging the insights and lessons learned of those who have already moved forward.

After overcoming barriers to data virtualization adoption, the next step is to plan this technology as part of a long-term data integration strategy. Therefore, the same discipline and diligence that is put into physical data integration should be applied to the abstracted environment. Seek to quantify the reduction in both development time and time to deployment from leveraging data virtualization and, once these are quantified, double the benefit quotient by moving beyond the immediate, localized benefit and looking farther to drive larger, enterprise-wide strategic benefits.

About the Author

Lindy Ryan, Research Director, Data Discovery and Visualization, Radiant Advisors

As Research Director for Radiant Advisors’ Data Discovery and Visualization practice, Lindy leads research and analyst activities in the confluence of data discovery, visualization, and data science from a business needs perspective.

Sponsored by:

Cisco (NASDAQ: CSCO) is the worldwide leader in IT that helps companies seize the opportunities of tomorrow by proving that amazing things can happen when you connect the previously unconnected.
Cisco Information Server is agile data virtualization software that makes it easy for companies to access business data across the network as if it were in a single place. To learn more about Cisco Data Virtualization, visit www.cisco.com/go/datavirtualization

About Radiant Advisors

Radiant Advisors is a leading strategic research and advisory firm that delivers innovative, cutting-edge research and thought-leadership to transform today’s organizations into tomorrow’s data-driven industry leaders.

Boulder, CO USA
Email: info@radiantadvisors.com
To learn more, visit www.radiantadvisors.com

© 2014 Radiant Advisors. All Rights Reserved.