The New Normal in Business Intelligence

• The new normal in business intelligence is about the transformational changes taking place in the digital world and the resulting change in the nature of business intelligence.

The New Normal in BI
• The Internet is the societal operating system of the 21st century and its underlying infrastructure – the cloud computing model – represents a disruptive change.
• A networked infrastructure, big data from disparate sources and social media, the self-service model, and collaboration are changing the way BI systems are deployed and used.

In today's marketplace change is constant
• Products are increasingly commoditised, development cycles have shortened and consumer expectations are rising. To achieve a sustainable competitive position, companies must react in an agile way to changing market conditions.
• The current business environment revolves around a transition towards globalization and a restructuring of the economic order. The pace of technological change that allows instant connectivity, and the era of ubiquitous computing that has resulted from it, represent the new normal in business intelligence.

As an industry, business intelligence has to adapt to environmental changes
• The evolution of the Internet as a new societal operating system reshapes the future of business intelligence.
• The Internet evolves as a platform for the use of interoperable resources (storage, computing, applications and services) and drives the development of information-intensive services in the 21st century. Increasingly, the cloud becomes the vehicle for the Internet of Services.
• The business ecosystem generates a huge amount of data in terms of volume, variety and velocity, and requires businesses to take on a data-driven approach to differentiate. It's about gaining actionable insights faster than the competition by reducing the data-to-decision gap.
• This highlights the integration of structured and unstructured data, especially social media content, to derive actionable insights from big data, and the leverage of predictive analytics for agile decision-making.

Gradually BI will be more than an IT function; it will be about people and their business decisions
• The exponential growth of data and the increased reliance on insights derived from data for decision-making cause a shift in the focus of business intelligence.
• Therefore, the emphasis of next-generation BI will be on designing solutions that focus on answering the business questions of the end user. In the field of BI the finished product is not a dashboard displaying metrics but actionable intelligence answering the business question at hand. Users will want seamless access to information to support decision-making in their day-to-day activities.
• The future direction of BI will thereby be shaped by the new age of computing. In both their personal and professional lives, Web-savvy users have adopted the principles of interactive computing and have come to demand customizable BI tools with high responsiveness.
• Business intelligence, and the insights it delivers, evolves towards an enterprise service that follows the lines of a self-service model, with business users producing their own reports in an interactive way and performing analytics on demand.
• Furthermore, Web 2.0 and social networks function as catalysts for highly intuitive user interfaces, and the collaborative features of computing allow users to share insights, which transforms BI from a solitary into a collaborative activity.
• Companies are exploring the connection between analytical activity and knowledge sharing.
Combined with collaborative technologies that crowdsource intelligence from various partners of the extended enterprise, this approach provides the context for better and faster decision-making.

The factors that constitute the new normal in BI can be summarised as follows :
– the Future Internet ;
– Big Data ;
– Cloud Computing ;
– Embedded BI ;
– User Empowerment / Self-Service BI ;
– Collaborative BI ;
– Social Media Analytics ;
– Prescriptive Analytics.

1 The Future Internet
• The main objective of enterprise computing is to be adaptive to change.
• The new generation of enterprise computing must enable pervasive BI deployments through an Internet-enabled IT infrastructure :
– Web-based technologies enable the implementation of user-configurable BI applications connecting to a wide arrangement of data sources ; BI applications are delivered as a service on the Web or hosted in the cloud ;
– spreading BI to more users and more devices : consumerization of IT means that enterprise computing aligns with consumer-class technologies ; BI tools are more and more organized around the user's experience to interactively discover hidden relationships, trends and patterns, and to create new information and relate it with external data sources ;
– using multiple data sources : the use of structured as well as semi- and unstructured data sources (e.g. social media content) extends the playing field of BI.
• The new generation of enterprise computing needs to be developed within the perspective of the future Internet :
– the Internet as data source : BI applications no longer limit their analysis to data inside the company and increasingly source their data from the Internet to provide richer insights into the dynamics of today's business ;
– the Internet as software platform : BI applications are moving from company-internal systems to service-based platforms on the Internet.
Drivers of networked infrastructure :
– workforce demographic shifts ;
– globalization ;
– cloud computing ;
– bandwidth & connectivity.

The Future Internet – Business Networks
• The Internet of the future gives rise to a new business model that allows enterprises to form business networks :
– in the knowledge economy, economic activity is based on highly networked interactions ;
– the amount of digital collaboration is increasing among people, things and their interactions (through the Internet of People and the Internet of Things, networking is expanding not only in person-to-person interactions, but also in person-to-machine and machine-to-machine interactions).
• Business networks take on a data-driven approach to differentiate and apply fact-based decision-making enabled by advanced analytics :
– economic interactions are based on the principle of scarcity, and in the knowledge economy the concept of scarcity applies to information ;
– information in itself does not create competitive advantage (access to lots of information has already become ubiquitous) ; competitive advantage is defined as access to information, the decisions based on that information and the actions taken on these decisions ;
– business networks manage data in real time, support anywhere, anytime and any-device connectivity and provide the appropriate information to users across and beyond the enterprise (business users, partners, suppliers, customers).

The Future Internet – Business Networks
• The Internet serves as a platform for a service-oriented approach that changes the way of enterprise computing. With BI applications moving to the Web, the Internet emerges as a global SOA that is referred to as an Internet of Services. The IoS serves as the basis for business networks.
• The new BI requires technologies that integrate multiple data sources, address business needs in a dynamic way and have a short time to deployment.
• Contrary to the large-scale application development of traditional BI, the new BI moves towards smaller, flexible applications that can adapt quickly and are supported by a service-oriented architecture.

Internet of Services and BI
• People – user empowerment / self-service : users expect to have access to business information in the same way as they use the Internet and search the Web. Self-service BI is the implementation of this service orientation at the end-user level.
• Process – embedded BI : BI moves into the context of business processes and transforms from a reactive into a proactive decision-making tool through the monitoring of performance and the prediction of future events. This change in the use and delivery of software is guided by the adoption of a service-oriented approach.
• Technology – cloud computing : cloud computing emerges as a new deployment model of BI through the adoption of a service-oriented architecture, and drives a transformation in application architectures by using "the Web as a platform" for interoperable applications and services.

2 Big Data
Big data is data that is too large and too complex for conventional data tools to capture, store and analyze. The V's of big data are volume, velocity, variety, veracity and value. Some illustrative figures : data generated in one flight from NY to London : 10 terabytes ; tweets per day on Twitter : 500 million ; 'Likes' each minute on Facebook : 4 million ; shares traded on US stock markets each day : 7 billion. 90 % of the world's data was generated in the last two years.

The evolution of the Internet and the proliferation of data
As the Internet evolved from the Web to the cloud, the major source of big data shifted over time : the static Web of the desktop/PC era was dominated by producer-generated content, and the social Web with the Internet of People by user-generated content.
In the semantic Web era, with the Internet of People and Things, system-generated content becomes the major source of big data.

Heterogeneous datasets are no longer manageable by a traditional relational database approach
• As connectivity reaches more and more devices, the volume, variety and velocity of data from clickstreams, social networks and the Internet of Things (through which the physical world itself becomes an information system) create a new economy of data.
• Traditionally, BI applications allow users to acquire knowledge from company-internal data through various technologies (data warehousing, OLAP, data mining). However, the typical pattern of cleaning and normalizing proprietary information through an ETL process into a data warehouse is challenged by the transition to big data, which is marked by greater accessibility, interoperability and third-party leverage of online data.
• For businesses to become responsive to market conditions, it is necessary to look at the whole ecosystem by connecting internal business data with external information systems. BI applications must access data from disparate sources inside and outside the firewall, consider qualitative and quantitative data, and include structured as well as semi-structured and unstructured data.

Traditional RDBMS and SQL-based access languages are unfit for the new world of unstructured information types
• Data from the Web is feeding BI applications :
– BI applications no longer limit their analysis to data inside the company, but also source data from the outside, especially data from the Web ;
– the Web is a data repository ; an important challenge is the extraction, integration and analysis of heterogeneous data sources.
• BI applications move to the Web :
– BI applications are increasingly accessible over the Web : BI is consumed as a service from the cloud.
– the challenge here is the development of Web-based applications that access and analyze both historical enterprise data and real-time data, especially from the worldwide market, and that make the information available on a variety of devices.
• Requirements for next-generation BI tools include :
– connect directly to the underlying data sources to capture distributed data ;
– schema-free : relationships between data are discovered dynamically ;
– anytime, anywhere access with multiple devices ;
– real-time visibility of what is happening now, with analytics used in the stream of business operations.

BI has evolved from historical reporting to the pervasive analysis of real-time data from multiple data sources
• Transactional data is analyzed in combination with new data types from social, machine-to-machine and mobile sources, e.g. sentiment, RFID and geolocation data.
• Organizations that embrace a « socialization of data » approach, by incorporating and converging disparate data sources into their BI platforms, acquire a holistic view that provides them with the opportunity to derive actionable insights, e.g. :
– analytics of real-time customer sentiment and behaviour yield indicators of product or service issues ;
– geospatial information about customers can be combined with transactional data to make targeted product or service offerings ;
– combining internally generated data with publicly available information can reveal previously unknown correlations.
• In its focus on the user experience, BI embraces Web 2.0 technology with its emphasis on intuitive user interfaces. Organizations must master visualization tools that let business users interactively manipulate data to find tailored insights that can be shared with other stakeholders (customers, partners, suppliers).
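The « socialization of data » idea above can be sketched in a few lines. The sketch below uses hypothetical data and a hypothetical `at_risk` helper: internal transactional records are merged with external social media sentiment to flag products whose sales look healthy but whose sentiment is turning negative.

```python
# A minimal sketch (hypothetical data) of combining internal transactional
# data with external social media sentiment to derive an actionable insight.
from statistics import mean

# Internal, structured transactional data (units sold per product).
sales = {"P100": [120, 115, 130], "P200": [80, 85, 90]}

# External, semi-structured social media mentions with sentiment in [-1, 1].
mentions = [
    {"product": "P100", "sentiment": -0.6},
    {"product": "P100", "sentiment": -0.4},
    {"product": "P200", "sentiment": 0.7},
]

def at_risk(sales, mentions, threshold=-0.2):
    """Return products with sales data whose average sentiment is negative."""
    flagged = []
    for product in sales:
        scores = [m["sentiment"] for m in mentions if m["product"] == product]
        if scores and mean(scores) < threshold:
            flagged.append(product)
    return flagged

print(at_risk(sales, mentions))  # → ['P100']
```

Neither data source alone would surface the issue: the sales history for P100 is stable, and the sentiment stream by itself lacks business context.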
Mashups, that is, combining data from different sources into an integrated application, will be the order of the day
• Web services are an important tool for data integration from multiple sources and provide access to real-time information that can be fed into operational applications.
• Open access makes BI functionality accessible across and beyond the enterprise. Web services are user-centric because information is provided in the context of day-to-day activities.

3 Cloud computing
(Figure : the evolution of computing in terms of numbers of applications and users – from centralized automation (1970s) and desktop computing / office automation (1980s), through networking, data warehousing and Internet-based data access & exchange (1990s), and eCommerce, the Web and service-oriented architecture (2000s), to the virtualized connected environment and the « as a service » paradigm (2010 & beyond).)
"Cloud computing is enabling the consumption of IT as a service. Couple this with the 'big data' phenomenon, and organizations increasingly will be motivated to consume IT as an external service versus internal infrastructure investments." (The Digital Universe Study : Extracting Value from Chaos, IDC)

Cloud computing is the backbone for the Internet of Services and provides resources for on-demand, networked access to services
Service layers : infrastructure as a service, platform as a service, software as a service, data as a service, analytics as a service.
• Cloud computing alters the way computing, storage and networking resources are allocated. Through virtualization, the traditional server-centric architecture model, in which applications are tied to the underlying hardware, is altered to a service-centered cloud architecture.
• Applications are decoupled from the physical resources, which implies that computing resources (e.g. processing power, memory, storage, network bandwidth) in a cloud computing environment are dynamically allocated to on-demand requests.
• In addition to better utilization of IT resources, hardware cost reduction and greener computing, cloud computing provides an agile infrastructure to respond to business needs in a flexible way.

Cloud computing leads to the commoditization of analytics
The trend towards the hosting of services leads to the commoditization of analytics. As a result, the creation of a competitive advantage depends on two factors :
– the management of large data volumes (data integration, data quality) : as data fuels analytic processes, big data becomes increasingly important ;
– analytics in itself does not guarantee a competitive advantage : the insights, communications and decisions that follow analysis become more important, which stresses the role of self-service and collaboration.

Cloud computing provides an agile infrastructure to respond to business needs in a flexible way
• Cloud computing and big data : in the pre-cloud world, the implementation of data warehouses required serious upfront costs, and designing database schemas was time-consuming. Moreover, database schemas have their limitations because some data types (e.g. unstructured) don't fit the schema. Combined with the need to manage big data volumes, new database technologies (e.g. NoSQL) are used. For example, a Hadoop cluster processes a large data set in parallel as smaller subsets spread across multiple servers, so making use of cloud computing services in a pay-for-use formula is appealing. Furthermore, a service-oriented cloud architecture is ideally suited to integrate data from various sources (e.g. to « mash up » enterprise data with public data).
• Cloud computing and self-service BI : cloud computing gives a new meaning to the consumerization of IT. The convergence of cloud computing and connectivity is changing the way technology is delivered and information is consumed. Cloud applications are available on demand and developed to meet the immediate needs of users. Cloud computing is an important catalyst for self-service BI.
Users do not need to be concerned with the technical details of software and hardware when using services. User-friendly interfaces and visualization capabilities make it easier to generate, share and act on information in real time. This permits faster and better decision-making as well as greater collaboration internally and outside the firewall.

4 Embedded BI
• The need for agile BI : as the market changes faster and faster, BI has to adapt to support decisions in day-to-day operations. The role of BI has changed beyond its original purpose of supporting ad hoc queries and analysis of historical information. With changing market dynamics there is a growing need to monitor performance using the latest data available and to predict future events.
• Process orientation : the new BI delivers information to users within the context of operational activities. Rather than reporting on the business, BI moves into the context of business processes. Data is analyzed in the flow of transactions to produce real-time metrics, alerts, recommendations and predictions for action. BI transforms from a reactive into a proactive decision-making tool.
• Embedded BI : operational BI is related to the subject of real-time processing. Through the Internet of People (e.g. social media) and the Internet of Things (e.g. RFID and other sensor data), information becomes available that helps enterprises to improve business processes.

BI will take the new direction of adding features normally associated with BI software to existing applications, moving from monolithic applications to a service-oriented architecture.

Next-generation business applications will have the computing power to proactively generate information that supports operational decisions
• Next-generation applications are not static but interactive, allowing users to couple the right actions to the insights that are delivered.
• Self-directed analytics give users the ability to navigate through and visualize business data, allowing them to generate views and reports relevant to their job function. For example :
– analytics on browser-based BI applications allow the mobile workforce to take action ;
– in an inventory application, proactive decision-making is supported through real-time information about which items are running low in inventory.
• Technology : new approaches such as in-memory processing, in-database analytics, complex event processing (CEP), etc. contribute to the broader adoption of BI.

The consumerization of IT and the need to base business decisions on relevant information are drivers for placing reporting and analytics in the hands of more decision-makers and for applying analytics in real time to production data
• This involves three changes :
1 changes in the function of applications : from dedicated applications to composite applications ;
2 changes in the way data is accessed : from data as an isolated resource to data as a service ;
3 changes in the nature of BI : from stand-alone applications to embedded applications.
• A broader user adoption of BI results from :
– faster and easier executive access to information ;
– self-service access to data sources ;
– right-time data for users' roles in operations ;
– more frequently updated information for all users.
• The business benefits are :
– improved customer sales, service and support ;
– more efficiency and coordination in operations and business processes ;
– faster deployment of analytical applications and services ;
– customer self-service benefits.
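The inventory example above illustrates embedded BI well enough to sketch: analytics run in the flow of transactions and raise an alert the moment stock falls below a reorder point, instead of a user pulling a report afterwards. The data, thresholds and `record_sale` helper below are hypothetical.

```python
# A minimal sketch (hypothetical data) of embedded BI in an inventory
# application: each transaction is analyzed as it happens, and an alert is
# produced proactively when an item runs low.
REORDER_POINTS = {"widget": 20, "gadget": 10}
stock = {"widget": 25, "gadget": 12}

def record_sale(item, qty):
    """Process a transaction; return a proactive alert if stock runs low."""
    stock[item] -= qty
    if stock[item] < REORDER_POINTS[item]:
        return f"ALERT: reorder {item} (stock={stock[item]})"
    return None

print(record_sale("widget", 3))   # stock 22, above the reorder point → None
print(record_sale("widget", 4))   # stock 18, below 20 → alert
```

The decision logic lives inside the operational transaction path, which is the defining trait of embedded (operational) BI as described above.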
5 User empowerment / Self-service

The New BI versus Traditional BI
• Traditional BI :
– architecture : client-server, closed, proprietary ;
– data : structured data (data gathering depends on data warehousing methodology) ; data analytics and presentation are separated ; data-centric analytics ;
– IT role : create data models, control data and applications ;
– BI delivery : focused on standard reports ; predefined reports to answer predefined questions ;
– deployment type : on premise, desktop and server.
• The New BI :
– IT role : deliver relevant data, ensure security and scalability, enable self-service ;
– BI delivery : focused on interactive analysis by end-users ; used to derive new insights (« business discovery ») ;
– deployment type : on premise and on demand (cloud, SaaS).

A confluence of factors* is driving a trend called the consumerization of IT : technological innovations are user-driven and increasingly outside central IT control
• Enterprise application development is driven by the need for interactive access to disparate data and self-service capabilities that offer flexibility for personalization and end-user customization. BI shifts towards a self-service delivery model that accommodates knowledge workers who search, access and analyze data from a variety of sources, available on a range of devices.
• Empowerment of users is an important trend in BI. Business users generate their own reports and analyses and are no longer dependent on IT to deliver them. The ownership of BI shifts from IT to the business.
• By incorporating collaborative features, BI environments are getting social. These enhancements facilitate the creation of user-generated content that can be shared with stakeholders across and beyond corporate boundaries, enabling the networked enterprise and optimized decision-making.
• Whereas the traditional IT infrastructure emphasizes reusability, data governance and security, the consumerization of IT emphasizes self-directed analytics, business discovery and long-tail solutions.
* including ubiquitous broadband, a growing technology-native workforce, the adoption of social networking tools, mobile apps, etc.
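The self-service contrast above can be made concrete with a small sketch: instead of a predefined report answering a predefined question, the end user picks any dimension and any measure at query time. The records and the `ad_hoc_report` helper are hypothetical.

```python
# A minimal sketch (hypothetical data) of self-service BI: a business user
# builds an ad hoc aggregation directly from raw records, with no predefined
# report and no IT-built data model in between.
from collections import defaultdict

transactions = [
    {"region": "EMEA", "product": "P100", "revenue": 1200.0},
    {"region": "EMEA", "product": "P200", "revenue": 800.0},
    {"region": "APAC", "product": "P100", "revenue": 500.0},
]

def ad_hoc_report(rows, group_by, measure):
    """Aggregate any measure by any dimension chosen by the end user."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[measure]
    return dict(totals)

print(ad_hoc_report(transactions, "region", "revenue"))
# → {'EMEA': 2000.0, 'APAC': 500.0}
```

The same call with `group_by="product"` answers a different business question without any change to the underlying system, which is the essence of « business discovery ».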
Drivers of the consumerization of IT
– user-generated content : a power shift from expert-generated to user-generated content ; crowdsourcing and an architecture of participation ;
– BI as a service : the cloud as a delivery mechanism for self-service BI ;
– ubiquitous connectivity ;
– big data : the googlization of BI ;
– data and desktop virtualization : accessing data and applications from any location, on any device, at any time ;
– market volatility : because markets are more volatile, businesses seek greater agility to respond faster to market requirements. The democratization of BI is driven bottom-up and top-down : users want customized tools, while the ability to mine data is critical for business competitiveness, which causes informed decision-making to be extended across more roles.

The BI landscape will be reshaped by the model of the consumer Web
– intuitive user interfaces, easy to use, work from the browser, real-time, zero wait, app-driven, multiple devices ;
– user-driven analysis, open standards, loosely coupled services ;
– a culture of sharing and collaboration.

The road forward : self-service, fact-based decisions, agile BI
• Business users are empowered to gain insights into data (through exploration and visualization).
• Collaboration is more than distributing and sharing documents ; it implies bringing context to analytics : different people track the relevancy of analytics and the decisions that will be based on it.
• The result is faster and better decision-making. Value created from data can be shared internally within the company and externally with customers and partners.
• Enabling technologies : in-memory data management, interactive data visualization, Web-based delivery.

6 Collaborative BI
Collaborative BI is the merging of business intelligence software with collaboration tools, including social and Web 2.0 technologies, to support improved data-driven decision-making.
• The idea of collaborative BI is to extend the processes of data organization, analysis and decision-making beyond company borders.
• While Web 2.0 technologies are migrating into the enterprise, consumer-oriented social media tools do not provide the necessary components for collaborative BI. Collaborative BI requires the principle of information sharing to be incorporated into day-to-day workflows.
• A difference also exists between analyzing social media on the one hand and collaborative BI on the other. Social media provide a new source of data that complements traditional data analysis to help organizations capture market trends, better understand customer attitudes and behaviour, and uncover product sentiments.
• Collaborative BI uses Web-based standards to connect people (enterprise users, partners, suppliers, customers) to build dynamic networks that share information and analysis results to enable timely decisions that drive actions.

Why organizations adopt collaborative BI
• Big data involves the analysis of ever-increasing volumes of structured and semi- or unstructured data. In the context of ever-changing business requirements, organizations need to act quickly and decisively on business and consumer trends derived from petabytes of data.
• Closely related to the expectations of users to access applications anywhere, at any time, on any device are self-service features that allow them to interact with data in a flexible way. Accordingly, technologies such as advanced data visualization, embedded BI and in-memory analysis rank high in preference lists.
• Collaborative BI correlates with the analysis of big data and self-service BI.

The business value of collaborative BI can be situated within the eight core patterns of Web 2.0.

The pervasive use of BI stimulated through these technologies is a necessity to enable analytic agility
• Web 2.0 features focus on the user experience. The customer-centric focus of Web 2.0 has created a demand for applications that move from the traditional transaction platform to a model that is more accessible and personal for the user.
• Web 2.0 applications represent an opportunity for BI to build Web-based collaboration. Reports can be published in blogs and wikis, which help construct a knowledge base to share interpretations. Users will learn to use information more dynamically, which allows the generation of « crowd-sourced wisdom ». Besides reporting and analysis, decisions become part of the BI delivery mechanism.
• Gaining insights from data to drive better decisions is no longer constrained by the limits of internal data. The open access to information in the Web 2.0 space allows users to combine existing information with consumer-generated content from the social networking spectrum, like blogs and wikis.
• Social media analytics presents a unique opportunity to treat the market as a « conversation » between consumers and businesses. Companies that harness the knowledge of social networks combine enterprise data with streams of real-time data from Web 2.0 sources to better assess marketplace trends and customer needs.
• The adoption of Web 2.0 technologies and applications can help businesses to expand the reach of BI and improve its effectiveness.
« The world is rapidly turning into a network society. … The need to quickly adapt to this changing environment is evident. The new paradigm in innovation is joining forces in an online environment and actively working together. If we collaborate, we can co-create and grow our ideas together, which ultimately leads to better, faster and higher-value innovation. »

7 Social media analytics
Social media analytics is the practice of gathering data from social media websites and analyzing that data with social media analytics tools to make business decisions. The most common use of social media analytics is to mine customer sentiment to support marketing and customer service activities.
• An important BI trend is the incorporation of the growing streams of data generated by social media networks into BI applications.
• Social BI is a type of intelligence that focuses on data generated in real time through Internet-powered connections between businesses and the public.
• Social media analytics give companies insights into the mindset of their (prospective) customers, help them improve media campaigns and offerings, and accelerate responses to shifts in the marketplace.

Drivers for social media analytics
• The objective of social media analytics is to analyze social media data in context and generate unique customer experiences across channels.
• The spectrum of available data has been enlarged with new sources, especially social media data streams.
• The explosion of social media drives the need to analyze and get insights from customer conversations.
• The mobile and social media explosion empowers customers, and the customer experience takes on a new meaning. Relevant data types include interaction data, descriptive data, attitudinal data and behavioral data.

Examples of the use of social media analytics in day-to-day operations
• Baynote (www.baynote.com) provides recommendation services for websites. Websites using Baynote recommendations deliver relevant products and personalized content that create an intuitive user experience.
• Baynote applies « interest mining ». It attempts to cluster consumers to provide product or content recommendations that are based on a broader understanding of consumer behaviour. Baynote goes beyond the clickstream by examining the words associated with the clicks the user makes. Combining the clickstream and the semantic stream reveals the commonality of cluster members better than a purely statistical or demographic cluster approach. The resulting « interest graph » is used to personalize product and content recommendations that lead to maximum engagement, conversion and lifetime value.
• Wise Window (www.wisewindow.com) distills social media content automatically and in real time into industry-specific taxonomies.
The approach that Wise Window calls « Mass Opinion Business Intelligence » (MOBI) does not focus on individual behavior ; the type of syndicated research that Wise Window performs is aimed at giving a broader understanding of consumer sentiment and behavior in the market at large.
• MOBI discovers leading indicators with data derived from social media to make organizations more agile and responsive. Application fields include simple mindshare analysis, discovering new products and niches, spotting fast movers, performing constituent analysis and predicting demand.

8 Predictive analytics

Advanced Analytics
• Traditionally, BI systems provided a retrospective view of the business by querying data warehouses containing historical data. Contrary to this, contemporary BI systems analyze real-time event streams in memory.
• In today's rapidly changing business environment, organizational agility not only depends on operational monitoring of how the business is performing, but also on the prediction of future outcomes, which is critical for a sustainable competitive position. Predictive analytics delivers actionable intelligence that can be integrated into operational processes.
• Analytics spans a spectrum from history to the future :
– reporting (what happened ?) ;
– analysis (why did it happen ?) ;
– monitoring (what is happening now ?) ;
– predictive analytics (what might happen ?) ;
– prescriptive analytics (how can we make it happen ?).
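The « what might happen ? » step can be sketched in its simplest form: fit a least-squares trend line to past figures and extrapolate the next period. The sales history and the `forecast_next` helper below are hypothetical.

```python
# A minimal sketch (hypothetical sales figures) of predictive analytics as
# time-series trend forecasting: fit y = a + b*t by ordinary least squares
# over past periods, then predict the next period.
def forecast_next(history):
    """Fit a straight-line trend to the history and predict the next value."""
    n = len(history)
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(history) / n
    b = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, history)) / \
        sum((t - t_mean) ** 2 for t in ts)
    a = y_mean - b * t_mean
    return a + b * n  # prediction for the next period, t = n

sales = [100, 110, 120, 130]   # past sales, perfectly linear for clarity
print(forecast_next(sales))    # → 140.0
```

Real predictive models (classification, churn scoring, etc.) replace the straight line with richer algorithms, but the pattern is the same: fit a model to past data, then apply it to what comes next.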
The goal of all organizations with access to large data collections will be to harness the most relevant data and use it for better decision-making
In order of increasing computational complexity, analytics builds on business intelligence :
– descriptive analytics (business intelligence) : mine past data to report, visualize and understand what has already happened – after the fact or in real time ;
– predictive analytics : leverage past data to understand why something happened or to predict what will happen in the future across various scenarios ;
– prescriptive analytics (advanced analytics) : determine which decision and/or action will produce the most effective result against a specific set of objectives and constraints.
"Analytics are a subset of … business intelligence : a set of technologies and processes that use data to understand business performance … The questions that analytics can answer represent the higher-value and more proactive end of this spectrum."

Analytics : the three levels
Descriptive analytics : classic BI
• Quantitative assessment of past business results.
• Statistics, exploratory data analysis, visualization.
• Key task : data access / shaping – Power Query does this.
• Excel + a Power Pivot data model holds past business results.
• Pivot charts, Power View and Power BI for data visualization.
• Formulas : Sum, Count, Average, Min, Max, Var, StdDev.
Predictive analytics
• Quantitative methods to predict new outcomes.
• Forecasting, prediction, classification, association.
• Key tasks : data shaping, applying predictive models.
• Data mining algorithms "fit" an analytic model to past data ; trained/fitted models are applied to newly arriving data :
– classify : e.g. good/poor credit risk, likely/unlikely to churn ;
– predict : e.g. a stock price ;
– forecast a time series : e.g. next sales from past sales history ;
– associate : e.g. people who bought this item also bought…
• Tools: Azure ML, XLMiner, Predixion, SAS, SPSS, R, others

Prescriptive Analytics
• Quantitative methods to make better decisions
• Decision trees, Monte Carlo simulation, optimization
• Key task: create a model – a person must do this
  – The model must capture the essential features of the business situation
  – Larger models often get their data from BI / descriptive analytics
• A “what if” model is the starting point – Excel is a natural tool! Given an appropriate model:
  – Ask “What are all the possible outcomes?” – simulation / risk analysis
  – Ask “What’s the best outcome we can achieve?” – optimization
• Tools: Solver, Risk Solver, @RISK, Crystal Ball, IBM, SAS, others

[Figure: Potential growth vs. commitment for analytics options – a scatter plot positioning options such as advanced analytics (e.g. mining, predictive), data marts for analytics, advanced data visualization, predictive analytics, in-database analytics, in-memory databases, text mining, data mining scoring, sandboxes for analytics, MapReduce/Hadoop, complex event processing, data warehouse appliances, columnar storage, private and public cloud, and Software as a Service. Graphic based on survey results reported in Big Data Analytics. Potential growth is an indicator of the growth or decline of usage for big data analytics over the next three years; commitment is a cumulative measure representing the percentage of respondents (N = 325) who selected “using today” and/or “using in three years”.]
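The two prescriptive questions above – "what are all the possible outcomes?" and "what's the best outcome we can achieve?" – can be sketched as a tiny "what if" model in Python. The demand distribution, prices, costs and the grid of order quantities are invented assumptions, not figures from the text:

```python
# A minimal "what if" model with Monte Carlo simulation (risk analysis)
# and a crude grid-search optimization. All parameters are illustrative.
import random

random.seed(42)

UNIT_PRICE = 25.0     # revenue per unit sold (assumed)
UNIT_COST = 10.0      # cost per unit ordered (assumed)
FIXED_COST = 4_000.0  # fixed cost per period (assumed)

def profit(order_quantity, demand):
    """Profit for one scenario: we can only sell min(order, demand) units."""
    sold = min(order_quantity, demand)
    return sold * UNIT_PRICE - order_quantity * UNIT_COST - FIXED_COST

def simulate(order_quantity, trials=10_000):
    """Simulation: sample uncertain demand and return the expected profit."""
    outcomes = [profit(order_quantity, max(0.0, random.gauss(500, 100)))
                for _ in range(trials)]
    return sum(outcomes) / trials

# Optimization: evaluate a grid of decisions and keep the best one.
best_q = max(range(300, 801, 50), key=simulate)
print("best order quantity:", best_q)
```

A spreadsheet add-in such as Solver or @RISK plays the same role in the Excel setting the slide describes: the person supplies the model, the tool explores the outcomes.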
Current trends affecting predictive analytics

Standards for data mining and model deployment: CRISP-DM
• A systematic approach to guide the data mining process has been developed by a consortium of vendors and users of data mining, known as the Cross-Industry Standard Process for Data Mining (CRISP-DM).
• In the CRISP-DM model, data mining is described as an iterative process depicted in several phases (business understanding, data understanding, data preparation, modeling, evaluation and deployment), each with its respective tasks. Leading vendors of analytical software offer workbenches that make the CRISP-DM process explicit.

Standards for data mining and model deployment: PMML
• To deliver a measurable ROI, predictive analytics requires a focus on decision optimization to achieve business objectives. A key element in making predictive analytics pervasive is integration with line-of-business operations: without disrupting these operations, business users should be able to take advantage of the guidance of predictive models.
• For example, in operational environments with frequent customer interactions, high-speed scoring of real-time data is needed to refine recommendations in agent-customer interactions that address specific goals, e.g. improving retention offers. A model deployed for these goals acts as a decision engine, routing the results of predictive analytics to users in the form of recommendations or action messages.
• A major development for the integration of predictive models in business applications is the PMML standard (Predictive Model Markup Language), which separates the results of data mining from the tools that are used for knowledge discovery. PMML represents an open standard for the interoperability of predictive models. Most development environments can export models in PMML. As analytics increasingly drive business decisions, open standards like PMML facilitate the integration of predictive models into operational systems.
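To illustrate the separation PMML creates between model development and model deployment, here is a hedged sketch of a toy scoring engine that reads the parameters of a linear regression model from a PMML fragment and applies them to a newly arriving record. The model and its coefficients are invented, and a production engine supports far more model types than this:

```python
# Toy PMML consumer: the model travels as XML, independent of the tool
# that produced it. The regression model below is invented for illustration.
import xml.etree.ElementTree as ET

PMML_DOC = """\
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="12.5">
      <NumericPredictor name="age" coefficient="0.8"/>
      <NumericPredictor name="income" coefficient="0.002"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_4"}

def load_regression(pmml_text):
    """Extract intercept and coefficients from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coeffs = {p.get("name"): float(p.get("coefficient"))
              for p in table.findall("pmml:NumericPredictor", NS)}
    return intercept, coeffs

def score(record, intercept, coeffs):
    """Apply the exchanged model to a newly arriving record."""
    return intercept + sum(coeffs[k] * record[k] for k in coeffs)

intercept, coeffs = load_regression(PMML_DOC)
print(score({"age": 40, "income": 50_000}, intercept, coeffs))
```

The point is not the arithmetic but the decoupling: any PMML-compliant tool could have produced the XML, and any compliant engine can execute it.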
The deployment of predictive models in an existing IT infrastructure no longer depends on custom code or the processing of a proprietary language. Besides the flexible integration of predictive models into business applications, continuous analysis is key to enabling business process optimization. The broad acceptance of the PMML standard further stimulates the exchange of predictive models. Open standards like PMML contribute to the wider adoption of predictive analytics and stimulate collaboration between the stakeholders of a business process. In a similar vein, the increased use of open-source software profits from PMML: open-source environments can visualize and further refine predictive models that were produced in a different environment.

Structured and unstructured data types
• The field of advanced analytics is moving towards solutions for the handling of big data. Characteristic of the new marketing data is its text-formatted content in unstructured data sources, which covers « the consumer’s sphere of influence » : analytics must be able to capture and analyze consumer-initiated communication.
• By analyzing growing streams of social media content and sifting through the sentiment and behavioral data that emanate from online communities, it is possible to acquire powerful insights into consumer attitudes and behavior. Social media content gives an instant view of what is taking place in the ecosystem of the organization. Enterprises can leverage insights from social media content to adapt marketing, sales and product strategies in an agile way.
• The convergence between social media feeds and analytics also goes beyond the aggregate level. Social network analytics enhance the value of predictive modeling tools, and business processes will benefit from the new inputs that are deployed.
For example, the accuracy and effectiveness of predictive churn analytics can be increased by adding social network information that identifies influential users and the effects of their actions on other group members.

[Figure: capabilities of next-generation BI]
• Predictive modeling
• Advanced visualization – multidimensional view of data
• Self-service business discovery in an interactive way
• Data-as-a-service – making multiple data sources available for analysis
• Social media analytics – analyze customer sentiment
• Text mining – pattern detection in unstructured data
• Collaboration – adding context to decision making
• Real-time dashboards – monitor KPIs

Advances in database technology: big data and predictive analytics
• As companies gather larger volumes of data, the need for the execution of predictive models becomes more prevalent.
• A known practice is to build and test predictive models in a development environment that consists of operational data and warehousing data. In many cases analysts work with a subset of the data obtained through sampling. Once developed, a model is copied to a runtime environment where it can be deployed with PMML. A user of an operational application can invoke a stored predictive model by including user-defined functions in SQL statements. This causes the RDBMS to mine the data itself without transferring the data into a separate file. The criteria expressed in a predictive model can be used to score, segment, rank or classify records.
• An emerging practice to work with all the data and directly deploy predictive models is in-database analytics. For example, Zementis (www.zementis.com) and Greenplum (www.greenplum.com) have joined forces to score huge amounts of data in parallel. The Universal PMML Plug-in developed by Zementis is an in-database scoring engine that fully supports the PMML standard to execute predictive models from commercial and open-source data mining tools within the database.
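The idea of invoking a stored model through user-defined functions in SQL statements can be sketched with Python's built-in SQLite driver standing in for an RDBMS. The churn model, its coefficients and the customer table are invented for illustration; a real deployment would load a trained (e.g. PMML) model rather than hard-coding a formula:

```python
# In-database scoring sketch: register a predictive model as a SQL
# user-defined function, so the database scores its own rows and no
# data is exported to a separate file. The model is an invented toy.
import math
import sqlite3

def churn_score(tenure_months, support_calls):
    """Toy logistic-style churn model (illustrative coefficients)."""
    z = 1.5 - 0.05 * tenure_months + 0.3 * support_calls
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, tenure INTEGER, calls INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, 48, 0), (2, 6, 5), (3, 24, 2)])

# Expose the model to SQL: score, rank or segment records in the query.
conn.create_function("churn_score", 2, churn_score)

rows = conn.execute(
    "SELECT id, churn_score(tenure, calls) AS risk "
    "FROM customers ORDER BY risk DESC").fetchall()
for cid, risk in rows:
    print(cid, round(risk, 3))
```

Engines such as the Universal PMML Plug-in apply the same principle at scale, executing the model in parallel across the database's segment servers instead of in a single process.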
Data is partitioned across multiple segment servers, each managing a distinct portion of the overall data. The Universal PMML Plug-in enables predictive analytics directly within the Greenplum Database for high-performance scoring in a massively parallel environment.

Predictive analytics in the cloud
• While vendors implement predictive analytics capabilities in their databases, a similar development is taking place in the cloud. This has an impact on how the cloud can assist businesses in managing business processes more efficiently and effectively. Of particular importance is how cloud computing and SaaS provide an infrastructure for the rapid development of predictive models in combination with open standards. The PMML standard has already received considerable adoption; combined with a service-oriented architecture for the design of loosely coupled systems, the cloud computing/SaaS model offers a cost-effective way to implement predictive models.
• As an illustration of how predictive models can be hosted in the cloud, we refer to the ADAPA scoring engine (Adaptive Decision and Predictive Analytics, www.zementis.com). ADAPA is an on-demand predictive analytics solution that combines open standards with deployment capabilities. The data infrastructure to launch ADAPA in the cloud is provided by Amazon Web Services (www.amazonwebservices.com). Models developed with PMML-compliant software tools (e.g. SAS, KNIME, R, …) can easily be uploaded into the ADAPA environment.
• Since models are developed outside the ADAPA environment, the first step of model deployment is a verification step to ensure that both the scoring engine and the model development environment produce the same results. Once verified, models are executed either in batch or in real time. Batch processing implies that records are run against a loaded model; after processing, a file with the input and predicted values is available for download.
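The batch versus real-time distinction can be sketched as follows. The model function and the CSV layout are invented stand-ins, and a real deployment would submit requests to the scoring engine through Web services rather than calling a local function:

```python
# Sketch of the two execution modes of a loaded scoring engine.
# The model and data are invented; a real engine is reached over the network.
import csv
import io

def model(record):
    """Stand-in for a verified, loaded predictive model."""
    return 0.7 * float(record["x1"]) + 0.3 * float(record["x2"])

def score_batch(csv_text):
    """Batch mode: run a whole file of records against the model and
    return a downloadable file with the inputs plus predicted values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ["predicted"])
    writer.writeheader()
    for record in reader:
        record["predicted"] = model(record)
        writer.writerow(record)
    return out.getvalue()

def score_realtime(record):
    """Real-time mode: one request per new event, with the result
    available almost simultaneously."""
    return model(record)

batch_result = score_batch("x1,x2\n10,20\n30,40\n")
single = score_realtime({"x1": 10, "x2": 20})
print(single)
```

Both modes execute the same verified model; only the interaction pattern differs, which is what lets one deployment serve both periodic file processing and event-driven enterprise systems.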
Real-time execution of models in enterprise systems is performed through Web services, which form the basis for interoperability. As new events occur, a request is submitted to the ADAPA engine for processing, and the results of predictive modeling are available almost simultaneously.
• The on-demand paradigm allows businesses to use sophisticated software applications over the Internet, resulting in a faster time to production and a reduction of total cost of ownership.
• Moving predictive analytics into the cloud also accelerates the trend towards self-service BI. The so-called democratization of data implies that data access and analytics should be available across the enterprise. Increasing data volumes, as well as the growing need for insights from data, reinforce the trend towards self-guided analysis. The focus on the latter also stems from the long development backlogs that users often experience in the enterprise context. By contrast, cloud computing and SaaS enable organizations to make use of solutions that are tailored to specific business problems and complement existing systems.

• PMML represents a common standard for the representation of predictive models.
• PMML eliminates the barriers between model development and model deployment.
• Through PMML, predictive models can be embedded directly in a database.
• PMML models can score data on a massive scale through parallel processing or in the cloud.

BI has evolved from performance reporting on historical data to the pervasive use of real-time data from disparate sources. To respond faster to market conditions, a much broader user base needs data access to interactively explore and visualize information sources and share insights to make faster and better-informed decisions. In the era of big data, a Web-based platform enables business discovery, and data as well as analytics are consumed as services in the cloud.

References
BOHRINGER, M., GLUCHOWSKI, P., KURZE, Chr.
& SCHIEDER, Chr., A business intelligence perspective on the future Internet, AMCIS 2010 Proceedings, Paper 267.
COUTURIER, H., NEIDECKER-LUTZ, B., SCHMIDT, V.A. & WOODS, D., Understanding the future Internet, Evolved Technologist Press, New York, 2011.
ECKERSON, W., BI delivery framework 2020, Beye NETWORK, March 2011.
GUAZZELLI, A., STATHATOS, K. & ZELLER, M., Efficient deployment of predictive analytics through open standards and cloud computing, SIGKDD Explorations, 11, issue 1, pp. 32-38.
HINCHCLIFFE, D., Next-generation ecosystems and its key success factors, Dachis Group, 2011.
MICU, A.C., DEDEKER, K., LEWIS, I., MORAN, R., NETZER, O., PLUMMER, J. & RUBINSON, J., The shape of marketing research in 2021, Journal of Advertising Research, 51, March 2011, pp. 213-221.
RUSSOM, Ph., Big data analytics, TDWI Best Practices Report, Q4 2011.
SINGH KHALSA, R.H., REASON, A., BIERE, M., MEYERS, C., GREGGO, A. & DEVINE, M., A convergence in application architectures and new paradigms in computing. SOA, composite applications and cloud computing, IBM, January 2009.
SINGH KHALSA, R.H., REASON, A. & BIERE, M., The new era of collaborative business intelligence, IBM, March 2010.