Business Intelligence Roadmap

Larissa Moss
Information Management Magazine, February 1, 2002

Editor's Note: This article is excerpted from the upcoming book, Business Intelligence Roadmap: The Complete Lifecycle, by Shaku Atre and Larissa Moss.

Business intelligence (BI) initiatives are expensive endeavors. They call for new technology to be considered, additional tasks to be performed, roles and responsibilities to be shifted, and applications to be delivered quickly while being of acceptable quality. What is needed is a new methodology.

A BI application is an engineering project; and engineering projects of any kind go through six stages between inception and implementation:

1. Justification: An assessment is made of a business problem or a business opportunity, which gives rise to the engineering project.
2. Planning: Strategic and tactical plans are developed, which lay out how the engineering project will be accomplished.
3. Business Analysis: Detailed analysis of the business problem or business opportunity is performed, which provides a solid understanding of the business requirements for a solution.
4. Design: A product is conceived, which solves the business problem or enables the business opportunity.
5. Construction: The conceived product is built, which is expected to provide a return on the development investment within a predefined time frame.
6. Deployment: The finished product is implemented (or sold) and its effectiveness is measured, which will determine whether the solution meets, exceeds or fails the expected return on investment.

Old Single-Swim-Lane Development Approach

Because the BI environment is a cross-organizational decision support environment, the system development practices of the past are inappropriate. Every system in the past had a beginning and an end; and every system in the past had only one set of users from one line of business.
Cross-organizational activities were not deemed to be necessary to solve the isolated problems of a line of business. Not only were they not deemed necessary, but cross-organizational activities were perceived to be in the way of progress because they slowed the projects.

For nonintegrated line-of-business system development, the conventional waterfall methodologies are sufficient. They provide enough guidance for planning, building and implementing standalone systems. However, these methodologies don't cover strategic planning, cross-organizational business analysis or selecting new technologies with every project, nor do they embrace the concept of application releases. They typically start with project planning, concentrate on designing and coding, and end in maintenance.

Unlike the development of old systems, the development of an integrated BI environment is iterative in nature because such an environment is too large and too complex to be built in one big bang. Data and functionality must be rolled out in releases, with each deployment spiraling into requirements for the next release (see Figure 1).

Figure 1: Spiral Methodology

New Cross-Organizational Development Approach

The expansion of e-business demands cross-organizational integration. This integration does not merely refer to bridging old systems across different platforms; it refers to information integration, information integrity, seamless business functionality and streamlined organizational business processes. No other initiative demonstrates this as vividly as customer relationship management (CRM).

Cross-organizational integration requires an enterprise-wide architecture as well as an infrastructure (technical and nontechnical). Enterprise-wide architecture and infrastructure must be considered core competencies.

A BI roadmap is an engineering roadmap that provides a framework for BI projects with flexible entry points.
This means that an organization can enter the effort at any step in the development cycle, provided it meets certain entry criteria (prerequisites). A BI roadmap also encourages parallel development tracks where multiple steps can be performed simultaneously and multiple activities within the steps can occur at the same time. The roadmap is also designed to be agile and adaptive so that the project can be organized and managed as multiple parallel subprojects, each going through several iterations on its own (i.e., "refactoring") as shown in Figure 2.

Figure 2: BI Project Organization

BI Development Stages and Steps

BI projects go through the same six stages common to every engineering project. Within each engineering stage, certain steps are conducted to see the engineering project through to its completion. A BI roadmap comprises sixteen development steps.

Justification Stage

Step 1: Business Case Assessment. The business problem or business opportunity is defined and a BI solution is proposed. Each BI application release should be cost-justified and should clearly define the benefits of either solving a business problem or taking advantage of a business opportunity.

Planning Stage

Step 2: Enterprise Infrastructure. Because BI is a cross-organizational decision support solution, an enterprise infrastructure must exist or be developed while the BI applications are developed. An enterprise infrastructure has two components:

Technical infrastructure, which includes hardware, software, middleware, database management systems, operating systems, network components, meta data repository and applications.
Nontechnical infrastructure, which includes meta data standards, data naming standards, enterprise data architecture (evolving), methodology, guidelines, testing procedures, change control process, issues management procedures and dispute resolution procedures.

Step 3: Project Planning.
BI projects are extremely dynamic, and changes to scope, staff, budget, technology, users and sponsors can severely impact the success of the project. Therefore, project planning must be detailed, and actual progress must be closely watched and reported.

Business Analysis Stage

Step 4: Project Delivery Requirements. Scoping is one of the most difficult tasks for BI applications. The desire to have everything instantly is difficult to curtail; however, keeping the scope small is one of the most important aspects of defining the requirements for each deliverable. These requirements should be expected to change throughout the development cycle as more is learned about the possibilities and the limitations of the technology.

Step 5: Data Analysis. The biggest challenge to all BI projects is the quality of the source data. The bad habits developed over decades are difficult to break, and it is very difficult and time-consuming to find and correct the damage resulting from the bad habits. In addition, data analysis in the past was confined to one line-of-business user's view and was never reconciled with other views in the organization. This step will take a significant percentage of time in the entire project schedule.

Step 6: Application Prototyping. Analysis for the functional deliverable(s), formerly called system analysis, is best done through prototyping. Today there are tools and new programming languages that enable the developers to prove or disprove a concept or idea relatively quickly. Prototyping also allows the users to see the potential and the limits of the technology. This gives them an opportunity to adjust their delivery requirements and their expectations.

Step 7: Meta Data Repository Analysis. Having more tools means having more technical meta data in addition to the business meta data, which is usually captured in a modeling CASE (computer-aided software engineering) tool.
This meta data needs to be mapped to other meta data and stored in a repository. Meta data repositories can be purchased or built. In either case, the requirements for what type of meta data to capture and store must be documented in a meta model. In addition, the requirements for delivering meta data to the users have to be analyzed.

Design Stage

Step 8: Meta Data Repository Design. If a meta data repository is purchased, it will most likely have to be extended with features that are required by your BI applications. If a meta data repository is built, the database has to be designed based on the meta model developed during the previous step.

Step 9: Database Design. One or more databases will be storing the business data in detailed or aggregated form, depending on the reporting requirements of the users. Not all reporting requirements are strategic, and not all of them are multidimensional. The database design schema must match the access requirements of the business.

Step 10: ETL Design. This process is the most complicated process of the entire BI project; it is also the least glamorous. Extract, transform and load (ETL) processing time frames (batch windows) are typically small. Yet, the poor quality of the source data usually mandates a lot of time to run the transformation and cleansing programs. It is a challenge for most organizations to finish the ETL process within the available time frame.

Construction Stage

Step 11: ETL Development. Many tools are available for this process, some sophisticated and some simple. Depending on the data cleansing and data transformation requirements developed during the data analysis step, an ETL tool may or may not be the best solution. In either case, preprocessing the data and writing extensions to the tool capabilities are frequently required.

Step 12: Application Development.
Once the prototyping effort has finalized the functional delivery requirements, true development can begin on either the same user access and analysis tools, such as OLAP tools, or on different tools. This activity is usually performed in parallel to the meta data repository and ETL activities.

Step 13: Data Mining. Many organizations do not use their BI databases to their fullest capability. In fact, usage is often limited to prewritten reports, some of which are not even new types of reports but replacements of old reports. The real payback for BI applications comes from the business intelligence hidden in the organization's data, which can only be discovered with data mining tools.

Step 14: Meta Data Repository Development. If the decision is made to build a meta data repository rather than to buy one, a separate team is usually charged with the development process. This becomes a sizable subproject of the overall BI project.

Deployment Stage

Step 15: Implementation. Once all components of the BI application are thoroughly tested, the BI databases and functions are rolled out. Users must be trained and the support functions initiated. These functions include help desk support, maintenance of the BI target databases, scheduling and running ETL batch jobs, performance monitoring and database tuning.

Step 16: Release Evaluation. With an application release concept, it is very important to benefit from lessons learned on the previous project. Any tools, techniques, guidelines and processes that were not helpful should be reevaluated and adjusted, possibly even discarded. Any missed deadlines, cost overruns, disputes and their resolutions should be examined. Adjustments to the processes should be made before the next release.

The development steps need not be performed in sequence; most likely, they will be performed in parallel.
However, because there is a natural order of progression from one engineering stage to another, certain dependencies exist between some of the development steps as illustrated in Figure 3. Steps stacked on top of each other in the diagram can be performed simultaneously, while steps following each other should be performed linearly because of their dependencies.

Figure 3: BI Roadmap Stages and Steps

Parallel Development Tracks

Most BI projects have at least three development tracks running in parallel once the project delivery requirements have been defined:

1. ETL Track, also known as Back-End. The design and population of the BI target databases are the most important components of BI projects. The fanciest OLAP tools in the world will not work if the databases are not designed properly or if they are populated with dirty data.
2. Application Track, also known as Front-End. Value-added data delivery from the BI databases as well as easy ad hoc (spontaneous) access to the business data are the key reasons for building the BI environment.
3. Meta Data Repository Track. Meta data is a deliverable for every BI application. It can no longer be shoved aside as documentation. It must serve the users as a navigation tool for the target databases in the BI environment.

Figure 4 shows the participation of the three development teams across the stages and steps.

Figure 4: BI Roadmap Steps by Development Tracks

These three tracks can be considered major subprojects of the BI project. Each will have its own team and set of activities after the project delivery requirements have been formalized. Discoveries made on one track can, and often do, impact the other tracks. Figure 5 shows the steps of the different tracks by color.

Each development track has a specific deliverable which contributes to the BI project objectives:

The ETL track will deliver loaded databases.
The application track will deliver the reports, queries and ad hoc tools.
The meta data repository track will deliver the meta data.

Each track moves through the six engineering stages either together or apart and in parallel, performing the engineering activities in their specific steps.

Figure 5: Parallel Development Track Steps

Justification for Using a BI Roadmap

A wise person once remarked that "a paper airplane can be constructed with little forethought, but a jet airplane cannot." Similarly, a small stovepipe application with only a handful of users can get by without a set of carefully planned and executed activities, but a BI application certainly cannot. As the BI environment evolves into a complicated cross-organizational decision support environment over time, it is essential that a strong foundation exists to support such expansion. Many things have to be considered and many tasks have to be performed by many people to build the strong foundation. To casually construct a plan along the way is irresponsible as this puts the organization's large investment at risk.

The question is not whether a methodology must be used, but rather what type of methodology should be used and how to use it most effectively. A traditional waterfall methodology is not suitable for the iterative-release concept of BI applications, but a BI roadmap is.

Larissa Moss is founder and president of Method Focus Inc., a company specializing in improving the quality of business information systems. She has more than 20 years of IT experience with information asset management. Moss is coauthor of three books: Data Warehouse Project Management (Addison-Wesley, 2000), Impossible Data Warehouse Situations (Addison-Wesley, 2002) and Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications (Addison-Wesley, 2003). Moss can be reached at methodfocus@earthlink.net.