13 Chapter 13 The Data Warehouse Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel 1 13 In this chapter, you will learn: • How operational data and decision support data differ • What a data warehouse is, how data for it are prepared, and how it is implemented • What star schemas are and how they are constructed • What data mining is and what role it plays in decision support Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 2 13 In this chapter, you will learn (continued): • What online analytical processing (OLAP) is • How SQL extensions are used to support OLAP-type data manipulations Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 3 13 The Need for Data Analysis • Managers must be able to track daily transactions to evaluate how the business is performing • By tapping into operational database, management can develop strategies to meet organizational goals • Data analysis can provide information about short-term tactical evaluations and strategies Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 4 13 The Need for Data Analysis (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 5 13 The Need for Data Analysis (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 6 13 Decision Support Systems • Decision support is methodology (or series of methodologies) designed to extract information from data and to use such information as a basis for decision making • Decision support system (DSS) – Arrangement of computerized tools used to assist managerial decision making within business – Usually requires extensive data “massaging” to produce information – Used at all levels within organization – Often tailored to focus on specific business areas – Provides ad hoc query tools to retrieve data and to display data in different formats Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 7 13 Decision Support Systems (continued) • Composed of following four main components: – Data store component • Basically a DSS database – Data extraction and data filtering component • Used to extract and validate data taken from operational database and external data sources – End-user query tool • Used to create queries that access database – End-user presentation tool • Used to organize and present data Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 8 13 Decision Support Systems (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 9 13 Operational Data vs. Decision Support Data • Operational Data – Mostly stored in relational database – Optimized to support transactions representing daily operations • DSS Data – Give tactical and strategic business meaning to operational data – Differs from operational data in following three main areas: • Timespan • Granularity • Dimensionality Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 10 13 Operational Data vs. Decision Support Data (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 11 13 Operational Data vs. Decision Support Data (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 12 13 DSS Database Requirements • A specialized DBMS tailored to provide fast answers to complex queries. • Four main requirements: – – – – Database schema Data extraction and loading End-user analytical interface Database size Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 13 13 DSS Database Requirements (continued) • Database schema – Must support complex data representations – Must contain aggregated and summarized data – Queries must be able to extract multidimensional time slices Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 14 13 DSS Database Requirements (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 15 13 DSS Database Requirements (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 16 13 DSS Database Requirements (continued) • Data extraction – Should allow batch and scheduled data extraction – Should support different data sources • Flat files • Hierarchical, network, and relational databases • Multiple vendors • Data filtering – Must allow checking for inconsistent data Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 17 13 DSS Database Requirements (continued) • End-user analytical interface – One of most critical DSS DBMS components – Permits user to navigate through data to simplify and accelerate decision-making process Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 18 13 DSS Database Requirements (continued) • Database size – In 2005, Wal-Mart had 260 terabytes of data in its data warehouses – DBMS must support very large databases (VLDBs) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 19 13 The Data Warehouse • Integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making • Usually a read-only database optimized for data analysis and query processing • Requires time, money, and considerable managerial effort to create Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 20 13 The Data Warehouse (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 21 13 The Data Warehouse (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 22 13 The Data Warehouse (continued) • Data mart – Small, single-subject data warehouse subset – Each is more manageable data set than data warehouse – Provides decision support to small group of people – Typically lower cost and lower implementation time than data warehouse Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 23 13 DSS Architectural Styles • Several are available • Provide advanced decision support features • Some capable of providing access to multidimensional data analysis Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 24 13 DSS Architectural Styles (continued) Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 25 13 Twelve Rules that Define a Data Warehouse • Data warehouse and operational environments are separated • Data warehouse data are integrated • Data warehouse contains historical data over long time horizon • Data warehouse data are snapshot data captured at given point in time • Data warehouse data are subject oriented Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 26 13 Twelve Rules that Define a Data Warehouse (continued) • Data warehouse data are mainly read-only with periodic batch updates from operational data – No online updates allowed • Data warehouse development life cycle differs from classical systems development • Data warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarized data, and highly summarized data • Data warehouse environment is characterized by read-only transactions to very large data sets Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 27 13 Twelve Rules that Define a Data Warehouse (continued) • Data warehouse environment has system that traces data sources, transformations, and storage • Data warehouse’s metadata are critical component of this environment • Data warehouse contains chargeback mechanism for resource usage that enforces optimal use of data by end users Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 28 13 Online Analytical Processing • Advanced data analysis environment that supports decision making, business modeling, and operations research • OLAP systems share four main characteristics: – – – – Use multidimensional data analysis techniques Provide advanced database support Provide easy-to-use end-user interfaces Support client/server architecture Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 29 13 Multidimensional Data Analysis Techniques • Data are processed and viewed as part of a multidimensional structure • Particularly attractive to business decision makers Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 30 13 Multidimensional Data Analysis Techniques (continued) • Augmented by following functions: – Advanced data presentation functions – Advanced data aggregation, consolidation and classification functions – Advanced computational functions – Advanced data modeling functions Database Systems: Design, Implementation, & Management, 7th Edition, Rob & Coronel 31